[HN Gopher] Genie 3: A new frontier for world models
       ___________________________________________________________________
        
       Genie 3: A new frontier for world models
        
       Author : bradleyg223
       Score  : 996 points
       Date   : 2025-08-05 14:08 UTC (8 hours ago)
        
 (HTM) web link (deepmind.google)
 (TXT) w3m dump (deepmind.google)
        
       | 93po wrote:
        | I wouldn't want to be a Hollywood production studio or game
        | developer right now.
        
         | tkgally wrote:
         | Same here. Though if I were a 17-year-old film fan or gamer
         | with an imaginative drive, I would be really excited about the
         | powerful creative tools that might become available to me soon.
        
           | Mouvelie wrote:
           | I don't know if you had the teenage years I had, but there
           | would be A LOT of NSFW content made on that thing.
        
         | mclau157 wrote:
          | Hollywood maybe for small scenes, but gamers would quickly
          | notice and pick apart this level of quality and continuity
          | compared to a 3D game engine with defined meshes
        
           | 93po wrote:
            | I meant more in 5 years, when it's significantly better
        
         | ducktective wrote:
         | Well, what _would_ you want to be? Frontend dev? Mobile dev?
         | Script writer? Logo designer? Junior lawyer?
        
           | 93po wrote:
           | housewife
        
             | assword wrote:
             | Housewife to an AI CEO might be a good gig, if you can take
             | the beatings.
        
         | myaccountonhn wrote:
          | I actually think indie game dev is quite safe from AI (well,
          | it's already insanely competitive). It might change the field
          | or shrink the market, but I think AI has a chance at replacing
          | workers where the only metrics that matter are $$$ and
          | productivity. I just don't see myself consuming, for example,
          | an AI-generated autobiography or any AI-generated book. As
          | long as enough people feel that way, the market will continue
          | to be there.
        
           | rane wrote:
            | Is it though? Won't AI make the barrier to entry for indie
            | game dev even lower, as assets and code can be created
            | effortlessly?
        
         | hooverd wrote:
         | There's already a glut of open world slop!
        
         | vaenaes wrote:
         | A common refrain among the least creative people in the world.
        
       | yanis_t wrote:
       | > Text rendering. Clear and legible text is often only generated
       | when provided in the input world description.
       | 
       | Reminds me of when image AIs weren't able to generate text. It
       | wasn't too long until they fixed it.
        
         | reactordev wrote:
         | And made hands 10x worse. Now hands are good, text is good,
          | image is good, so we'll have to play Where's Waldo all over
         | again trying to find the flaw. It's going to eventually get to
         | a point where it's one of those infinite zoom videos where the
         | AI watermark is the size of 1/3rd of a pixel.
         | 
         | What I'd really love to see more of is augmented video. Like,
         | the stormtrooper vlogs. Runway has some good stuff but man is
         | it all expensive.
        
           | maerF0x0 wrote:
            | Someone mentioned physics, which might be an interesting
            | conundrum, because an important characteristic of games is
            | that some part of them is both novel and unrealistic.
            | (They're less fun if they're too real.)
        
             | reactordev wrote:
              | It depends on the genre. Simulation "games" tend to love
              | realism of simulation while providing accelerated time.
              | Others, like you said, are more fun with plausible physics
              | or physics that bend the rules a little bit. Sometimes a
              | game that's just about funky physics becomes a hit - Goat
              | Simulator.
              | 
              | Walking/running/steps have already been solved pretty well
              | with NNs, but simulation of vehicle engines and vehicle
              | physics has not, not to my knowledge. I suspect iRacing
              | would be extremely interested in such a model.
              | 
              |  _edit_
              | 
              | I take it back, PINNs (physics-informed neural networks)
              | are a thing and now I have a new rabbit hole...
        
         | TheAceOfHearts wrote:
         | I wouldn't say that the text problem has been fully fixed. It
         | has certainly gotten a lot better, but even gpt-image-1 still
         | fails occasionally when generating text.
        
         | yencabulator wrote:
         | Note that the prompt and the generated chalkboard disagree on
         | whether there's a dash or not.
        
       | yanis_t wrote:
        | And unfortunately it's not possible for the general public to
        | play around with it.
        
         | cj wrote:
         | > World models are also a key stepping stone on the path to
         | AGI, since they make it possible to train AI agents in an
         | unlimited curriculum of rich simulation environments.
         | 
         | I don't think Humans are the target market for this model, at
         | least right now.
         | 
         | Sounds like the use case is creating worlds for AI agents to
         | play in.
        
         | hodgehog11 wrote:
         | This kind of announcement without an appropriate demo to verify
         | their claims is pretty common with DeepMind at this point. They
         | barely even discuss their limitations, so as always, this
         | should be taken with a grain of salt.
        
           | Miraste wrote:
           | Most of the big labs never go into their models' limitations.
           | OpenAI does it best, despite their inveterate hype-building.
           | Their releases always have a reasonable limitations section,
           | usually with text/image/video examples of failures.
        
         | qingcharles wrote:
         | Here's a (weaker) competitor that's live:
         | 
         | https://odyssey.world/introducing-interactive-video
        
         | romanovcode wrote:
         | Yeah, honestly - what's the point of announcing it then?
         | 
         | I DECLARE BANKRUPTCY vibes here
        
       | zb3 wrote:
        | Yet another unavailable model from Google... if I can't use it,
        | I don't care. Tell me about it when it's ready to use.
        
         | Tadpole9181 wrote:
         | What a strange take. Do you not care about news coming from the
         | James Webb Telescope either, just because you can't play with
         | the telescope personally?
         | 
          | It's a whitepaper release to share the SOTA research. This
          | doesn't seem like an economically viable model, nor does it
          | look polished enough to be practically usable.
        
           | growthwtf wrote:
           | I think it's a perfectly valid take coming from some
           | intersection of an engineering mindset and FOSS culture. And,
           | the comparison you bring up is a bit of a category error.
           | 
           | We know how James Webb works and it's developed by an
           | international consortium of researchers. One of our most
           | trusted international institutions, and very verifiable.
           | 
            | We do not know how Genie works, it is unverifiable to non-
            | Google researchers, and there are not enough technical
            | details to move external teams forward much. Worst case, this
           | page could be a total fabrication intended to derail
           | competition by lying about what Google is _actually_ spending
           | their time on.
           | 
           | We really don't know.
           | 
           | I don't say this to defend the other comment and say you're
           | wrong, because I empathize with both points. But I do think
           | that treating Google with total credulity would be a mistake,
           | and the James Webb comparison is a disservice to the JW team.
        
           | zb3 wrote:
              | The James Webb Telescope is not something that can be -
              | and is - released. AI models are, and others announce them
              | when they're available, but DeepMind introduces noise here
              | with their "trust us, it works, now go away" approach.
        
             | delusional wrote:
              | > The James Webb Telescope is not something that can be -
              | and is - released
             | 
             | I would actually turn that around. The Telescope is
             | released. It's flying around up there taking photos. If
             | they kept it in some garage while releasing flashy PR pages
             | about how groundbreaking it is, then I'd be pretty
             | skeptical.
        
           | alganet wrote:
           | It's a bitter take, but your comparison with JWST is invalid.
           | 
           | The main product of the telescope is its data, not the
           | ability for anyone to play with the instruments.
           | 
           | The main product of the model is the ability for anyone to
           | play with it.
           | 
            | Strange rebuttal.
        
         | esafak wrote:
         | You don't have to be the customer of every product (that
         | affects you).
        
       | brotchie wrote:
       | First AI thing that's made me feel a bit of derealization...
       | 
       | ...and this is the worst the capabilities will ever be.
       | 
       | Watching the video created a glimmer of doubt that perhaps my
       | current reality is a future version of myself, or some other
       | consciousness, that's living its life in an AI hallucinated
       | environment.
        
         | curwin wrote:
          | Same. Even worse, the world was generated on the fly; nobody
          | even hand-crafted it. That makes it even more depressing.
        
           | drstewart wrote:
           | ... so like the real world?
        
           | thatfrenchguy wrote:
            | I mean, Minecraft worlds are generated on the fly and
            | they've unleashed quite a bit of creativity in children.
        
         | simianparrot wrote:
         | The same argument can be used for anything.
         | 
          | Personal jetpacks are the worst they'll ever be. Doesn't mean
          | they're anywhere close to being useful.
        
           | delusional wrote:
           | It's also just wrong. Plenty of things get worse.
        
           | crossbody wrote:
            | Not really. E.g. clay tablets and physical books will not
            | get meaningfully better.
        
             | suddenlybananas wrote:
             | Yes, but they're not going to get any worse.
        
               | crossbody wrote:
               | "not getting worse" is a pretty low bar
        
               | simianparrot wrote:
               | Which is precisely my point. It's the _lowest_ bar
               | possible.
        
               | yencabulator wrote:
               | Books are printed on worse paper these days that doesn't
               | last as long.
        
           | Workaccount2 wrote:
           | While true, it's the speed of improvement that gives the
           | statement gravity.
        
           | westoncb wrote:
           | The difference is the incentive to improve, and actual
           | present rate of improvement, for models like this is far
           | higher than it is for jetpacks. (That and certain intrinsic
           | features at least suggest the route to improvement is roughly
           | "more of the same," vs "needs massive unknown breakthrough".)
        
           | r0fl wrote:
            | Trillions of dollars are not being invested in making jet
            | packs any better.
            | 
            | Your comparison is incorrect.
        
             | rkozik1989 wrote:
             | You say assuming money is limitless and investor patience
             | for returns is endless.
        
             | dragonwriter wrote:
             | And if trillions of dollars were being invested in that, it
             | would mean lots of investors being disappointed in a few
             | years, not that jet packs were close to being useful.
             | 
             | Not sure if that's what you are trying to say about AI, or
             | not.
        
           | ekianjo wrote:
           | > Personal jetpacks are the worst they'll ever be
           | 
           | Have they become better over the past 20 years?
        
         | danvoell wrote:
         | My take as well. Feels like something that is going to be
         | plugged into my brain when I'm drooling in a nursing home.
        
           | RALaBarge wrote:
           | Think of all of the suffering it will prevent
        
           | dlivingston wrote:
           | Literally, The Matrix. (I just rewatched the first one for
           | the first time in a decade and forgot how damn good of a
           | movie it is.)
        
         | MarkusQ wrote:
         | > First AI thing that's made me feel a bit of derealization...
         | 
         | > ...and this is the worst the capabilities will ever be.
         | 
          | I guess if this bothers you (and I can see how it might), you
          | can take some small comfort in thinking that (due to
          | enshittification) this could in fact be the _best_ the
          | capabilities will ever be.
        
           | Philpax wrote:
           | Once it has been proven to be possible, other companies
           | [1][2][3] can and will reproduce it, and will attempt to push
           | the frontier. As far as we know, there's no bottleneck that's
           | stalling development here.
           | 
           | [1]: https://www.worldlabs.ai/
           | 
           | [2]: https://wayfarerlabs.ai/
           | 
           | [3]: https://runwayml.com/research/introducing-general-world-
           | mode...
        
         | dandellion wrote:
         | I suggest you go to google/bing/whatever floats your boat and
         | search "it will only get better" then filter results earlier
         | than 2010. Things that I just found that were going to "only
         | get better":
         | 
         | - Google search
         | 
         | - Web browsers
         | 
         | - Web content
         | 
         | - Internet Explorer
         | 
         | - Music
         | 
         | - Flight process at Mosul airport
         | 
         | - Star Wars
        
           | dist-epoch wrote:
           | Google search is better. It's just that now you ask the LLM
           | what you want to find out.
        
             | zanellato19 wrote:
             | Google search is absolutely not better.
        
               | konart wrote:
               | Depends on a perspective though.
        
             | iammrpayments wrote:
              | I was going to say it was maybe better for advertisers,
              | but auction prices have gone up and the dashboard has much
              | less data due to legal restrictions.
        
           | Philpax wrote:
           | Those are worse due to economic and cultural reasons, not
           | technological reasons. The technology itself will only get
           | better.
           | 
           | (Also, implying that music has gotten worse is a boomer-ass
           | take. It might not be to your liking, but there's more of it
           | than ever before, and new sonic frontiers are being
           | discovered every day.)
        
           | x187463 wrote:
           | None of those have a quantifiable definition of 'better'. The
            | current range of AI models has very easily measured metrics.
        
             | internetter wrote:
              | Very much disagree. Current AI benchmarks are quite
              | arbitrary, as evidenced by the ability of a model to be
              | fitted to a particular benchmark. The closest a benchmark
              | gets to objectivity is "does it answer this question
              | factually", and benchmarks like that are just as fallible,
              | because who decides what questions we ask? The same
              | struggles happen when we try to measure human
              | intelligence. The more complex the algorithm, the harder
              | it is to quantify, because there are so many parameters. I
              | could easily contrive some "search engine benchmark", but
              | it wouldn't be that useful, because it only adheres to my
              | own subjective definition of what it means for a search
              | engine to be good.
        
           | Terretta wrote:
           | > _Star Wars_
           | 
           | And then you watched Mandalorian and Andor?
           | 
           | Jokes aside, Google Search _results_ are worse thanks to so
           | much web content being just ad scaffolding, but the
           | interesting one here is music.
           | 
            | Music is typically imagined to be at its best at whatever
            | age one most listened to it, partly trained in and partly
            | thanks to the meanings/memories/nostalgia attached to it. As
            | a consequence, for most _everyone_, more recent music seems
            | to be "getting worse"!
           | 
           | That said, and back to the SEO effect on Google Results, I'd
           | argue mass distribution/advertising/marketing has resulted in
           | most audio airtime getting objectively* less complex, but if
           | one turns off the mass distribution, and looks around, there
           | seems to be plenty of just as good -- even building on what
           | came before -- music to be found.
           | 
           | * https://www.researchgate.net/publication/387975100_Decoding
           | _...
        
           | Dardalus wrote:
           | Are you really trying to say that these models aren't going
           | to get better from here? You think that the insane progress
           | of the last 5 years just stops right here?
        
         | torginus wrote:
          | If it helps: if you look at the biology of human vision, you
          | find out things like the width of your cone of sharp vision
          | being about 2 degrees, or the size of your thumb held out at
          | arm's length.
          | 
          | Due to this physical limitation, what you 'see' in front of
          | you, widely accepted as ground-truth reality, cannot possibly
          | be real; it's a hallucination produced by your brain.
          | 
          | Your brain, compared to the sensory richness of the reality
          | you experience around you, has very limited direct inputs
          | from the outside world; it must construct a rich internal
          | model based on them.
          | 
          | It's very weird (at least to me) that the boundary between
          | reality and assumption (basically educated guessing) is very
          | arbitrary, and definitely only exists in our heads.
        
         | remir wrote:
         | That's pretty much the basis for the simulation theory. See
         | also "My Big TOE" (Theory of Everything) from Tom Campbell.
        
         | swax wrote:
          | It's an unsettling feeling: what's more complicated - all the
          | atoms and galaxies, trillions of life forms, and the
          | unimaginable distances of our universe, OR a relatively simple
          | world model that is our conscious experience and nothing else?
        
         | j_timberlake wrote:
         | No it's good, you're ahead of the curve, most people aren't
         | there yet.
         | 
         | The next step is to realize that, if life is a cheap
         | simulation, not everyone might have... uh... fully simulated
         | minds. Player Characters vs NPCs is what gamers would say,
         | though it doesn't have to be binary like that, and the term NPC
         | has already been ruined by social media rants. (Also, NPC is a
         | bad insult because most of the coolest characters in games are
         | NPC rivals or bosses or whatnot.)
        
       | NoScopeNinja wrote:
       | It sounds cool that Genie 3 can make whole worlds you can
       | explore, but I wonder how soon regular people will actually get
       | to try it out?
        
         | qingcharles wrote:
         | These guys are working on the same thing and have a real demo
         | you can play:
         | 
         | https://odyssey.world/introducing-interactive-video
        
           | yoavm wrote:
            | Wow, a few years ago, if you'd shown me this and Genie 3,
            | I'd have assumed there were at least 10 years of development
            | between them. This looks worse than Doom.
        
             | qingcharles wrote:
             | The rate of change is insane these days. I remember Sora
             | launching and thinking "wow" and within weeks it looked
             | like hot garbage.
        
       | modeless wrote:
       | Consistency over multiple minutes _and_ it runs in real time at
       | 720p? I did not expect world models to be this good yet.
       | 
       | > Genie 3's consistency is an emergent capability
       | 
       | So this just happened from scaling the model, rather than being a
       | consequence of deliberate architecture changes?
       | 
       | Edit: here is some commentary on limitations from someone who
       | tried it: https://x.com/tejasdkulkarni/status/1952737669894574264
       | 
       | > - Physics is still hard and there are obvious failure cases
       | when I tried the classical intuitive physics experiments from
       | psychology (tower of blocks).
       | 
       | > - Social and multi-agent interactions are tricky to handle.
       | 1vs1 combat games do not work
       | 
       | > - Long instruction following and simple combinatorial game
       | logic fails (e.g. collect some points / keys etc, go to the door,
       | unlock and so on)
       | 
       | > - Action space is limited
       | 
        | > - It is far from being a real game engine and has a long way
        | to go but this is a clear glimpse into the future.
       | 
       | Even with these limitations, this is still bonkers. It suggests
       | to me that world models may have a bigger part to play in
       | robotics and real world AI than I realized. Future robots may
       | learn in their dreams...
        
         | kfarr wrote:
         | Bitter lesson strikes again!
        
           | nxobject wrote:
           | _Especially_ given the goal of a world model using a rasters-
           | only frame-by-frame approach. Holy shit.
        
         | ivape wrote:
         | _So this just happened from scaling the model_
         | 
         | Unbelievable. How is this not a miracle? So we're just
         | stumbling onto breakthroughs?
        
           | silveraxe93 wrote:
           | Is it actually unbelievable?
           | 
            | It's basically what every major AI lab head has been saying
            | from the start. It's the peanut gallery that keeps saying
            | they are lying to get funding.
        
             | ivape wrote:
             | It's akin to us sending a rocket to space and immediately
             | discovering a wormhole. Sure, there's a lot of science
             | about what's out there, but to discover all this in our
             | first few trips to orbit ...
        
               | silveraxe93 wrote:
               | Lemme start by saying this is objectively amazing. But I
               | just really wouldn't call it a breakthrough.
               | 
               | We had one breakthrough a couple of years ago with GPT-3,
               | where we found that neural networks / transformers +
               | scale does wonders. Everything else has been a smooth
               | continuous improvement. Compare today's announcement to
               | Genie-2[1] release less than 1 year ago.
               | 
                | The speed is insane, but not surprising if you put it in
                | the context of how fast AI is advancing. Again, nothing
                | _new_. Just absurdly fast continuous progress.
               | 
               | [1] -
               | https://deepmind.google/discover/blog/genie-2-a-large-
               | scale-...
        
               | ducktective wrote:
                | Wasn't the model winning gold in the IMO the result of a
                | breakthrough? I doubt a stochastic parrot can solve math
                | at IMO level...
        
               | Philpax wrote:
               | As far as we know, it was "just" scale on depth (model
               | capability) and breadth (multiple agents working at the
               | same time).
        
               | bakuninsbart wrote:
                | Why wouldn't it? I have yet to hear one convincing
                | argument for how our brain isn't working as a function
                | of the probable next best action. When you look at how
                | amoebas work, and at animals that are somewhere between
                | them and us in intelligence, and then at us, it is a
                | very similar kind of progression to what we see with
                | current LLMs: from almost no model of the world to a
                | pretty solid one.
        
               | pantalaimon wrote:
               | Joscha Bach postulates that what we call consciousness
               | must be something rather simple, an emergent property
               | present in all sufficiently complex biological organisms.
               | 
                | We don't inherit any software, so cognitive function must
                | bootstrap itself from its underlying structure alone.
               | 
               | https://media.ccc.de/v/38c3-self-models-of-loving-grace
        
               | glenstein wrote:
                | >We don't inherit any software, so cognitive function
                | must bootstrap itself from its underlying structure
                | alone.
               | 
               | Hardware and software, as metaphors applied to biology, I
               | think are better understood as a continuum than a binary,
               | and if we don't inherit any software (is that true?), we
               | at least inherit assembly code.
        
               | pantalaimon wrote:
               | > we don't inherit any software (is that true?), we at
               | least inherit assembly code
               | 
                | To stay with the metaphor, DNA could rather be understood
                | as firmware that runs on the cell. What I mean by
                | software is the 'mind' that runs on a collection of
                | cells. Things like language, thoughts and ideas.
                | 
                | There is also a second level of software that runs not on
                | a single mind alone, but on a collection of minds, to
                | form cliques or societies. But this is not encoded in
                | genes, but in memes.
        
               | glenstein wrote:
               | I think we have some notion of a proto-grammar or ability
               | to linguistically conceptualize, probably at the level of
               | some primordial conceptual units that are more
               | fundamental than language, thoughts and ideas in the
               | concrete forms we generally understand them to have.
               | 
                | I think it's like Chomsky said: we don't learn this
                | infrastructure for understanding language any more than a
                | bird "learns" its feathers. But I might be losing track
               | of what you're suggesting is software in the metaphor. I
               | think I'm broadly on board with your characterization of
               | DNA, the mind and memes generally though.
        
               | airstrike wrote:
               | At the most fundamental level, is it even linguistic?
               | Would Tarzan speak at all?
        
               | suddenlybananas wrote:
                | Children (who aren't alone) will invent languages to
                | communicate with each other; see Nicaraguan Sign
                | Language.
        
               | quesera wrote:
               | The emergent property theory seems logical, but I'm also
               | partial to the quantum-tunneling-miasma theory which
               | basically posits that there could be something fairly
               | complex going on, and we just lack the ability to
               | observe/measure it in our current physics. (Although I
               | have difficulty coherently separating this theory from
               | faith-based beliefs)
        
               | CharlieDigital wrote:
               | > We don't inherit any software
               | 
               | I wonder, though. Many animal species just "know" how to
               | perform certain complex actions without being taught the
               | way humans have to be taught. Building a nest, for
               | example.
               | 
               | If you say that this is emergent from the "underlying
               | structure alone", doesn't this mean that it would still
               | be "inherited" software (though in this case, maybe we
               | think of it like punch cards).
        
               | pantalaimon wrote:
                | That's interesting indeed - or take spiders building
                | webs. So there must be some 'microcode' that does get
                | inherited, like physical features.
               | 
               | But then you have things like language or societal
               | customs that are purely 'software'.
        
               | tim333 wrote:
                | We inherit ~2GB of digital data as DNA. Quite how that
                | turns into nest-building how-tos is not yet known, but it
                | must happen somehow.
        
             | JeremyNT wrote:
             | Even as a layman and AI skeptic, _to me_ this entirely
             | matches my expectations, and something like this seemed
             | like it was basically inevitable as of the first demos of
             | video rendering responding to user input (a year ago?
             | maybe?).
             | 
             | Not to detract from what has been done here in any way, but
             | it all seems entirely consistent with the types of progress
             | we have seen.
             | 
             | It's also no surprise to me that it's from Google, who I
             | suspect is better situated than any of its AI competitors,
             | even if it is sometimes slow to show progress publicly.
        
             | glenstein wrote:
             | >It's basically what every major AI lab head is saying from
             | the start.
             | 
             | I suppose it depends what you count as "the start". The
             | _idea_ of AI as a real research project has been around
              | since at least the 1950s. And I'm not a programmer or
             | computer scientist, but I'm a philosophy nerd and I know
             | debates about what computers can or can't do started around
             | then. One side of the debate was that it awaited new
             | conceptual and architectural breakthroughs.
             | 
             | I also think you can look at, say, Ted Talks on the topic,
             | with guys like Jeff Hawkins presenting the problem as one
             | of searching for conceptual breakthroughs, and I think
             | similar ideas of such a search have been at the center of
             | Douglas Hofstadter's career.
             | 
             | I think in all those cases, they would have treated "more
             | is different" like an absence of nuance, because there was
             | supposed to be a puzzle to solve (and in a sense there is,
             | and there has been, in terms of vector space and back
             | propagation and so on, but it wasn't necessarily clear that
             | physics could "pop out" emergently from such a foundation).
        
               | jonas21 wrote:
               | When they say "the start", I think they mean the start of
               | the LLM era (circa 2017). The story of this era has been
               | that scaling to more data and more compute will always
               | beat clever algorithms and conceptual breakthroughs (i.e.
               | Rich Sutton's Bitter Lesson [1]).
               | 
               | [1]
               | http://www.incompleteideas.net/IncIdeas/BitterLesson.html
        
           | spaceman_2020 wrote:
           | becoming really, really hard to refute the Simulation Theory
        
           | shreezus wrote:
            | There are a lot of "interesting" emergent behaviors that
            | happen just as a result of scaling.
           | 
           | Kind of like how a single neuron doesn't do much, but connect
           | 100 billion of them and well...
        
         | diwank wrote:
         | > Future robots may learn in their dreams...
         | 
         | So prescient. I definitely think this will be a thing in the
         | near future ~12-18 months time horizon
        
           | neom wrote:
            | I'm invested in a startup that is doing something unrelated
            | in robotics, but they're spending a lot of time in Shenzhen.
            | I keep a very close eye on robotics and was talking to their
            | CTO about what he is seeing in China; versions of this are
            | already being implemented.
        
           | dingnuts wrote:
           | what is a robot dream when there is clearly no consciousness?
           | 
           | What's with this insane desire for anthropomorphism? What do
           | you even MEAN learn in its dreams? Fine-tuning overnight?
           | Just say that!
        
             | gavinray wrote:
             | > What's with this insane desire for anthropomorphism?
             | 
             | Devil's advocate: Making the assumption that consciousness
             | is uniquely human, and that humans are "special" is just as
             | ludicrous.
             | 
             | Whether a computational medium is carbon-based or silicon-
             | based seems irrelevant. Call it "carbon-chauvinism".
        
               | bakuninsbart wrote:
               | That's not even a devil's advocate, many other animals
               | clearly have consciousness, at least if we're not
               | solipsistic. There have been many very dangerous
               | precedents in medicine where people have been declared
               | "brain dead" only to awake and remember.
               | 
               | Since consciousness is closely linked to being a moral
               | patient, it is all the more important to err on the side
               | of caution when denying qualia to other beings.
        
               | mandolingual wrote:
               | "Consciousness" is an overloaded thought killer that
               | swerves all conversation into obfuscated semantic
               | arguments. One person will be talking about 'internality'
               | and self-image (in the testable, mechanical sense that
               | you could argue Chain of Thought models already have in a
               | petty way) and the other will be grappling with the
               | concept of qualia and the ineffable nature of human
               | experience.
        
             | olddustytrail wrote:
             | Yes, and an object in OOP isn't really a physical object.
             | And a string isn't really a thin bit of rope.
             | 
             | No-one cares. It's just terminology.
        
           | Aco- wrote:
           | "Do Androids Dream of Electric Sheep?"
        
           | casenmgreen wrote:
           | I may be wrong, but this seems to make no sense.
           | 
            | A neural net can produce information outside of its original
            | data set, but it is all directly derived from that initial
            | set. There are fundamental information constraints here. You
            | cannot use a neural net to generate, from its existing data
            | set, wholly new and original full-quality training data for
            | itself.
           | 
           | You can use a neural net to generate data, and you can train
           | a net on that data, but you'll end up with something which is
           | no good.
        
             | schmidtleonard wrote:
             | We are miles away from the fundamental constraint. We know
             | that our current training methodologies are scandalously
             | data inefficient compared to human/animal brains.
             | Augmenting observations with dreams has long been theorized
             | to be (part of) the answer.
        
               | vanviegen wrote:
               | > current training methodologies are scandalously data
               | inefficient compared to human/animal brains
               | 
               | Are you sure? I've been ingesting boatloads of high
               | definition multi-sensory real-time data for quite a few
               | decades now, and I hardly remember any of it. Perhaps the
               | average quality/diversity of LLM training data has been
               | higher, but they sure remember a hell of a lot more of it
               | than I ever could.
        
             | neom wrote:
              | I might be misunderstanding your comment, so sorry if so.
              | Robots have sensors and RL is a thing: they can collect
              | real-world data, then process and consolidate real-world
              | experiences during downtime (or in real time), run
              | simulations to prepare for scenarios, and update models
              | based on the day's collected data. The version of this I
              | saw that I thought was impressive: the robot understood
              | the scene but didn't know how the scene would respond to
              | its actions, so it generates videos of the possible
              | scenarios, then picks the best ones and models its
              | actuation on its "imagination".
        
             | hnuser123456 wrote:
              | It's feasible you could have a personal neural net that
              | fine-tunes itself overnight to make fewer inference
              | mistakes in the future.
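              | 
              | A minimal sketch of what that nightly loop could look like
              | (toy PyTorch; every name and shape here is invented, not
              | any real robot or assistant stack):
              | 
              |     import torch
              |     import torch.nn as nn
              | 
              |     # tiny stand-in prediction/policy net
              |     model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(),
              |                           nn.Linear(64, 4))
              |     opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
              |     loss_fn = nn.MSELoss()
              | 
              |     # pretend these are (observation, corrected output)
              |     # pairs logged from today's inference mistakes
              |     day_log = [(torch.randn(16), torch.randn(4))
              |                for _ in range(256)]
              | 
              |     def nightly_finetune(epochs: int = 3) -> None:
              |         model.train()
              |         for _ in range(epochs):
              |             for obs, target in day_log:
              |                 opt.zero_grad()
              |                 loss_fn(model(obs), target).backward()
              |                 opt.step()
              | 
              |     nightly_finetune()  # run while the device is idle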
        
             | exe34 wrote:
             | Any idea how humans do it? Where do they get novel
             | information from?
        
             | Demplolo wrote:
             | I actually think you can.
             | 
             | The LLM has plenty of experts and approaches etc.
             | 
              | Give it tool access and let it formulate its own
              | experiments, etc.
              | 
              | The only question here is whether it becomes a (or the)
              | singularity because of this, gets stuck in some local
              | minimum, or achieves random perfection and random local
              | minimum locations.
        
             | scarmig wrote:
             | Humans are dependent on their input data (through lifetime
             | learning and, perhaps, information encoded in the brain
             | from evolution), and yet they can produce out of
             | distribution information. How?
             | 
             | There is an uncountably large number of models that
             | perfectly replicate the data they're trained on; some
             | generalize out of distribution much better. Something like
             | dreaming _might_ be a form of regularization: experimenting
             | with simpler structures that perform equally well on
             | training data but generalize better (e.g. by discovering
             | simple algorithms that reproduce the data equally well as
             | pure memorization but require simpler neural circuits than
             | the memorizing circuits).
             | 
             | Once you have those better generalizing circuits, you can
             | generate data that not only matches the input data in
             | quality but potentially exceeds it, if the priors built
             | into the learning algorithm match the real world.
        
               | delusional wrote:
               | Computers aren't humans.
               | 
               | We have truly reached peak hackernews here.
        
               | stavros wrote:
               | Humans produce out-of-distribution data all the time, yet
               | if you had a teacher making up facts and teaching them to
               | your kids, you would probably complain.
        
               | scarmig wrote:
                | Humans also sometimes hallucinate and produce non
                | sequiturs.
        
               | suddenlybananas wrote:
               | Maybe you do, but people don't "hallucinate". Lying or
               | being mistaken is a very different thing.
        
             | tim333 wrote:
             | Humans can learn from visualising situations and thinking
             | through different scenarios. I don't see why AI / robots
             | can't do similar. In fact I think quite a lot of training
             | for things like Tesla self driving is done in simulation.
        
             | thecupisblue wrote:
             | This is definitely one of the potential issues that might
             | happen to embodied agents/robots/bodies trained on the
             | "world model". As we are training a model for the real
             | world based on a model that simulates the real world, the
             | glitches in the world simulator model will be incorporated
             | into the training. There will be edge cases due to this
             | layered "overtraining", where a robot/agent/body will
             | expect Y to happen but X will happen, causing unpredictable
              | behaviour. I assume that a generic world agent will be able
             | to autocorrect, but this could also lead to dangerous
             | issues.
             | 
             | I.e. if the simulation has enough videos of firefighters
             | breaking glass where it seems to drop instantaneously and
             | in the world sim it always breaks, a firefighter robot
             | might get into a problem when confronted with unbreakable
             | glass, as it expects it to break as always, leading to a
             | loop of trying to shatter the glass instead of performing
             | another action.
        
         | casenmgreen wrote:
          | The guy who tried it was invited by Google to try it.
          | 
          | He seems too enthusiastic to me, such that I feel Google asked
          | him in particular because they trusted him to write very
          | positively.
        
           | alphabetting wrote:
           | I doubt there was a condition on writing positively. Other
           | people who tested have said this won't replace engines.
           | https://togelius.blogspot.com/2025/08/genie-3-and-future-
           | of-...
        
             | echelon wrote:
             | > What I don't think this technology will do is replace
             | game engines. I just don't see how you could get the very
             | precise and predictable editing you have in a regular game
             | engine from anything like the current model. The real
             | advantage of game engines is how they allow teams of game
             | developers to work together, making small and localized
             | changes to a game project.
             | 
             | I've been thinking about this a while and it's obvious to
             | me:
             | 
             | Put Minecraft (or something similar) under the hood. You
             | just need data structures to encode the world. To enable
             | mutation, location, and persistence.
             | 
             | If the model is given additional parameters such as a
             | "world mesh", then it can easily persist where things are,
             | what color or texture they should be, etc.
             | 
             | That data structure or server can be running independently
             | on CPU-bound processes. Genie or whatever "world model" you
             | have is just your renderer.
             | 
             | It probably won't happen like this due to monopolistic
             | forces, but a nice future might be a future where you could
             | hot swap renderers between providers yet still be playing
             | the same game as your friends - just with different looks
             | and feels. Experiencing the world differently all at the
             | same time. (It'll probably be winner take all, sadly, or
             | several independent vertical silos.)
             | 
              | If I were Tim Sweeney at Epic Games, I'd immediately drop
             | all work on Unreal Engine and start looking into this tech.
             | Because this is going to shore them up on both the gaming
             | and film fronts.
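              | 
              | Roughly the split I'm picturing, as a toy sketch (every
              | name in it is made up for illustration; none of this is
              | Genie's or any engine's real API):
              | 
              |     # Hypothetical split: authoritative world state lives
              |     # in plain CPU data structures; the "world model" is
              |     # just a renderer fed the scene description each frame.
              |     from dataclasses import dataclass, field
              | 
              |     @dataclass
              |     class Entity:
              |         kind: str          # e.g. "door", "tree"
              |         position: tuple[float, float, float]
              |         texture_hint: str  # what it should look like
              | 
              |     @dataclass
              |     class WorldState:
              |         entities: list[Entity] = field(default_factory=list)
              | 
              |         def tick(self, player_action: str) -> None:
              |             # deterministic simulation: physics, inventory,
              |             # quest logic all persist here, not in the model
              |             pass
              | 
              |     def render_frame(state, camera_pose, style):
              |         # placeholder for the neural renderer call: it gets
              |         # the full scene description, so persistence is the
              |         # engine's job and the model only paints pixels
              |         ...
              | 
              |     world = WorldState([Entity("door", (0, 0, 5), "rusty")])
              |     world.tick("open_door")
              |     frame = render_frame(world, (0, 1.7, 0), "noir city")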
        
               | K0balt wrote:
                | As a renderer, given a POV, lighting conditions, and a
                | world mesh, it might be a very, very good system. Sort of
                | a tight MCP connection to the world-state.
               | 
               | I think in this context, it could be amazing for game
               | creation.
               | 
               | I'd imagine you would provide item descriptions to vibe-
               | code objects and behavior scripts, set up some initial
                | world state (maps), populated with objects made of objects
               | - hierarchically vibe-modeled, make a few renderings to
               | give inspirational world-feel and textures, and vibe-tune
               | the world until you had the look and feel you want. Then
               | once the textures and models and world were finalised, it
               | would be used as the rendering context.
               | 
                | I think this is a place where there are enough feedback
                | loops and supervision that, with decent tools along these
                | lines, you could 100x the efficiency of game development.
               | 
               | It would blow up the game industry, but also spawn a
               | million independent one or two person studios producing
               | some really imaginative niche experiences that could be
               | much, much more expansive (like a AAA title) than the
               | typical indie-studio product.
        
               | echelon wrote:
               | > you could 100x the efficiency of game development.
               | 
               | > It would blow up the game industry, but also spawn a
               | million independent one or two person studios producing
               | some really imaginative niche experiences that could be
               | much, much more expansive (like a AAA title) than the
               | typical indie-studio product.
               | 
               | All video games become Minecraft / Roblox / VRChat. You
               | don't need AAA studios. People can make and share their
               | own games with friends.
               | 
               | Scary realization: YouTube becomes YouGame and Google
               | wins the Internet forever.
        
               | keithwhor wrote:
               | You've just described what Roblox is already doing.
        
               | echelon wrote:
               | Roblox can't beat Google in AI. Roblox has network
               | effects with users, but on an old school tech platform
               | where users can't magic things into existence.
               | 
               | I've seen Roblox's creative tools, even their GenAI
               | tools, but they're bolted on. It's the steam powered
               | horse problem.
        
             | phkahler wrote:
             | But can we use it to create movies one scene at a time?
        
             | SequoiaHope wrote:
              | You don't ask people to speak how you want; you simply only
              | invite people who already have a history of speaking how
              | you want. This phenomenon is explained in detail in Noam
              | Chomsky's work on mass media (e.g. the NY Times doesn't
              | tell its editors exactly what to do, but only hires editors
              | who already want to say what the NY Times wants, or who
              | have a certain world view). The same can be applied to
              | social media reviews: invite the person who gives glowing
              | reviews all the time.
        
               | delusional wrote:
               | Do you know where Noam makes that argument? I've been
               | trying to figure out where I picked it up years ago. I'd
               | like to revisit it to deepen my understanding. It's a
               | pretty universal insight.
        
               | kevindamm wrote:
               | I think it was in "Manufacturing Consent" by Edward S.
               | Herman and Noam Chomsky.
               | 
               | https://en.wikipedia.org/wiki/Manufacturing_Consent#:~:te
               | xt=...
               | 
               | https://www.goodreads.com/book/show/12617.Manufacturing_C
               | ons...
               | 
               | Though this is often associated with his and Herman's
               | "Propaganda Model," Chomsky has also commented that the
               | same appears in scholarly literature, despite the overt
               | propaganda forces of ownership and advertisement being
               | absent:
               | 
               | https://en.wikipedia.org/wiki/Propaganda_model#:~:text=Ch
               | oms...
        
             | make3 wrote:
             | It wouldn't be surprising if a structured version of this
             | with state cached per room for example could be used in a
             | game.
             | 
                | & you're basically seeing GPT-3 and saying it will never
                | be used in any serious application... the rate of
                | improvement in their model is insane.
        
               | echelon wrote:
               | Don't put the world state into the model. Use the model
               | as a renderer of whatever objects the "engine" throws at
               | it.
               | 
               | Use the CPU and RAM for world state, then pass it off to
               | the model to render.
               | 
               | Regardless of how this is done, Unreal Engine with all of
               | its bells and whistles is toast. That C++ pile of
               | engineering won't outdo something this flexible.
        
               | rpcope1 wrote:
               | How many watts and how much capital does it take to run
               | this model? How many watts and how much capital does it
               | take to run unity or unreal? I suspect there's a huge
               | discrepancy here, among other things.
        
           | echelon wrote:
           | I don't know. I wasn't there and I'm excited.
           | 
           | I think this puts Epic Games, Nintendo, and the whole lot
           | into a very tough spot if this tech takes off.
           | 
           | I don't see how Unreal Engine, with its voluminous and
           | labyrinthine tomes of impenetrable legacy C++ code, survives
           | this. Unreal Engine is a mess, gamers are unhappy about it,
           | and it's a PITA to develop with. I certainly hate working
           | with it.
           | 
            | The Innovator's Dilemma is fast approaching the entire
            | gaming industry, and they don't even see it coming, it's
            | happening so fast.
           | 
           | Exciting that building games could become as easy as having
           | the idea itself. I'm imagining something like VRChat or
           | Roblox or Fortnite, but where new things are simply spoken
           | into existence.
           | 
           | It's absolutely terrifying that Google has this much power.
        
             | sureglymop wrote:
             | How so? It's not really by itself being creative yet, no?
             | It sure seems like a game changer but who knows if one can
             | even use this at scale?
        
               | echelon wrote:
               | I played around with Diamond WM on my 3090 machine. I
               | also ran fast SDXL-turbo and LCM models with ControlNets
               | paired with a 3D game prototype I threw together. The
               | results were very compelling, and I was just one person
               | hacking things together.
               | 
               | This is 100% going to happen on-device. It's just a
               | matter of time.
        
               | rakete wrote:
               | I am convinced as well this will eventually be how we
               | render games and simulations.
               | 
               | Maybe just as kind of a DLSS on steroids where the engine
               | only renders very simple objects and a world model
               | translates these to the actual graphics.
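                | 
                | A toy sketch of that loop (every function here is a
                | placeholder I'm making up, not a real engine or model
                | API):
                | 
                |     from typing import Any, Optional
                | 
                |     Frame = Any  # stand-in for an image buffer
                | 
                |     def render_proxy(world: dict, camera: dict) -> Frame:
                |         # cheap rasterization: flat-shaded boxes plus
                |         # depth and segmentation buffers
                |         return {"depth": None, "segmentation": None}
                | 
                |     def neural_repaint(proxy: Frame, style: str,
                |                        prev: Optional[Frame]) -> Frame:
                |         # placeholder world-model call: conditions on the
                |         # proxy buffers for layout and on the previous
                |         # output frame for temporal consistency
                |         return proxy
                | 
                |     def game_loop(world, camera, style, n_frames):
                |         prev = None
                |         for _ in range(n_frames):
                |             proxy = render_proxy(world, camera)
                |             prev = neural_repaint(proxy, style, prev)
                |             yield prev
                | 
                |     for frame in game_loop({}, {}, "watercolor city", 3):
                |         pass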
        
             | tim333 wrote:
             | I imagine Unreal Engine will start incorporating such
             | stuff?
        
           | csomar wrote:
            | Also he is ex-Google DeepMind. Like the worst kind of pick
            | you can make when there are dozens of eligible journalists
            | out there.
        
         | kkukshtel wrote:
         | I similarly am surprised at how fast they are progressing. I
         | wrote this piece a few months ago about how I think steering
         | world model output is the next realm of AAA gaming:
         | 
         | https://kylekukshtel.com/diffusion-aaa-gamedev-doom-minecraf...
         | 
         | But even when I wrote that I thought things were still a few
         | years out. I facetiously said that Rockstar would be nerd-
         | sniped on GTA6 by a world model, which sounded crazy a few
         | months ago. But seeing the progress already made since GameNGen
         | and knowing GTA6 is still a year away... maybe it will actually
         | happen.
        
           | throwmeaway222 wrote:
            | I'm trying to wrap my head around this since we're still
            | seeing text spit out slowly (I mean slowly as in thousands
            | of tokens a second).
           | 
           | I'm starting to think some of the names behind LLMs/GenAI are
           | cover names for aliens and any actual humans involved have
           | signed an NDA that comes with millions of dollars and a death
           | warrant if disobeyed.
        
           | ewoodrich wrote:
           | > Rockstar would be nerd-sniped on GTA6 by a world model
           | 
           | I'm having trouble parsing your meaning here.
           | 
           | GTA isn't really a "drive on the street simulator", is it?
           | There is deliberate creative and artistic vision that makes
           | the series so enjoyable to play even decades after release,
           | despite the graphics quality becoming more dated every year
           | by AAA standards.
           | 
           | Are you saying someone would "vibe model" a GTAish clone with
           | modern graphics that would overtake the actual GTA6 in
           | popularity? That seems extremely unlikely to me.
        
             | everforward wrote:
             | Probably depends on how you engage with GTA. "Drive on the
             | street simulator" along with arrays of weapons and
             | explosions is the majority of my hours in GTA.
             | 
             | I despise the creative and artistic vision of GTA online,
             | but I'm clearly in a minority there gauging by how much
             | money they've made off it.
        
         | forrestthewoods wrote:
         | > this is a clear glimpse into the future.
         | 
         | Not for video games it isn't.
        
           | dlivingston wrote:
           | Unless and until state can be stored outside of the model.
           | 
           | I for one would love a video game where you're playing in a
           | psychedelic, dream-like fugue.
        
             | throwmeaway222 wrote:
             | It's kinda crazy though that a single game session would be
             | burning enough natural gas to power 3 cities. Unless that's
             | not true
        
             | forrestthewoods wrote:
             | It is plausible to run a full simulation the old fashioned
             | way and realtime render it with a diffusion model.
             | 
             | It is not currently, or near term, realistic to make a
             | video game where a meaningful portion of the simulation is
             | part of the model.
             | 
             | There will probably be a few interactive model-first
             | experiences. But they'll be popular as short novelties not
             | meaningful or long experiences.
             | 
             | A simple question to consider is how you would adjust a set
             | of simple tunables in a model-first simulator? For example,
             | giving the player more health, making enemies deal 2x
             | damage, increasing move speed, etc. You cannot.
        
         | tugn8r wrote:
         | But that was always going to be the case?
         | 
          | Reality is not composed of words, syntax, and semantics. A
          | human model of it is.
          | 
          | Other human models are sensory-only, with no language.
          | 
          | So vision learning and energy models that capture the energy
          | needed to achieve a visual, audio, or physical robotics
          | behavior are the only real goal.
          | 
          | Software is for those who read the manual with their new NES
          | game. Where are the words inside us?
          | 
          | What will close the keyboard-and-mouse input loop is the
          | statistical physics of the energy it takes to make a machine
          | draw the glyphs of language, not opinionated clustering of
          | language. We're essentially replicating human work habits,
          | and those are real physical behaviors, not just descriptions
          | in words.
        
         | ojosilva wrote:
         | Gaming is certainly a use case, but I think this is primarily
         | coming as synthetic data generation for Google's robots
         | training in warehouses:
         | 
         | https://www.theguardian.com/technology/2025/aug/05/google-st...
         | 
         | Gemini Robot launch 4 mo ago:
         | 
         | https://news.ycombinator.com/item?id=43344082
        
         | resters wrote:
         | consider the hardware DOOM runs on. 720p would only be a true
         | test of capability if every bit of possible detail was used.
        
       | Oarch wrote:
       | I don't think I've ever seen a presentation that's had me
       | question reality multiple times before. My mind is suitably
       | blown.
        
       | andhuman wrote:
       | Now this could be the killer app VR's been looking for.
        
       | gundmc wrote:
       | Interesting! This feels like they're trying to position it as a
       | competitor to Nvidia's Omniverse, which is based on the Universal
       | Scene Description (USD) format as its backbone. I wonder what
       | format world objects can be ingested into Genie in - e.g. for the
       | manufacturing use cases mentioned.
        
       | dzonga wrote:
       | movies are about to become cheap to produce.
       | 
       | good writers will remain scarce though.
       | 
       | maybe we will have personalized movies written entirely through
       | A.I
        
       | mclau157 wrote:
       | I can see this being incredible for history lessons and history
       | school lectures
        
         | MarkusQ wrote:
         | Some physicist once said "I endeavor to never write more
         | clearly than I think"; in the same way, history probably
         | shouldn't be presented more vividly than it's understood. (We
         | already have this problem with people remembering incidental
         | details and emotional vibes from historical fiction as if they
         | were established historical fact; VR diffusion delusions would
         | make this much worse.)
        
           | scotty79 wrote:
           | History is mostly made up. You can be sure mostly about
           | general facts. The other 80% are just narratives.
        
             | ecshafer wrote:
             | If you read actual history the historians typically go into
             | quite a lot of depth on why they think X happened as
             | opposed to Y, and what the limitations are on the theories
             | and the reasoning. The amount of archaeological and written
             | records we have is very important to those facts.
        
             | quesera wrote:
             | Also true of the present. :)
        
         | suddenlybananas wrote:
         | Not really, since it will hallucinate all sorts of ridiculous
         | anachronisms.
        
         | ivape wrote:
         | It's going to replace video games.
        
           | mclau157 wrote:
           | Do people play video games to look at pretty scenery? No most
           | people are testing skills in video games and this will not
           | test skill for a while
        
             | okasaki wrote:
             | They do both. Nobody played Cyberpunk 2077 for the riveting
             | gameplay.
             | 
             | Actually that game felt a lot like these videos, because
             | often you would turn around and then look back and the game
             | had deleted the NPCs and generated new ones, etc.
        
               | cptroot wrote:
               | People played Cyberpunk 2077 because it had oceans of
               | engaging story, which _is_ the gameplay.
        
             | klibertp wrote:
             | There's an entire genre of games (immersive sims) that
             | focus on experiencing the world with little to sometimes no
             | skill required on the part of the player. The genre is
             | diverse and incorporates elements of more gameplay-focused
             | genres. It's also pretty popular.
             | 
             | I think some people want to play, and some want to
             | experience, in different proportions. Tetris is the
             | emanation of pure gameplay, but then you have to remember
             | "Colossal Cave Adventure" is even older than Tetris. So
             | there's a long history of both approaches, and for one of
             | them, these models could be helpful.
             | 
             | Not that it matters. Until the models land in the hands of
             | indie developers for long enough for them to prove their
             | usefulness, no large developer will be willing to take on
             | the risks involved in shipping things that have the
             | slightest possibility of generating "wrong" content. So,
             | the AI in games is still a long way off, I think.
        
             | lbrito wrote:
             | >No most people are testing skills in video games
             | 
             | You must be young. As people get older they (usually) care
             | less about that.
        
             | nosignono wrote:
             | > Do people play video games to look at pretty scenery?
             | 
             | Yes.
             | 
             | > No most people are testing skills in video games
             | 
             | That's not mutually exclusive with playing for scenery.
             | 
             | Games, like all art, have different communities that enjoy
             | them for different reasons. Some people do not want their
             | skills tested at all by a game. Some people want the
             | maximum skill testing. Some want to experience novel
             | fantasy places, some people want to experience real places.
             | Some people want to tell complex weaving narratives, some
             | people want to optimize logistics.
             | 
             | A game like Flower is absolutely a game about looking at
             | pretty scenery and not one about testing skill.
        
           | SirMaster wrote:
           | I doubt it. The only video games I play are competitive games
           | like DotA 2, Counter Strike 2, Call of Duty, Rainbow 6 Siege,
            | etc. I don't really see how this competes with or replaces
            | that at all.
        
         | ecshafer wrote:
         | Why? Sure a virtual walk around the Pantheon in all its glory
         | would be _nice_. But would that really improve history lessons?
          | It doesn't help students understand why things happened, and
         | what the consequences were and how they have impacted the rest
         | of history of the modern world.
        
           | Philpax wrote:
           | Inhabiting a foreign cultural context can provide information
           | that factual lessons may struggle to convey to the same
           | degree. Of course, there's a limit to this - especially with
           | regards to historical accuracy - but you are much more likely
           | to understand why specific historical decisions were made if
           | you are "in the room" where they happened, so to speak.
        
           | motoxpro wrote:
            | Engagement is one of the core pieces of education and one of
            | the hardest things to solve. If you remember back to being a
            | kid, reading white papers is not really a thing. Interesting
            | (e.g. engaging) teachers and field trips (which not all
            | schools have access to) are tools that help kids learn.
           | 
           | At the limit, if you could stay engaged you would be an
           | expert in pretty much anything.
           | 
           | "It doesn't help students understand why things happened, and
           | what the consequences were and how they have impacted the
           | rest of history of the modern world." I would say the
           | opposite, let's recreate each step in that historical journey
            | so you can see exactly what the consequences were, exactly
           | why they happened and when.
        
       | Workaccount2 wrote:
       | I wonder how hard it would be to get VR output?
       | 
       | That's an insane product right there just waiting to happen. Too
       | bad Google sleeps so hard on the tech they create.
        
         | SeanaldMcDnld wrote:
         | Consistent output and spatial coherence across each eye, maybe
         | a couple years? But meeting head tracking accuracy and latency
         | requirements, I'd bet decades. There's no way any of this tech
         | reduces end to end latency to acceptable levels, without a
         | massive change in hardware. We'll probably see someone use
         | reprojection techniques in a year or so and claim they've done
         | it. But true generated pixels straight to the headset based on
         | head tracking, is so so far away.
        
           | kridsdale3 wrote:
           | Agree. So I'll make a wild bet of "20 years". And hope for
           | the best.
        
           | nosignono wrote:
           | You don't have to do it in real time, per se. I imagine a
           | world in which the renderer and the world generation are
           | decoupled. For example, you could descriptively articulate
           | what you wanted to achieve and have it generate a world,
           | quietly do some structure from motion (or just generate the
            | models and textures), and use those as assets in a game
           | engine for the actual moment to moment rendering.
           | 
           | You'd have some "please wait in this lobby space while we
           | generate the universe" moments, but those are easy to hide
           | with clever design.
        
         | pawelduda wrote:
          | It's still hard to get acceptable VR output from today's
          | rendering engines. In the examples provided, the movement
          | seems to
         | be slow and somewhat linear, which doesn't translate to head
         | movements in VR. VR needs 2 consistent videos with much higher
         | resolutions and low latency is a must. The feedback would still
         | be very dependent on people's tolerance to all imperfections -
         | some would be amazed, others would puke. That's why VR still
          | isn't in the spotlight after all these years (I personally find
         | it great).
        
         | kridsdale3 wrote:
         | I think VR will come at the same time they make multiplayer.
         | There needs to be differentiation between the world-state and
         | the viewport. Right now, I suspect they're the same.
         | 
         | But once you can get N cameras looking at the same world-state,
         | you can make them N players, or a player with 2 eyes.
        
       | fnands wrote:
       | Damn, I'm getting Black Mirror vibes from this. Maybe because I
       | watched the Eulogy episode last night.
       | 
       | Really great work though, impressive to see.
        
       | ollin wrote:
       | This is very encouraging progress, and probably what Demis was
       | teasing [1] last month. A few speculations on technical details
       | based on staring at the released clips:
       | 
       | 1. You can see fine textures "jump" every 4 frames - which means
       | they're most likely using a 4x-temporal-downscaling VAE with at
       | least 4-frame interaction latency (unless the VAE is also
       | control-conditional). Unfortunately I didn't see any real-time
       | footage to confirm the latency (at one point they intercut screen
       | recordings with "fingers on keyboard" b-roll? hmm).
       | 
       | 2. There's some 16x16 spatial blocking during fast motion which
       | could mean 16x16 spatial downscaling in the VAE. Combined with 1,
       | this would mean 24x1280x720/(4x16x16) = 21,600 tokens per second,
       | or around 1.3 million tokens per minute.
       | 
       | 3. The first frame of each clip looks a bit sharper and less
       | videogamey than later stationary frames, which suggests this
       | could be a combination of a text-to-image + image-to-world system
       | (where the t2i system is trained on general data but the i2w
       | system is finetuned on game data with labeled controls).
       | Noticeable in e.g. the dirt/textures in [2]. I still noticed some
       | trend towards more contrast/saturation over time, but it's not as
       | bad as in other autoregressive video models I've seen.
       | 
       | [1] https://x.com/demishassabis/status/1940248521111961988
       | 
       | [2]
       | https://deepmind.google/api/blob/website/media/genie_environ...
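       | To make the arithmetic in (2) concrete - same assumed numbers
       | (24 fps at 1280x720, 4x temporal and 16x16 spatial downscaling
       | in the VAE, one token per latent element):
       | 
       |   fps, w, h = 24, 1280, 720    # assumed output rate/resolution
       |   t_down, s_down = 4, 16       # guessed VAE downscaling factors
       | 
       |   tok_per_sec = fps * w * h // (t_down * s_down * s_down)
       |   print(tok_per_sec)           # 21600 tokens per second
       |   print(tok_per_sec * 60)      # 1296000, ~1.3M tokens per minute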
        
         | ollin wrote:
         | Regarding latency, I found a live video of gameplay here [1]
         | and it looks like closer to 1.1s keypress-to-photon latency (33
         | frames @ 30fps) based on when the onscreen keys start lighting
         | up vs when the camera starts moving. This writeup [2] from
         | someone who tried the Genie 3 research preview mentions that
         | "while there is some control lag, I was told that this is due
         | to the infrastructure used to serve the model rather than the
         | model itself" so a lot of this latency may be added by their
         | client/server streaming setup.
         | 
         | [1] https://x.com/holynski_/status/1952756737800651144
         | 
         | [2] https://togelius.blogspot.com/2025/08/genie-3-and-future-
         | of-...
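          | For reference, that latency estimate is just the frame count
          | divided by the frame rate:
          | 
          |   frames_delay, fps = 33, 30   # counted from the clip
          |   print(frames_delay / fps)    # 1.1 s keypress-to-photon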
        
           | rotexo wrote:
           | You know that thing in anxiety dreams where you feel very
           | uncoordinated and your attempts to manipulate your
           | surroundings result in unpredictable consequences? Like you
           | try to slam on the brake pedal but your car doesn't slow
           | down, or you're trying to get a leash on your dog to lead it
           | out of a dangerous situation and you keep failing to hook it
           | on the collar? Maybe that's extra latency because your brain
           | is trying to render the environment at the same time as it is
           | acting.
        
             | svdr wrote:
             | Your brain does not need to render any environments, just
             | the experience of being in them.
        
       | crossbody wrote:
       | I am much more convinced now that the Simulation Argument is
       | correct
        
         | sercanov wrote:
         | yeah the whole explosion around AI made me lean more to
         | simulation theory. it's literally happening in front of our
         | eyes and we're a baby civilization
        
         | Fraterkes wrote:
         | I'm seeing a lot of variations on this in this thread, but we
         | have been able to render photoreal things, and do intricate
         | physical simulations, for a long time. This is mostly
         | impressive because it is a real-time way to generate and render
         | big, intricate worlds.
         | 
         | But if you believe reality is a simulation, why would these
         | "efficient" world-generation methods convince you of anything?
         | The tech our reality would have to be running on is still
         | inconceivable science fiction.
        
           | ivape wrote:
           | _but we have been able to render photoreal things, and do
           | intricate physical simulations, for a long time._
           | 
           | Not like this we haven't. This is convincing because I can
           | have any of you close your eyes and imagine a world where
           | pink rabbits hand out parking tickets. We're a neurolink away
           | from going from thought > to prompt > to fantasy.
        
             | crossbody wrote:
             | Agree with ivape.
             | 
              | To add: our reality does not have to be rendered in its
              | entirety; we'll just have very convincing and unscripted
              | first-person view simulations. Only what you look at is
              | getting rendered (e.g. tiny structures only get rendered
              | when you use a microscope).
        
             | Fraterkes wrote:
             | I guess I should have clarified: when you talk about
             | reality being a simulation, do you mean that we
             | collectively live in a simulated universe, or that you
             | personally are playing a very realistic vr game?
        
       | alec_irl wrote:
       | What is the purpose of this? It seems designed to muddy the
       | waters of reality vs. falsehood and put creatives in film/tv out
       | of jobs. Real Jurassic Park moment here
        
         | Centigonal wrote:
         | They mention some possible applications in the video. Training
         | environments for robotics (use sample data to simulate the
         | surface of mars or the inside of a nuclear reactor),
         | educational worlds for students (like the old Encarta virtual
         | tours), and disaster preparedness simulations (e.g. training
         | firefighters on an endless variety of burning homes).
         | 
         | Obviously, none of these are super viable given the low
         | accuracy and steerability of world models out today, but
         | positive applications for this kind of tech do exist.
         | 
         | Also (I'm speculating now instead of restating the video), I
         | think pretty soon someone will hook up a real time version of
         | this to a voice model, and we will get some kind of interactive
         | voice + keyboard (or VR) lucid dream experience.
        
       | sercanov wrote:
       | like how? is this mainly realtime inference?
        
       | idencap wrote:
       | what a time to be alive
        
       | hnthrow90348765 wrote:
       | Think of the pornographic possibilities
       | 
       | /s
        
         | Mouvelie wrote:
          | Why even be sarcastic about it? There is no human invention
          | that has not exploded thanks to (or because of) pornographic
          | possibilities: HD-DVD vs Blu-ray, the Internet... I'd even
          | argue that XR is not as big as it could be because it really
          | clamps down on deviant usage!
        
       | lackoftactics wrote:
       | I thought I was not going to see too many negative comments here,
       | yet I was mistaken. I thought that if it's not an LLM, people
       | would have a more nuanced take and could look at the research
       | with an open mind. The examples on the website are probably
       | cherry-picked, but
       | progress is really nice compared to Genie 2.
       | 
       | It's a nice step towards gains in embodied AI. Good work,
       | DeepMind.
        
         | Uehreka wrote:
         | A lot of the negativity around this post is about the fact that
         | there's no demo and no open weights, which is Correct
         | Negativity. Like don't get me wrong, it would be cool for
         | something like this to exist, but I've generally learned not to
         | trust AI companies' descriptions of their models until someone
         | (or I) can actually get their hands on it and see if it's
         | usable at all. A description of a model that isn't going to be
         | released to the public isn't very interesting to me.
        
           | whywhywhywhy wrote:
           | > but I've generally learned not to trust AI companies'
           | descriptions of their models
           | 
            | Sora was described very similarly to this as a "world
           | simulator" but ultimately it never materialized.
           | 
           | This one is a bit more hopeful from the videos though.
        
       | rowanG077 wrote:
       | I'm not sure this is interesting beyond the wow effect. Unless we
       | can actually get the world out of the AI. The real reason chatgpt
       | and friends actually have customers is that the text interface is
       | actually durable and easy to build upon after generation. It's
       | also super ez to feed text into a fresh cycle. But this, while
       | looking fancy, doesn't seem to be on the path to actually working
       | out. Unless there is a sane export to unreal or something.
        
       | xlbuttplug2 wrote:
       | What would scare me is if this becomes economically viable enough
       | to release to the public, rather than staying an unlimited budget
       | type of demo.
        
       | nkotov wrote:
       | I wonder how far we are from being able to use this at home as a
       | form of entertainment.
        
       | timeattack wrote:
       | Advances in generative AI are making me progressively more and
       | more depressive.
       | 
       | Creativity is being taken from us at an exponential rate. And I
       | don't buy the argument from people who say they are excited to
       | live in this age. I could accept that if the technology stopped
       | at its current state and remained just a tool for our creative
       | endeavours, but that doesn't seem to be the endgame here.
       | Instead it aims to be a complete replacement.
       | 
       | Granted, you can say "you still can play musical
       | instruments/paint pictures/etc for yourself", but I don't think
       | there was ever a period of time where creative works were just
       | created for their own sake rather than for sharing them with
       | others en masse.
       | 
       | So what is the final state here for us? A return to menial
       | not-yet-automated work? And when that is eventually automated,
       | what's left? Plugging our brains into personalized autogenerated
       | worlds tailored to trigger the relevant neuronal circuitry,
       | producing ever-increasing dopamine levels until our brains
       | finally burn out (which is arguably already happening with
       | tiktok-style leisure)? And how are you supposed to pay for that,
       | if all work is automated? How is the economics of that supposed
       | to work?
       | 
       | Looks like a pretty decent explanation of the Fermi paradox.
       | No one would know how the technology works, there are no easily
       | available resources left to make use of simpler tech, and the
       | planet is littered to the point of no return.
       | 
       | How to even find the value in living given all of that?
        
         | delfinom wrote:
         | All I know is I am investing into suicide booth startups
        
           | taberiand wrote:
           | So that the robots have a leisure activity, or so that humans
           | get a quick escape in the face of runaway climate change?
        
         | wahnfrieden wrote:
         | Automation only leads to more labor if we allow that employer
         | relation to dictate so. Automation affords leisure time (for
         | everything besides labor that life has to offer, including
         | optional labor-like pursuits), but who gets to benefit from
         | that is currently unevenly distributed.
        
           | worldsayshi wrote:
           | We keep coming back to the conclusion that we need to turn
           | the economy on its head.
           | 
           | With business as usual capital is power and capital is
           | increasingly getting centralized.
        
           | fgafford wrote:
            | You need to read Brave New World. It already has all that
            | figured out.
            | 
            | Work is a fundamental part of society and will never be
            | eliminated, regardless of its utility/usefulness. The
            | caste/class system determines the type of work. The amount
            | (time) of work is fixed, since it was discovered that
            | additional leisure (reducing work) does not improve
            | individuals' happiness.
        
             | wahnfrieden wrote:
             | Try reading Dawn of Everything
        
         | snickerdoodle12 wrote:
         | I see two ways this is going to go:
         | 
         | 1. Universal Basic Income as we're on the way to a post-
         | scarcity society. Unlikely to actually happen due to greed.
         | 
         | 2. We take inspiration from the french revolution and then
         | return to a simpler time.
        
           | dist-epoch wrote:
            | In the French revolution the army and the people had similar
            | kinds of weapons. And there was no total surveillance to round
           | up the leaders.
        
             | snickerdoodle12 wrote:
             | Yes, it'd be difficult. I have some faith that once things
             | escalate far enough the people wielding the weapons are
             | unwilling to murder their countrymen en masse.
             | 
             | Luigi Mangione has shown that all it takes is one person in
             | the right time and place to remove some evil from the
             | world.
        
               | HeatrayEnjoyer wrote:
               | It needs to happen, at minimum, before drones can
               | reliably maintain themselves and kill dissidents in the
               | street. At that point even if the human police and
               | soldiers become disloyal it'll be too late; a society of
               | two types of people, the one guy with access to issue
               | prompts, and everyone else.
        
           | holoduke wrote:
           | If bio engineering takes off for real we will integrate our
           | consciousness in our artificial digital ecosystem.
        
           | maerF0x0 wrote:
           | > Unlikely to actually happen due to greed.
           | 
           | Greed makes no sense in a truly post scarcity society. There
           | is no scarcity from which to take in a zero sum way from
           | another.
           | 
           | Status is the real issue. Humans use status to select
           | sexually, and the display is both competitive and
            | comparative. It doesn't matter absolutely how many pants you
           | have, only that you have more and better than your
           | competition.
           | 
            | I actually think this thing is baked into our DNA, and until
            | sex itself is saturated (if there is such a thing), or DNA is
            | altered, we will continue to have some form of competition,
            | however subtle, undergirding all interactions.
        
             | tim333 wrote:
             | I think UBI is likely to happen because of greed - people
              | like free stuff and will vote for it if it's real. The
             | trouble with the pitch:
             | 
             | >Vote for me and we'll hand free money to everyone and the
             | robots will do the work
             | 
              | at the moment is that the robots doing the work don't exist.
             | Things will change when they do.
        
         | dist-epoch wrote:
         | > Looks like a pretty decent explanation of Fermi paradox.
         | 
         | It's not. We will be replaced, but the AI will carry on.
        
           | dingnuts wrote:
           | this is a religious opinion at this state of technological
           | development lol
           | 
           | a lot of these comments border on cult thinking. it's a
           | fucking text to 3D image model, not R Daneel Olivaw, calm
           | down
        
             | myrmidon wrote:
              | Do you _honestly_ believe that human minds won't be
             | overtaken within the century?
             | 
             | I'll concede that it might take even longer to get _full_
              | artificial human capabilities (robust, self-repairing,
              | self-replicating, adaptable), but the writing is on the
             | wall.
             | 
              | Even the very best case that I see (non-malicious AI with a
              | soft practical ceiling not too far beyond human
              | capabilities) poses giant challenges for our whole society,
              | in resource allocation alone (because people, as workers,
              | become practically worthless, undermining our whole system
              | completely).
        
           | hooverd wrote:
           | Eh, might as well kill yourself now then.
        
         | skybrian wrote:
         | We already live in a world where a vast library of songs by
         | musicians who play much better than you are readily available
         | on YouTube and Spotify. This seems like more of the same?
        
           | podgietaru wrote:
           | I like living in a world where I know that people who have
            | actually spent time on nurturing a talent get rewarded for
           | doing so, even if that talent is not something I will ever be
           | good at.
           | 
           | I don't want to live in a world where these things are
           | generated cheaply and easily for the profit of a very select
           | few group of people.
           | 
           | I know the world doesn't work like I described in the top
           | paragraph. But it's a lot closer to it than the bottom.
        
             | wolttam wrote:
             | It's hard to see how there will be room for profit as this
             | all advances
             | 
             | There will be two classes of media:
             | 
             | - Generated, consumed en-masse by uncreative, uninspired
             | individuals looking for cheap thrill
             | 
             | - Human created, consumed by discerning individuals seeking
             | out real human talent and expression. Valuing it based
             | merely on the knowledge that a biological brain produced
             | (or helped produce) it.
             | 
             | I tend to suspect that the latter will grow in value, not
             | diminish, as time progresses
        
               | skeezyboy wrote:
               | https://en.wikipedia.org/wiki/Pause_Giant_AI_Experiments:
               | _An...
               | 
               | people said the world could literally end if we train
               | anything bigger than chatgpt4... I would take these
               | projections with a handful of salt
        
               | pizzathyme wrote:
               | This is an incredible artifact.
        
               | skybrian wrote:
               | It seems to me that you're describing Hollywood?
               | Admittedly, there are big budget productions, but
               | Hollywood is all about fakery, it's cheap for the
               | consumer, and there's a lot of audience-pleasing dreck.
               | 
               | There's no bright line between computer and human-created
               | video - computer tools are used everywhere.
        
             | bko wrote:
             | > I like living in a world where I know that people who
              | have actually spent time on nurturing a talent get rewarded
             | for doing so, even if that talent is not something I will
             | ever be good at.
             | 
              | Rewarded how? 99.99% of people who do things like sports or
              | artistic pursuits like writing never get "rewarded for
              | doing so", at least in the way I imagine you mean the
              | phrase. The reward is usually the experience itself. When
              | someone picks up a ball or an instrument, they don't do so
              | for some material reward.
             | 
             | Why should anyone be rewarded materially for something like
             | this? Why are you so hung up on the <0.001% that can
             | actually make some money now having to enjoy the activity
              | more as a hobby than a profession?
        
               | podgietaru wrote:
               | 99.99% of people, really? You think there isn't a huge
               | swath of the economy that are made up of professional
               | writers, artists, musicians, graphic designers, and all
               | the other creative professionals that the producers of
               | these models aim to replicate the skills of?
               | 
               | Why am I so "hung up" on the livelihood of these people?
               | 
                | Doing art as a hobby is a good in and of itself. I did
               | not say otherwise. But when I see a movie, when I listen
               | to a song, I want to appreciate the integrity and talent
               | of the people that wrote them. I want them to get paid
               | for that enjoyment. I don't think that's bizarre.
        
               | holoduke wrote:
                | You can still make movies, music, etc., but now with
                | better tools. Just accept the new reality and try to play
                | at this new level. The old won't come back. It's a waste
                | of time to complain and feel frustrated. There are plenty
                | of opportunities to express your creativity.
        
             | fantasizr wrote:
              | I could see theater and live music (especially performed
              | on acoustic instruments) becoming hyper popular because
              | it'll be the only talent worth paying to see when
              | everything else is 'cheaply' made.
        
             | pessimizer wrote:
             | > I like living in a world where I know that people who
              | have actually spent time on nurturing a talent get rewarded
             | for doing so, even if that talent is not something I will
             | ever be good at.
             | 
             | That world has only existed for the last hundred or so
             | years, and the talent is usually brutally exploited by
             | people whose main talent is parasitism. Only a tiny
             | percentage of people who sell creative works can make a
             | living out of it; the living to be made is in buying their
             | works at a premium, bundling them, and reselling them,
             | while offloading almost all of the risk to the creative as
             | an "advance."
             | 
             | Then you're left in a situation where both the buyer of art
             | and the creator of art are desperate to pander to the
             | largest audience possible because everybody is leveraged.
             | It's a dogshit world that creates dogshit art.
        
           | Saline9515 wrote:
            | It still requires work and dedication, and it produces
            | authenticity. A world where AI can produce music instantly
            | commoditizes it.
        
             | skybrian wrote:
             | Music is already a commodity. You can just buy some
             | anonymous background music to play in your restaurant. No
             | effort required.
        
               | Saline9515 wrote:
               | Yes but I don't want to hear some anonymous background
               | music.
               | 
               | A better example would be Spotify replacing artist-made
                | music recommendations with low-quality alternatives, to
               | reduce what it pays to artists. Everyone except Spotify
               | loses in this scenario.
        
               | rohit89 wrote:
               | In the future, everyone will have their own ai agents
               | capable of generating music to their own tastes. They
               | won't be using spotify.
               | 
               | The future with AI is not going to be our current world
               | with some parts replaced by AI. It will be a whole new
               | way of life.
        
               | roywiggins wrote:
               | My prediction is that personal generation is going to be
               | niche forever, for purely social reasons. The demand for
               | fandoms and fan communities seems to be essentially
               | unlimited. Big artists have big fandoms, tiny ones have
               | tiny fandoms, but none of that works with personalized
               | generations.
        
               | HeatrayEnjoyer wrote:
               | Communities around fictional universes are already
               | fractured and shrinking in member size because of the
               | sheer number of algorithmically targeted universes
               | available.
               | 
               | Water cooler talk about what happened this week in
               | M.A.S.H. or Friends is extinct.
               | 
               | Worse, in the long run even community may be synthesized.
               | If a friend is meat or if they're silicon (or even carbon
               | fiber!), does it matter if you can't tell the difference?
               | It might to pre-modern boomers like me and you.
        
             | whamlastxmas wrote:
             | I mean you can just listen to human made music if that's an
             | important part of the experience for you. I doubt humans
             | are going to stop anytime soon
        
               | danelski wrote:
                | But the availability of _new_ works will change once the
                | floor of how popular you need to be to survive off of art
                | changes - and it will, since not everyone will care.
                | Taylor Swift will be fine either way, but it's not about
                | her.
        
               | Saline9515 wrote:
               | If you flood the space with AI-made music costing a few
               | cents to create, human artists will have a much harder
                | time surviving professionally.
        
         | svantana wrote:
         | You're quite the pessimist. I think the arts would do well to
         | look at sports as a glimpse of their future. Machines are
         | faster and stronger than people, but that hasn't had any impact
         | on sports at all. Nobody's tuning in to the robot olympics.
        
           | rishabhparikh wrote:
           | Agreed that no one wants to watch shotput when the ball is
           | launched out of a cannon, but people might be interested when
           | the robots competing are anthropomorphs.
           | 
           | For example, robot boxing:
           | https://www.youtube.com/watch?v=rdkwjs_g83w
        
           | AstroBen wrote:
           | Who did the visual effects of the last movie you watched?
           | 
           | Most commercial artists are very much unknown, in the
           | background. This is a different situation from sport
        
           | likium wrote:
           | A better analogy would be musicians. Recorded music is around
           | but some musicians still make a living, mostly off live
           | concerts and merch.
           | 
           | But it might also go the way of pottery, glass-making and
           | weaving. They're still around but extremely niche.
        
         | Etheryte wrote:
          | > I don't think there was ever a period of time where creative
          | works were just created for their own sake rather than for
          | sharing them with others en masse.
         | 
         | Numerous famous writers, painters, artists, etc counter this
         | idea, Kafka being a notable example, whose significant works
         | only came to light after his passing and against his will. This
         | doesn't take away from the rest of your discussion point, but
         | art always has and always will also exist solely for its own
         | sake.
        
         | zyruh wrote:
         | I agree. While I love AI, advancements must be responsible. We
         | are made to be social beings and giving more and more of lives
         | over to AI takes us away from the fundamental need to draw
         | creativity, inspiration, and connection from other people.
         | Thoughts?
        
         | quantumHazer wrote:
         | Machine learning as it is needs human data and input to
         | progress further.
         | 
         | Synthetic data can be useful until a certain point, but you
         | can't expect to have a better model on synthetic data alone
         | indefinitely.
         | 
          | The moat of GDM here is YouTube, which has a bazillion
          | gameplay and whatever videos. But here it is.
          | 
          | The downside I can see is that most people will stop publishing
          | content online for free, since these companies have absolutely
          | no respect whatsoever for the humans who created the data they
          | use.
        
           | dawnerd wrote:
           | Charging for content means nothing. Meta was pirating media
           | and training against that and I suspect everyone else is too
           | but hasn't been caught yet.
        
           | dbspin wrote:
           | I've never understood this argument... The real world is an
            | unbounded training set that it's cheap to observe with readily
           | available sensors that have existed for almost a century.
        
         | yomismoaqui wrote:
         | The question is, why are you doing art?
         | 
         | - Because you enjoy it
         | 
          | - Because you get pats on the back from people you share it
         | with
         | 
         | - Because you want to earn money from it
         | 
         | The 1st one will continue to be true in this dystopian AI art
          | future, the others not so much.
         | 
          | And honestly, I find that kind of human art, the kind that
          | comes from a pure inner force, the most interesting.
         | 
         | EDIT: list formatting
        
           | sunsunsunsun wrote:
           | You seem to forget that most artists enjoy it but due to the
           | structure of our society are forced to either give it up for
           | most of their waking life to earn money or attempt to market
           | their art to the masses to make money. This AI stuff only
           | makes it harder for artists to make any kind of living off of
           | their work.
        
             | MetaWhirledPeas wrote:
             | While there are plenty of cases where good artists make
             | most of their money from the art, there are plenty of other
             | cases where good artists have a 'real job' on the side.
        
             | jjrh wrote:
              | Ideally AI makes it so you don't have to work and can
              | pursue whatever interests you.
        
           | assword wrote:
           | > The 1st one will continue to be true in this dystopian AI
           | art future, the other not so much.
           | 
           | No it won't, you'll be too busy trying to survive off of what
           | pittance is left for you to have any time to waste on leisure
           | activities.
        
         | flyinglizard wrote:
          | I share your feelings. Couple that with a populist, cynical
          | political climate that couldn't create effective regulations
          | even if it wanted to, and with the fact that, by its very
          | appetite for scale, AI thrives in the hands of the few who can
          | feed it, and you get something quite bleak.
         | 
         | My only hope is that we could have created 100k nukes of
         | monstrous yields but collectively decided not to. We instead
         | created 10k smaller ones. We could have destroyed ourselves
         | long ago but managed to avoid it.
        
         | roboboffin wrote:
         | In theory, creativity is an infinite space. As technology
         | advances it allows humans to explore more and more complex
         | things; take the advancement of music as an example, synths,
         | loops etc.
         | 
         | If humans are not stretched to their limits, and are still able
         | to be creative, then the tools will help us find our way
         | through this infinite space.
         | 
         | AI will never be able to generate everything for us, because
         | that means it will need infinite computation.
        
           | rowanG077 wrote:
           | AI will not be able to generate everything for us. Just the
           | things that are able to be explored by humans and hopefully a
           | tad bit more. AI is already more creative than humans by a
           | lot of measures.
        
             | roboboffin wrote:
             | Depends what you mean by creativity. In some ways, AI is
             | not creative at all, everything is generated by mapping
             | text to visuals using diffusion modelling via a shared
             | latent space. It has no agency or creative thought of its
             | own.
             | 
              | Humans have demonstrated time and again that even things
              | beyond our experience can be explored by us; quantum
              | mechanics for example. Humans find a way to map very
              | complex subjects to
             | our own experience using analogy. Maybe AI can help us go
             | further by allowing us to do this on even more complex
             | ideas.
        
           | seanw444 wrote:
           | It doesn't need to generate _everything_. It only needs to be
           | marginally better or more efficient than a human for it to
           | start generating _everything humans need when needed_.
           | 
           | Edit: left the page open for a while before responding, and
           | the other person responded with basically the same thing
           | within that time.
        
             | roboboffin wrote:
             | If human need drives the creative process, then there will
             | always be a human in the loop. Instead, each human becomes
             | the "random seed" that initialises the process based on
             | their own unique make-up. This is only different from how
             | things work now, in that humans are also creating the
             | artefact.
             | 
             | Similar to how synths meant we no longer need to play an
              | instrument by plucking strings, it hasn't affected the
             | higher level creativity of creating music, only expanded
             | it.
        
         | pixelesque wrote:
         | What's interesting to me along these lines is I assume most of
         | the companies funding the research are targeting the "creative"
         | media in terms of image generation, music generation, avatars,
          | speech, etc.
         | 
         | I can understand it's very interesting from a researcher's
         | point-of-view (I'm a software dev who's worked adjacent to some
         | ML researchers doing pipeline stuff to integrate models into
         | software), but at the same time: Where are the robots to do
         | menial work like clean toilets, kitchens, homes, etc?
         | 
         | I assume the funding isn't there? Or maybe it's much less
         | exciting to research diffusion networks for image generation
          | than working out algorithms for the best way to clean toilets
         | :)
        
           | dingnuts wrote:
           | robotics is difficult and since transformers are just next
           | word predictors they can't actually help us design those
           | robots :)
           | 
           | also the billionaires have help so they don't give a shit if
           | the menial stuff is automated or not. throw in a little
           | misogyny by and large too; I saw a LinkedIn Lunatic in the
           | wild (some C-level) saying laundry is already automated
           | because laundry machines exist
           | 
           | fucking.. tell me you don't ever do the laundry without
           | telling me. That guy's poor wife.
        
           | einarfd wrote:
            | There are companies out there working on those problems as
            | well. What the funding climate for them is like, I don't
            | know, but the market for smart robots should be gigantic, so
            | there must be some. Keep in mind that what is easy or hard
            | for a human, the result of billions of years of evolution,
            | isn't necessarily the same as what is hard or easy for our
            | technologies.
        
           | cherry_tree wrote:
           | There was a recent talk about using vision language models to
           | train robots to do household tasks:
           | https://youtu.be/a8-QsBHoH94
           | 
           | I wonder how advanced world models like genie 3 would change
            | the approach, if at all.
        
           | lentil_soup wrote:
           | Or replacing CEOs, investors, bankers? I would have thought
           | those would be easier to replace than creating robots to
           | clean or replacing artists, or even developers. Maybe I am
           | wrong?
        
             | SirHumphrey wrote:
              | All these jobs are more about who you know than what you
              | know. The social network of these people is often an
              | integral part of the work, so they are in a sense much
              | safer than programmers, accountants and artists.
        
         | myahio wrote:
         | What specific form of creative media is this supposed to
          | replace though? I feel like it's just going to create a brand
         | new, exciting category of entertainment. I personally fail to
         | see any bad precedent within this announcement.
        
         | skeezyboy wrote:
          | A reminder: most of the world does manual labour in exchange
          | for money. An LLM can't help with that and never will.
        
           | rowanG077 wrote:
           | There is huge progress in robotics. Which includes fruits
           | from the LLM hype. A lot of manual labor will be able to be
           | done by humanoid robots.
        
         | Wissenschafter wrote:
         | "Granted, you can say "you still can play musical
         | instruments/paint pictures/etc for yourself", but I don't think
         | there was ever a period of time where creative works were just
         | created for sake of itself rather for sharing it with others at
         | masse."
         | 
         | I sit and play guitar by myself all the time, I play for nobody
         | but myself, and I enjoy it a lot. Your argument is absurd.
        
         | furyofantares wrote:
          | > but I don't think there was ever a period of time where
          | creative works were just created for their own sake rather
          | than for sharing them with others en masse
         | 
         | Kids do it all the time.
         | 
         | > So what is final state here for us?
         | 
         | Something I haven't seen discussed too much is taste - human
         | tastes change based on what has come before. What we will care
         | about tomorrow is not what we care about today.
         | 
         | It seems plausible to me that generative AI could get higher
         | and higher quality without really touching how human tastes
         | changes. That would leave a lot of room for human creativity
         | IMO - we have shared experience in a changing world that seems
         | very hard to capture with data.
        
         | mbowcut2 wrote:
         | It's not a new problem (for individuals), though perhaps at an
         | unprecedented scale (so, maybe a new problem for civilization).
          | I'm sure there were blacksmiths who felt they had lost their
         | meaning when they were replaced by industrial manufacturing.
        
         | Kiro wrote:
         | I don't understand your argument at all. I've made hundreds of
         | songs in my life that I haven't shared with anyone and so have
         | all other musicians I know. The act of creating is separate
         | from finding or having an audience. In fact, I would say that
         | the complete opposite of what you say is true.
         | 
         | And even so, music production has been a constant evolution of
         | replacing prior technologies and making it easier to get into.
         | It used to be gatekept by expensive hardware.
        
         | p4coder wrote:
          | Today the physical world is largely mechanized; we rarely
          | walk, run, or lift heavy things for survival, so we grow fat
          | and weak unless we exercise. Tomorrow the vast majority of us
          | will never think, create, or investigate to earn a living, so
          | we will get dumber and dumber over time. A small minority of
          | us will keep polishing their intellect but will never be
          | smarter than machines, just like the best athletes of today
          | can't outrun machines.
        
           | pizzathyme wrote:
            | This is a surprisingly great analogy because millions of
           | people still run every week for their own benefit (physical
           | and mental health, social connection, etc).
           | 
           | I wonder if mental exercises will move to the same category?
           | Not necessarily a way to earn money, but something everybody
           | does as a way of flourishing as a human.
        
             | psbp wrote:
             | The process of thinking and exploring ideas is inherently
             | enriching.
             | 
             | Nothing can take away your ability to have incredible
             | experiences, except if the robots kill us all.
        
               | thinkingtoilet wrote:
               | I don't know... There are plenty of otherwise capable
               | adults who just get home from work and watch TV. They
               | either never, or extremely rarely, indulge in hobbies, go
               | see a concert, or even go out to meet others. Not that TV
                | can't be art and challenge us, but let's be honest, 99% of
               | it is not that.
        
               | psbp wrote:
               | I have been this person. I can say that it's not a time
               | of my life I look back on fondly.
        
         | dartharva wrote:
         | I look at it as the pendulum swinging back.
         | 
         | For too long has humanity been collectively submerged into this
         | hyper-consumption of the arts. We, our parents and our
          | grandparents have been getting bombarded by one or another
         | form of artificial dopamine sweets - from videos to reels to
         | xeets to "news" to ads to tunes to mainstream media - every
         | second of the day, every single day. The kind of media
         | consumption we have every day is something our forefathers
         | would have been overwhelmed by within an hour. It is not
         | natural.
         | 
         | This complete cheapening of the arts is finally giving us a
         | chance to shed off this load for good.
        
         | michalf6 wrote:
         | Nick Land kind of took this line of reasoning to its ultimate
         | conclusion, I recommend giving his ideas a read even if they
         | sound repulsive.
         | 
         | "Nothing human makes it out of the near-future."
        
           | pizzathyme wrote:
           | I've tried and failed to find a good starting point for his
           | ideas. Do you recommend any?
        
             | michalf6 wrote:
             | This is pretty decent, at least the first half:
             | https://www.youtube.com/watch?v=lrOVKHg_PJQ
        
         | lbrito wrote:
         | >And how you are supposed to pay for that, if all work is
         | automated? How economics of that is supposed to work?
         | 
         | With UBI, probably. With a central government formed by our
         | robot overlords. But why even pay us at that point?
        
         | curious_cat_163 wrote:
         | > So what is final state here for us? Return to menial not-yet-
         | automated work? And when this would be eventually automated,
         | what's left? Plug our brains to personalized autogenerated
         | worlds that are tailored to trigger related neuronal circuitry
         | for producing ever increasing dopamine levels and finally burn
         | our brains out (which is arguably already happening with
         | tiktok-style leasure)? And how you are supposed to pay for
         | that, if all work is automated? How economics of that is
         | supposed to work?
         | 
         | Wow. What a picture! Here's an optimistic take, fwiw: Whenever
         | we have had a paradigm shift in our ability to process
         | information, we have grappled with it by shifting to higher-
         | level tasks.
         | 
         | We tend to "invent" new work as we grapple with the technology.
          | The job of a UX designer did not exist in the 1970s (at least not
         | as a separate category employing 1000s of people; now I want to
         | be careful this is HN, so there might be someone on here who
         | was doing that in the 70s!).
         | 
         | And there is capitalism -- if everyone has access to the best-
          | in-class model, then no one has a true edge in a competition.
         | That is not a state that capitalism likes. The economics _will_
         | ultimately kick in. We just need this recent S-curve to settle
         | for a bit.
        
         | rohit89 wrote:
         | > So what is final state here for us?
         | 
         | I think we have a long way to go yet. Humanity is still in the
         | early stages of its tech tree with so many unknown and unsolved
         | problems. If ASI does happen and solves literally everything,
         | we will be in a position that is completely alien to what we
         | have right now.
         | 
         | > How to even find the value in living given all of that?
         | 
         | I feel like a lot of AI angst comes from people who place their
         | self-worth and value on external validation. There is value in
         | simply existing and doing what you want to do even if nobody
         | else wants it.
        
         | tekacs wrote:
          | We can dream bigger: when music, images, video and 3D assets
          | are far easier to make, treat them as primitives.
         | 
         | We can use these to create entire virtual worlds, games,
         | software that incorporates these, and to incorporate creativity
         | and media into infinitely more situations in real life.
         | 
         | We can create massive installations that are not a single image
         | but an endless video with endless music, and then our hand
         | turns to stabilizing and styling and aestheticizing those
         | exactly in line with our (the artist's) preferences.
         | 
         | Romanticizing the idea that picking at a guitar is somehow
         | 'more creative' than using a DAW to create incredibly complex
         | and layered and beautiful music is the same thing that's
         | happening here, even if the primitives seem 'scarier' and
         | 'bigger'.
         | 
         | Plus, there are many situations in life that would be made
         | infinitely more human by the introduction of our collective
         | work in designing our aesthetic and putting it into the world,
         | and encoding it into models. Installations and physical spaces
         | can absolutely be more beautiful if we can produce more, taking
         | the aesthetic(s) that we've built so far and making them
         | dynamic to spaces.
         | 
         | Also for learning: as a young person learning to draw and sing
         | and play music and so many other things, I would have
         | tremendously appreciated the ability to generate and follow
         | subtle, personalized generation - to take a photo of a scene in
         | front of me and have the AI first sketch it loosely so that I
         | can copy it, then escalate and escalate until I can do
         | something bigger.
        
         | stillpointlab wrote:
         | > I don't buy argument from people who are saying they are
         | excited to live in this age
         | 
         | What argument is required for excitement? Excitement is a
         | feeling not a rational act. It comes from optimism and
         | imagination. There is no argument for optimism. There is often
         | little reason in imagination.
         | 
         | > How to even find the value in living given all of that?
         | 
         | You might have heard of the Bhagavad Gita, a 2000+ year old
         | spiritual text. It details a conversation between a warrior
         | prince and a manifestation of God. The warrior prince is facing
         | a very difficult battle and he is having doubts justifying any
         | action in the face of the decisions he has to make. He is
         | begging this manifestation of God to give him good reasons to
         | act, good reasons not just to throw his weapons down, give away
         | all his possessions and sit in a cave somewhere.
         | 
         | There are no definite answers in the text, just meditations on
         | the question. Why should we act when the result is ultimately
         | pointless, we will all die, people will forget you, situations
         | will be resolved with or without you, etc.
         | 
         | This isn't some new question that LLMs are forcing us to
         | confront. LLMs are just providing us a new reason to ask the
         | same age-old questions we have been facing for as long as
         | writing has existed.
        
           | HocusLocus wrote:
           | Genie 3 not only groks the Bhagavad Gita, it can generate
           | "Blue & Elephant People: The Movie".
        
         | vessenes wrote:
         | Don't be mad bro. Seriously. Every single person working on a
         | film has creative input, not just someone hand painting a
         | backdrop. You have an immense number of tools available to be
         | creative with now. This is a great thing!
        
         | pessimizer wrote:
         | > I don't think there was ever a period of time where creative
         | works were just created for sake of itself rather for sharing
         | it with others at masse.
         | 
         | You don't think there was ever a time without a mass media
         | culture? Plenty of people have furniture older than mass media
         | culture. Even 20 years ago people could manage to be creative
         | for a tiny audience of what were possibly other people doing
         | creative things. It's only the zoomers who have never lived in
         | a world where you never thought to consider how you could sell
         | the song you were writing in your bedroom to the Chinese
         | market.
         | 
         | It used to be that music didn't come on piano rolls, records,
         | tapes, CDs or files. It used to be that your daughter would
         | play music on the piano in the living room for the entire
         | family. Even if it was music that wouldn't really sell, and
         | wasn't perfectly played, people somehow managed to enjoy it. It
         | was not a situation that AI could destroy. If anything, AI
         | could assist.
        
         | stronglikedan wrote:
         | > How to even find the value in living given all of that?
         | 
         | If your value in living is in any way affected by AI, ever,
         | then, well, let's just say I would never choose that for
         | myself. Good luck.
        
         | rikroots wrote:
         | > Granted, you can say "you still can play musical
         | instruments/paint pictures/etc for yourself", but I don't think
         | there was ever a period of time where creative works were just
         | created for sake of itself rather for sharing it with others at
         | masse.
         | 
         | There's a whole host of "art" that has been created by people -
         | sometimes for themselves, sometimes for a select few friends -
         | which had little purpose beyond that creation[1]. Some people
         | create art because they simply _have_ to create art - for
         | pleasure, for therapy, for whatever[2]. For many, the act of
         | creation was far more important than the act of
         | distribution[3].
         | 
         | For me, my obsession is constructing worlds, maps, societies
         | and languages that will almost certainly die with me. And
         | that's fine. When I feel the compulsion, I'll work on my
         | constructions for a while, until the compulsion passes - just
         | as I have done (on and off) for the past 50 years. If the world
         | really needs to know about me, then it can learn more than it
         | probably wants to know through my poetry.
         | 
         | [1] - Emily Dickinson is an obvious example:
         | https://en.wikipedia.org/wiki/Emily_Dickinson
         | 
         | [2] - Coral Castle, Florida:
         | https://en.wikipedia.org/wiki/Coral_Castle
         | 
         | [3] - Federico Garcia Lorca almost certainly didn't write his
         | Sonetos del amor oscuro for publication - he just needed to
         | write them:
         | https://es.wikisource.org/wiki/Sonetos_del_amor_oscuro
        
         | neom wrote:
         | In my opinion, what humans need, crave, chase, is novelty. Just
         | look at how phobic we are of boredom. I believe creativity is
         | part of the chasing of novelty, or the allaying of boredom. I
         | studied film making in my 20s when the shift to digital
          | happened, and I was in the first cohort through the first digital
         | film program in my country. When new ways to create become
         | available, the people who struggle are often the ones who are
         | unable to adapt their mindset to the new creative mediums and
         | don't think "what is new to be done here". Many people when I
         | graduated thought I was totally nuts of not owning or using an
         | analogue camera, so many reasons, oh you can't trust the CF
         | cards, oh the HDR will never get there, oh the shutter is too
         | slow. This is just a version of that imo. I think AI and
         | robotics are going all the way to the end, I'm trying to adjust
         | my old man brain to the new world the best I can, feel blessed
         | to have been part of a version of this before.
        
         | tomrod wrote:
         | Branding and differentiation.
         | 
         | People still value Amish furniture or woodworking despite Ikea
         | existing. I love that if I want a cheap chair made of cardboard
         | and glue that I can find something to satisfy that need; but I
         | still buy nice furniture when I can.
         | 
         | AI creations are analogous. I've seen some cool AI stuff, but
         | it definitely doesn't replace the real "organic" art one finds.
        
           | HeatrayEnjoyer wrote:
           | What if it's not cardboard and glue but woodworking of ultra-
           | master quality?
           | 
           | These fears aren't realized if AI never achieves superhuman
            | performance, but what if it does?
        
         | mindwok wrote:
         | Man, same here. I was initially a massive AI evangelist up
         | until about a year ago, now I just feel sad for some reason -
         | and I don't want to feel sad, I'm a technologist at heart and
         | I've been thrilled by every advance since I was born. I feel
         | like some sad old boomer yelling at clouds and I'm not even 30
         | yet.
         | 
         | My only hope is this: I think the depression is telling us
         | something real, we are collectively mourning what we see as the
         | loss of our humanity and our meaning. We are resilient
         | creatures though, and hopefully just like the ozone layer, junk
         | food, and even the increasing rejections of social media and
         | screen time, we will navigate it and reclaim what's important
         | to us. It might take some pain first though.
        
         | lubujackson wrote:
         | Be comforted by the fact that no matter how good the AI gets,
          | people crave human connection. AI can generate music, but
         | there is an uncanny valley effect where you quickly deduce
         | there's no true humanity behind any of it, and ultimately
         | undervalue it. At best you can have something like Minecraft or
         | Dwarf Fortress where the generated worlds CAN be inspiring to a
         | degree, but that is because the rules around generation are
         | incredibly intricate and, ultimately, human.
         | 
         | Yes, AI can make music that sounds decent and lyrics that rhyme
         | and can even be clever. But listen to a couple songs and your
         | brain quickly spots the patterns. Maybe AI gets there some day,
         | but the uncanny valley seems to be quite a chasm - and anything
         | that approaches the other side seems to do so by piling lots of
         | human intention along the way.
        
         | imiric wrote:
         | I can relate. It's exhausting.
         | 
         | The main challenge over the next decade as all our media
         | channels are flooded with generated media will become curation.
         | We desperately need ways to filter human-created content from
         | generated content. Not just for the sake of preserving art, but
         | for avoiding societal collapse from disinformation, which is a
         | much more direct and closer threat. Hell, we've been living
         | with the consequences of mass disinformation for the past
         | decade, but automated and much more believable campaigns
         | flooding our communication platforms will drastically lower the
         | signal-to-noise ratio. We're currently unable to even imagine
         | the consequences of that, and are far from being prepared for
         | it.
         | 
         | This tech needs strict regulation on a global scale. Anyone
         | against this is either personally invested in it, or is
         | ignorant of its dangers.
        
         | rolfus wrote:
         | I'm one of those excited people! We haven't lost anything with
         | this new technology, only gained.
         | 
         | The way I see it, most people aren't creative. And the people
         | who are creatives are mostly creating for the love of it. Most
         | books that are published are read exclusively by the friends
         | and family of the author. Most musicians, most stand-up
          | comedians, most artists get to show off their works for small
         | groups of people and make no money doing so. But they do it
         | anyway. I draw terrible portraits, make little inventions and
         | sometimes I build something for the home, knowing full well
         | that I do these things for my own enjoyment and whatever ego
         | boost I get from showing these things off to people I know.
         | 
         | I'm doing a marathon later and I've been working my ass off for
         | the prospect of crossing the finishing line as number four
         | thousand and something, and I'll do it again next year.
        
         | j_timberlake wrote:
         | I don't know how on Earth people can think like this. Most
         | people can find "value" in a slice of pizza. It doesn't even
         | have to be a good pizza.
         | 
         | Or kittens and puppies. Do you think there won't be kittens and
         | puppies?
         | 
         | And that's putting aside all the obvious space-exploration
         | stuff that will probably be more interesting than anything the
         | previous 100 billion humans ever saw.
        
         | tim333 wrote:
         | >So what is final state here for us?
         | 
         | The merge. (https://blog.samaltman.com/the-merge)
         | 
         | I'm quite enthusiastic. I've always thought mortality sucks.
        
         | HardCodedBias wrote:
         | "Creativity is taken from us at exponential rate"
         | 
         | Nothing is being taken away.
        
       | seydor wrote:
       | Would progress in these be faster if they created 3d meshes and
       | animations instead of full frame videos?
        
         | zbrw wrote:
          | I believe that the corpus of video data to train on far
          | exceeds that of 3D data. It's also much cheaper to produce
         | video data. So I'd expect that this is probably the quickest
         | way forward from a current world state perspective.
         | 
          | Additionally, video seems like a pretty straightforward output
          | shape to me - a 2D image with a time component. If we were
          | talking 3D
         | assets and animations I wouldn't even know where to start with
         | modeling that as input data for training. That seems really
         | hard to model as a fixed input size problem to me.
         | 
         | If there was comparable 3D data available for training, I'd
         | guess that we'd see different issues with different approaches.
         | 
         | A couple of examples that I could think of quickly: Using these
         | to build games, might be easier if we could interact with the
         | underlying "assets". Getting photorealistic results with
         | intricate detail (e.g. hair, vegetation) might be easier with
         | video based solutions.
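          | 
          | To make that shape mismatch concrete, here's a toy sketch
          | (PyTorch; all the numbers are made up): a video clip is
          | naturally a fixed-shape tensor, while meshes vary in size and
          | structure per asset.
          | 
          |   import torch
          | 
          |   # A video clip has a natural fixed shape:
          |   # (time, channels, height, width).
          |   clip = torch.randn(48, 3, 720, 1280)  # e.g. 2 s at 24 fps, 720p
          | 
          |   # 3D content has no such canonical shape: vertex/face counts
          |   # differ per asset, and animations differ in rig structure
          |   # and keyframe count.
          |   mesh_a = {"verts": torch.randn(1_204, 3),
          |             "faces": torch.randint(0, 1_204, (2_400, 3))}
          |   mesh_b = {"verts": torch.randn(89_331, 3),
          |             "faces": torch.randint(0, 89_331, (177_000, 3))}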
        
           | nosignono wrote:
           | If the fidelity of the video is high enough, you could use
           | SFM to build point clouds from the generated video frames and
           | essentially do photogrammatry on the assets from a genie
           | video.
        
         | teamonkey wrote:
         | I always wonder why they're not chasing more pragmatic and
         | lower-hanging fruit first.
         | 
         | There's absolutely no reason that a game needs to be generated
         | frame-by-frame like this. It seems like a deeply unserious
         | approach to making games.
         | 
         | (My feeling is that it must be easier to train this way.)
        
           | seydor wrote:
            | well actually image output is fixed and there's lots of
           | training data. Neural networks can learn anything in their
           | latent space so there is no need to impose 3D rendering
            | constraints, and it's not evident that it's less efficient
           | (for the model).
           | 
           | 3D model rendering would be useful however for interfacing
           | with robots.
        
             | teamonkey wrote:
             | You often view 3D games on a 2D screen. That doesn't mean
             | that a game is natively 2D and the 3D world is an
             | inconvenient step that can be bypassed. Actually the
             | opposite, the 2D representation on screen is just a
             | projection.
             | 
             | In VR, for example, the same 3D scene will be rendered
              | twice, once for each eye, from two viewpoints roughly 6-7 cm
              | apart (the interpupillary distance).
             | 
             | If you don't have an internal 3D representation of the
             | world, the AI would need to generate _exactly_ the same
             | scene from a very slightly different perspective for each
             | eye, without any discrepancies or artefacts.
             | 
             | And that's not even discussing physics, collisions or any
             | form of consistent world logic that happens off-screen. Or
             | multiplayer!
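              | 
              | A toy numpy sketch of the "same scene, two eyes" step (the
              | numbers and helper below are illustrative, not any
              | particular engine's API):
              | 
              |   import numpy as np
              | 
              |   # One scene, two cameras offset by half the interpupillary
              |   # distance (IPD). A classical renderer projects the same
              |   # geometry twice; a frame generator would instead have to
              |   # hallucinate two mutually consistent images.
              |   IPD = 0.063  # ~63 mm average IPD, in metres
              | 
              |   def look_at(eye, target, up=np.array([0.0, 1.0, 0.0])):
              |       f = target - eye; f = f / np.linalg.norm(f)     # forward
              |       s = np.cross(f, up); s = s / np.linalg.norm(s)  # right
              |       u = np.cross(s, f)                              # true up
              |       view = np.eye(4)
              |       view[0, :3], view[1, :3], view[2, :3] = s, u, -f
              |       view[:3, 3] = -view[:3, :3] @ eye
              |       return view
              | 
              |   head   = np.array([0.0, 1.7, 0.0])
              |   target = np.array([0.0, 1.7, -5.0])
              |   right  = np.array([1.0, 0.0, 0.0])
              | 
              |   left_eye_view  = look_at(head - right * IPD / 2, target)
              |   right_eye_view = look_at(head + right * IPD / 2, target)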
        
       | kouteiheika wrote:
       | > To that end, we're exploring how we can make Genie 3 available
       | to additional testers in the future.
       | 
       | No need to explore; I can tell you how. Release the weights to
       | the general public so that everyone can play with it and non-
       | Google researchers can build their work upon it.
       | 
       | Of course this isn't going to happen because "safety". Even
       | telling us how many parameters this model has is "unsafe".
        
         | Dlanv wrote:
         | Modern AI wouldn't exist without Google's contributions. Yet
         | they're a for-profit company. I'm ok with them keeping some
         | things closed source every now and then.
        
       | Davidzheng wrote:
       | This is one of the most insane feats of AI I have ever seen to be
       | honest.
        
       | obayesshelton wrote:
       | Strap on a headset and we are one step closer to being in a
       | simulation.
        
         | artificialprint wrote:
         | Jokes on u, I'm already in a simulation
        
       | sirolimus wrote:
       | Not open source, not worth it. Next.
        
       | jl6 wrote:
       | Have they explained anywhere what hardware resources it takes to
       | run this in 720p at 24fps with minutes-long context?
        
       | addisonj wrote:
       | Really impressive... but wow this is light on details.
       | 
       | While I don't fully align with the sentiment of other commenters
       | that this is meaningless unless you can go hands on... it is
       | crazy to think of how different this announcement is than a few
       | years ago when this would be accompanied by an actual paper that
       | shared the research.
       | 
       | Instead... we get this thing that has a few aspects of a paper -
       | authors, demos, a bibtex citation(!) - but none of the actual
       | research shared.
       | 
       | I was discussing with a friend that my biggest concern with AI
       | right now is not that it isn't capable of doing things... but
       | that we switched from research/academic mode to full value
       | extraction _so fast_ that we are way out over our skis in terms
       | of what is being promised, which, in the realm of exciting new
       | field of academic research is pretty low-stakes all things
       | considered... to being terrifying when we bet policy and
       | economics on it.
       | 
       | To be clear, I am not against commercialization, but the
       | dissonance of this product announcement made to look like
        | research, written in this way, at the same time that one of the
        | preeminent mathematicians is writing about how our shift in
        | funding of real academic research is having real, serious impact is...
       | uh... not confidence inspiring for the long term.
        
       | demirbey05 wrote:
        | This is a bad use of AI; we should spend our compute on making
        | science faster. I am pretty confident the computational cost of
        | this will be maybe 100x that of a ChatGPT query. I don't even
        | want to think about the environmental effects.
        
       | dev0p wrote:
       | That's completely bonkers. We are making machines dream of
       | explorable, editable, interactable worlds.
       | 
       | I wonder how much it costs to run something like this.
        
       | netdur wrote:
        | Mark Zuckerberg must be very, very upset looking at this. I
        | expect him to throw another billion dollars at Google engineers.
        
       | idiotsecant wrote:
        | A Mind needs a few things: the ability to synthesize sensor data
        | about the outside world into a form that can be compressed into
        | important features, the ability to choose which of those features
        | to pay attention to, the ability to model the physical world
        | around it, find reasonable solutions to problems, and simulate
        | its actions before taking them, the ability to understand and
        | simulate the actions of _other_ Minds, the ability to compress
        | events into important features and store them in memory, the
        | ability to _retrieve_ those memories at appropriate times and in
        | appropriate clarity, etc.
       | 
       | I feel like as time goes on more and more of these important
       | features are showing up as disconnected proofs of concept. I
       | think eventually we'll have all the pieces and someone will just
       | need to hook them together.
       | 
       | I am more and more convinced that AGI is just going to eventually
        | _happen_ and we'll barely notice because we'll get there inch by
       | inch, with more and more amazing things every day.
        
       | superjan wrote:
        | There are very few people visible in the demos. I suppose that
       | is harder?
        
       | badmonster wrote:
       | a massive leap forward for real-time world modeling
        
       | Bjorkbat wrote:
       | Genuinely technically impressive, but I have a weird issue with
       | calling these world simulator models. To me, they're video game
       | simulator models.
       | 
       | I've only ever seen demos of these models where things happen
       | from a first-person or 3rd-person perspective, often in the sort
       | of context where you are controlling some sort of playable
       | avatar. I've never seen a demo where they prompted a model to
       | simulate a forest ecology and it simulated the complex interplay
       | of life.
       | 
       | Hence, it feels like a video game simulator, or put another way,
       | a simulator of a simulator of a world model.
        
         | Bjorkbat wrote:
         | Also, to drive my point further home, in one of the demos they
         | were operating a jetski during a festival. If the jetski bumps
         | into a small Chinese lantern, it will move the lantern.
         | Impressive. However, when the jetski bumped into some sort of
         | floating structure the structure itself was completely
         | unaffected while the jetski simply stopped moving.
         | 
         | This is a pretty clear example of video game physics at work.
         | In the real world, both the jetski and floating structure would
         | be much more affected by a collision, but in the context of
         | video game physics such an interaction makes sense.
         | 
         | So yeah, it's a video game simulator, not a world simulator.
        
           | rohit89 wrote:
           | The goal is to eventually be able to model physics and all
           | the various interactions accurately.
        
             | Bjorkbat wrote:
             | Sure, but if you're trying to get there by training a model
             | on video games then you're likely going to wind up
             | inadvertently creating a video game simulator rather than a
             | physics simulator.
             | 
             | I don't doubt they're trying to create a world simulator
             | model, I just think they're inadvertently creating a video
             | game simulator model.
        
               | rohit89 wrote:
               | Are they training only on video game data though? I would
                | be surprised when it's so easy to generate proper training
               | data for this.
               | 
               | It is interesting to think about. This kind of training
               | and model will only capture macro effects. You cannot use
               | this to simulate what happens in a biological cell or
               | tweak a gravity parameter and see how plants grow etc.
               | For a true world model, you'd need to train models that
               | can simulate at microscopic scales as well and then have
               | it all integrated into a bigger model or something.
               | 
               | As an aside, I would love to see something like this for
               | the human body. My belief is that we will only be able to
               | truly solve human health if we have a way of simulating
               | the human body.
        
           | kridsdale3 wrote:
           | In the "first person standing in a room" demo, it's cool to
           | see 100% optical (trained from recorded footage from cameras)
           | graphics, including non-rectilinear distortion of parallel
           | lines as you'd get from a wide-angle lens and not a high-FOV
           | game engine. But still the motion of the human protagonist
           | and the camera angle were 100% trained on how characters and
           | controllers work in video games.
        
         | lubujackson wrote:
         | It doesn't feel incredibly far off from demoscene scripts that
         | generate mountain ranges in 10k bytes or something. It is
         | wildly impressive but may also be wildly limited in how it
         | accomplishes it and not extensible in a way we would like.
        
       | phgn wrote:
       | So we cannot use this yet?
       | 
       | While watching the video I was just imagining the $ increasing by
       | the second. But then it's not available at all yet :(
        
       | ACAVJW4H wrote:
        | Wondering what happens when we peer through a microscope or
       | telescope?
        
       | koakuma-chan wrote:
       | Damn, this reminds me of those Chinese FMV games on Steam.
        
       | _hark wrote:
       | Very cool! I've done research on reinforcement/imitation learning
       | in world models. A great intro to these ideas is here:
       | https://worldmodels.github.io/
       | 
       | I'm most excited for when these methods will make a meaningful
       | difference in robotics. RL is still not quite there for long-
       | horizon, sparse reward tasks in non-zero-sum environments, even
       | with a perfect simulator; e.g. an assistant which books travel
       | for you. Pay attention to when virtual agents start to really
       | work well as a leading signal for this. Virtual agents are
       | strictly easier than physical ones.
       | 
       | Compounding on that, mismatches between the simulated dynamics
       | and real dynamics make the problem harder (sim2real problem).
       | Although with domain randomization and online corrections
       | (control loop, search) this is less of an issue these days.
       | 
       | Multi-scale effects are also tricky: the characteristic temporal
       | length scale for many actions in robotics can be quite different
       | from the temporal scale of the task (e.g. manipulating
       | ingredients to cook a meal). Locomotion was solved first because
       | it's periodic imo.
       | 
       | Check out PufferAI if you're scale-pilled for RL: just do RL
       | bigger, better, get the basics right. Check out Physical
       | Intelligence for the same in robotics, with a more
       | imitation/offline RL feel.
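        | 
        | For a feel of the "learn inside the learned simulator" idea from
        | that link, here's a toy PyTorch sketch (dimensions, reward head
        | and horizon are all made up; the actual World Models setup uses a
        | VAE + MDN-RNN and an evolution-trained controller):
        | 
        |   import torch
        |   import torch.nn as nn
        | 
        |   LATENT, ACTION = 32, 4
        | 
        |   dynamics = nn.GRUCell(LATENT + ACTION, LATENT)  # z' = f(z, a)
        |   reward_head = nn.Linear(LATENT, 1)              # imagined reward
        |   policy = nn.Sequential(nn.Linear(LATENT, 64), nn.Tanh(),
        |                          nn.Linear(64, ACTION))
        | 
        |   opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
        | 
        |   for step in range(100):
        |       z = torch.zeros(16, LATENT)        # batch of imagined starts
        |       imagined_return = 0.0
        |       for t in range(15):                # short imagined horizon
        |           a = torch.tanh(policy(z))
        |           z = dynamics(torch.cat([z, a], dim=-1), z)  # dream a step
        |           imagined_return = imagined_return + reward_head(z).mean()
        |       loss = -imagined_return            # maximize imagined return
        |       opt.zero_grad()
        |       loss.backward()
        |       opt.step()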
        
       | jimmySixDOF wrote:
       | What gets me is the egocentric perspective it has naturally
       | produced from its training data, where you have the perception of
       | a 3D 6 degrees of freedom world space around you. Once it's
       | running at 90 frames per second and working in a meshed geometry
       | space, this will intersect with augmented virtual XR headsets,
       | and the metaverse will become an interaction arena for working
       | with artificial intelligence using our physical action, our gaze,
       | our location, and a million other points of background noise
       | telemetry, all of which will be integrated into what we now today
       | call context and the response will be adjusting in a useful,
       | meaningful way what we see painted into our environment. Imagine
       | the world as a tangible user interface.
        
       | mhitza wrote:
       | Just imagine if the developers of Star Citizen had access to this
       | technology, how much more they could have squeezed from
       | unsuspecting backers.
        
       | red_hare wrote:
       | Can you imagine explaining to someone from the 1800s that we've
       | created a fully generative virtual world experience and the demo
       | was "painting a wall blue"
        
         | mattjreid wrote:
         | They would be impressed by the paint roller - it wasn't
         | invented until the 1940s.
        
         | SirHumphrey wrote:
         | Reading works of early computer scientists (mathematicians?)
          | like Ada Lovelace or Alan Turing, it seems to me that they would
         | be a lot less surprised than some current observers. The idea
         | of artificial mind comes up a lot and they weren't witness to
         | 30 years of slow and uninspiring NLP developments.
        
       | sys32768 wrote:
       | I'm imagining how these worlds combined with AI NPCs could help
       | people learn real-world skills, or overcome serious anxiety
       | disorders, etc.
        
       | forrestthewoods wrote:
       | They're very clever to only turn 90 degrees. I'd like to see a
       | couple of 1080s with a little bit of 120 degree zig zagging along
       | the way please.
        
       | internetter wrote:
       | I feel like this tech is a dead end. If it could instead generate
       | 3d models which are then rendered, that would be immensely
       | useful. Eliminates memory and playtime constraints, allows it to
       | be embedded in applications like games. But this? Where do we go
       | from here? Even if we eliminate all graphical issues and get
       | latency from 1s to 0, what purpose does it serve?
        
         | calebh wrote:
         | I think the most likely path forward for
         | commercialization/widespread use is to use AI as a post-
         | processing filter for low poly games. Imagine if you could take
          | low quality/low poly assets, run them through a game engine to
         | add some basic lighting, then pass this through AI to get a
         | photo-realistic image. This solves the most egregious cases of
         | world inconsistency and still allows for creative human fine-
         | tuning. The trick will be getting the post-processor to run at
         | a reasonable frame rate.
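          | 
          | A rough offline (definitely not real-time) sketch of that idea
          | with an off-the-shelf img2img diffusion pipeline; the model id,
          | prompt and strength values are placeholders:
          | 
          |   import torch
          |   from diffusers import StableDiffusionImg2ImgPipeline
          |   from PIL import Image
          | 
          |   # Stock img2img pipeline (placeholder model id; needs a GPU).
          |   pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
          |       "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
          |   ).to("cuda")
          | 
          |   # A frame the game engine rendered from low-poly assets with
          |   # basic lighting.
          |   lowpoly_frame = Image.open("engine_frame.png").convert("RGB")
          | 
          |   # Low strength keeps the layout/geometry; the model mostly
          |   # "re-skins" the frame.
          |   styled = pipe(
          |       prompt="photorealistic forest clearing, golden hour",
          |       image=lowpoly_frame,
          |       strength=0.35,
          |       guidance_scale=7.0,
          |   ).images[0]
          |   styled.save("styled_frame.png")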
        
           | internetter wrote:
           | Don't we already have upscalers which are frequently used in
           | games for this purpose? Maybe they could go further and get
           | better but I'd expect a model specifically designed to
           | improve the quality of an existing image to be better/more
           | efficient at doing so than an image generation model
           | retrofitted to this purpose.
        
       | ralusek wrote:
       | It's interesting, because I was always a bit confused and annoyed
       | by the Giant's Drink/Mind Game that Ender plays in Ender's Game.
       | It just always felt so different to how games I knew played, it
       | felt odd that he would "discover" things that the developers
       | hadn't intended, because I always just thought "wait, someone had
       | to build that into the game just in case he happened to do that
       | one specific thing?" Or if it was implied that they didn't do
       | that, then my thought was "that's not how this works, how is it
       | coming up with new/emergent stories?"
       | 
       | This feels almost exactly like that, especially the
       | weird/dreamlike quality to it.
        
       | arjie wrote:
       | This is beautiful. An incredible device that could expand
       | people's view of history and science. We could create such
       | immersive experiences with this.
       | 
       | I know that everyone always worries about trapping people in a
       | simulation of reality etc. etc. but this would have blown my mind
       | as a child. Even _Riven_ was unbelievable to me. I spent hours in
       | _Terragen_.
        
       | guybedo wrote:
       | a lot to unpack here, i've added a detailed summary here:
       | 
       | https://extraakt.com/extraakts/google-s-genie-3-capabilities...
        
       | jp1016 wrote:
       | This looks incredibly promising not just for AI research but for
       | practical use cases in game development. Being able to generate
       | dynamic, navigable 3D environments from text prompts could save
       | studios hundreds of hours of manual asset design and prototyping.
       | It could also be a game-changer for indie devs who don't have big
       | teams.
       | 
       | Another interesting angle is retrofitting existing 2D content
       | (like videos, images, or even map data) into interactive 3D
       | experiences. Imagine integrating something like this into Google
       | Maps suddenly street view becomes a fully explorable 3D
       | simulation generated from just text or limited visual data.
        
         | creata wrote:
         | It just generates video, though, doesn't it? How are you going
         | to get usable assets out of that?
        
           | CharlieDigital wrote:
           | Why wouldn't one be able to train an AI model to extract 3D
           | models/assets out of an image/still from video?
        
       | cranium wrote:
       | I find the model very impressive, but how could it be used in the
       | wild? They mention robots (maybe to test them cheaply in
       | completely different environments?), but I don't see the use in
       | games except during development to generate ideas/assets.
        
       | muskmusk wrote:
       | Jesus.
       | 
       | This is starting to feel pretty **ing exponential.
        
       | SpaceManNabs wrote:
       | So are foundational models real finally now?
       | 
       | Are they just multimodal for everything?
       | 
       | Are foundational time series models included in this category?
        
       | nektro wrote:
       | google pushing new levels of evil with this one
        
       | bluehat974 wrote:
        | It feels like Ready Player One on Vision Pro will arrive soon.
        
       | yahoozoo wrote:
       | What format do these world models output? Since it's interactive,
       | it's not just a video...does DeepMind have some kind of
       | proprietary runtime or what?
        
         | creata wrote:
         | > Since it's interactive, it's not just a video
         | 
         | I think it just outputs image frames...
        
           | yahoozoo wrote:
           | Ah, yea. You're right. After reading a bit more, it's just
           | "responding" to the prompts/navigation with real-time
           | generation. Pretty cool.
        
       | guybedo wrote:
       | it's simulations all the way down
        
       | whatever1 wrote:
        | This is scary. I don't have a benchmark to propose, but I don't
        | think my brain can imagine things with greater fidelity than
        | this. I can probably write down the physics better, but I think
        | these systems have reached parity with at least my imagination
        | model.
        
       | unboxingelf wrote:
       | The Simulation Theory presents the following trilemma, one of
       | which must be true:
       | 
       | 1. Almost all human-level civilizations go extinct before
       | reaching a technologically mature "posthuman" stage capable of
       | running high-fidelity ancestor simulations.
       | 
       | 2. Almost no posthuman civilizations are interested in running
       | simulations of their evolutionary history or beings like their
       | ancestors.
       | 
       | 3. We are almost certainly living in a computer simulation.
        
         | lotyrin wrote:
         | If you take the idea of it needing to be a constructed
         | simulation you get the dream argument. If you add that one
         | can't verify anyone else having subjective experience you get
         | Boltzmann brain. If you add the idea that maybe the ancestor
         | simulations are designed to teach us virtuous behavior through
         | repeated visits to simulation worlds you get the karmic cycle,
         | and Boltzmann brain + karmic cycle is roughly the egg theory.
         | 
          | I think some or all of these things can be roughly true at the
          | same time. Imagine an infinite space full of chaotic noise that
          | gives rise to a solitary Boltzmann brain: the top-level universe
          | and top-level intelligence. This brain, seeking purpose and
          | company in
         | the void, dreams of itself in various situations (lower level
         | universes) and some of those universes' societies seek to
         | improve themselves through deliberate construction of karmic
         | cycle ancestor simulation. A hierarchy of self-similar
         | universes.
         | 
         | It was incredibly comforting to me to think that perhaps the
         | reason my fellow human beings are so poor at empathy,
         | inclusion, justice, is that this is a karmic kindergarten where
         | we're intended to be learning these skills (and the
         | consequences for failing to perform them) and so of course
         | we're bad at it, it's why we're here.
        
         | crazygringo wrote:
         | But there are lots of critiques of that supposed trilemma.
         | 
         | Why would beings in simulations be conscious?
         | 
         | Or maybe running simulations is really expensive and so it's
         | done sometimes (more than "almost none") but only sometimes
         | (nowhere near "we are almost certainly").
         | 
         | Or simulations are common but limited? You don't need to
         | simulate a universe if all you want to do is simulate a city.
         | 
         | The "trilemma" is an extreme example of black-and-white
         | thinking. In the real world, things cost resources and so there
         | are tradeoffs -- so middle grounds are the rule, not extremes.
        
       | lotyrin wrote:
       | Kinda wish the ski scenario had "yeti" as an event you could
       | trigger.
        
       | mason_mpls wrote:
       | The demo looks like they're being very gentle with the AI, this
       | doesn't look like much of an advancement.
        
       | mason_mpls wrote:
       | The claims being made in this announcement are not demonstrated
       | in the video. A very careful first person walk in an AI video
       | isn't very impressive these days...
        
       | qwertox wrote:
       | This is revolutionary. I mean, we already could see this coming,
       | but now it's here. With limitations, but this is the beginning.
       | 
       | In game engines it's the engineers, the software developers who
       | make sure triangles are at the perfect location, mapping to the
       | correct pixels, but this here, this is now like a drawing made by
       | a computer, frame by frame, with no triangles computed.
        
       | j_timberlake wrote:
       | People are thinking "how are video games going to use this?"
       | 
       | That's not the point, video games are worth chump-change compared
       | to robotics. Training AIs on real-world robotic arms scaled
       | poorly, so they're looking for paths that leverage what AI scales
       | well at.
        
       | maerF0x0 wrote:
       | I'm still struggling to imagine a world where predicting the next
        | pixel wins over building a deterministic thing that is then
        | run.
       | 
       | Eg: Using AI to generate textures, wire models, motion sequences
       | which themselves sum up to something that local graphics card can
       | then render into a scene.
       | 
       | I'm very much not an expert in this space, but to me it seems if
       | you do that, then you can tweak the wire model, the texture, move
       | the camera to wherever you want in the scene etc.
        
         | wolttam wrote:
         | At some point it will be computationally cheaper to predict the
         | next pixel than to classically render the scene, when talking
         | about scenes beyond a certain graphical fidelity.
         | 
         | The model can infinitely zoom in to some surface and
         | depict(/predict) what would really be there. Trying to do so
         | via classical rendering introduces many technical challenges
        
       | Vipitis wrote:
        | I wish they would share more about how it works. Maybe a
        | research paper for once? We didn't even get a technical report.
       | 
        | My best guess: it's a video generation model like the ones we
        | already have, but they condition on inputs (movement direction,
        | view angle). Perhaps they aren't relative inputs but absolute, and
       | there is a bit of state simulation going on? [although some demo
       | videos show physics interactions like bumping against objects -
       | so that might be unlikely, or maybe it's 2D and the up axis is
       | generated??].
       | 
       | It's clearly trained on a game engine as I can see screenspace
       | reflection artefacts being learned. They also train on
        | photoscans/splats... some non-realistic elements look
        | significantly lower fidelity too...
       | 
       | some inconsistencies I have noticed in the demo videos:
       | 
        | - wingsuit disocclusions are lower fidelity (maybe initialized
       | by high resolution image?)
       | 
       | - garden demo has different "geometry" for each variation, look
       | at the 2nd hose only existing in one version (new "geometry" is
       | made up when first looked at, not beforehand).
       | 
        | - school demo has half a car outside the window, and a
       | suspiciously repeating pattern (infinite loop patterns are common
       | in transformer models that lack parameters, so they can scale
       | this even more! also might be greedy sampling for stability)
       | 
       | - museum scene has odd reflection in the amethyst box, like the
        | rear mammoth doesn't have reflections on the rightmost side of
       | the box before it's shown through the box. The tusk reflection
        | just pops in. This isn't a Fresnel effect.
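        | 
        | To make the guess concrete, here is a very rough sketch of the
        | general "action-conditioned next-frame prediction" family being
        | described - emphatically not Genie 3's actual architecture, which
        | hasn't been published; every dimension and name below is made up:
        | 
        |   import torch
        |   import torch.nn as nn
        | 
        |   class ActionConditionedPredictor(nn.Module):
        |       def __init__(self, vocab=8192, dim=512, n_actions=16):
        |           super().__init__()
        |           self.frame_tok = nn.Embedding(vocab, dim)     # frame tokens
        |           self.action_emb = nn.Embedding(n_actions, dim)  # controls
        |           layer = nn.TransformerEncoderLayer(dim, nhead=8,
        |                                              batch_first=True)
        |           self.backbone = nn.TransformerEncoder(layer, num_layers=6)
        |           self.head = nn.Linear(dim, vocab)  # next-frame token logits
        | 
        |       def forward(self, frame_tokens, action):
        |           # frame_tokens: (batch, seq) tokens of past frames
        |           # action:       (batch,) control input for the next step
        |           # (a real model would also apply causal masking over time)
        |           x = (self.frame_tok(frame_tokens)
        |                + self.action_emb(action)[:, None, :])
        |           return self.head(self.backbone(x))
        | 
        |   model = ActionConditionedPredictor()
        |   logits = model(torch.randint(0, 8192, (2, 256)),
        |                  torch.tensor([3, 7]))
        |   print(logits.shape)  # torch.Size([2, 256, 8192])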
        
       | slj wrote:
       | Everyone is in agreement, this is impressive stuff. Mind blowing,
       | even. But have the good people at Google decided why exactly we
       | need to build the torment nexus?
        
       | pedalpete wrote:
       | We were working towards this years ago with Doarama/Ayvri, and I
       | remember fondly in 2018 an investor literally yelling at me that
       | I didn't know what I was talking about and AI would never be able
       | to do this. Less than a decade later, here we are.
       | 
       | Our product was a virtual 3d world made up of satellite data.
       | Think of a very quick, higher-res version of google earth, but
       | the most important bit was that you uploaded a GPS track and it
       | re-created the world around that space. The camera was always
       | focused on the target, so it wasn't a first person point of view,
       | which, for the most part, our brains aren't very good at
       | understanding over an extended period of time.
       | 
       | For those curious about the use case, our product was used by
       | every paraglider in the world, commercial drone operations,
        | transportation infrastructure sales/planning, and outdoor event
        | promotions (specifically bike and ultramarathon races).
       | 
       | Though I suspect we will see a new form of media come from this.
       | I don't pretend to suggest exactly what this media will be, but
       | mixing this with your photos we can see the potential for an
       | infinitely re-framable and zoomable type of photo media.
       | 
       | Creating any "watchable" content will be challenging if the
       | camera is not target focused, and it makes it difficult to create
       | a storyline if you can't dictate where the viewer is pointed.
        
         | swalsh wrote:
         | To be fair, I'm seeing the demo video, and I still don't
         | believe it's possible. This is sci-fi tech.
        
       ___________________________________________________________________
       (page generated 2025-08-05 23:00 UTC)