[HN Gopher] All AI models might be the same
___________________________________________________________________
All AI models might be the same
Author : jxmorris12
Score : 95 points
Date : 2025-07-17 17:28 UTC (5 hours ago)
(HTM) web link (blog.jxmo.io)
(TXT) w3m dump (blog.jxmo.io)
| tyronehed wrote:
| Especially if they are all me-too copies of a Transformer.
|
| When we arrive at AGI, you can be certain it will not contain a
| Transformer.
| jxmorris12 wrote:
| I don't think architecture matters. It seems to be more a
| function of the data somehow.
|
| I once saw a LessWrong post claiming that the Platonic
| Representation Hypothesis doesn't hold when you only embed
| random noise, as opposed to natural images:
| http://lesswrong.com/posts/Su2pg7iwBM55yjQdt/exploring-the-p...
| blibble wrote:
| > I don't think architecture matters. It seems to be more a
| function of the data somehow.
|
| of course it matters
|
| if I supply the ants in my garden with instructions on how to
| build tanks and stealth bombers they're still not going to be
| able to conquer my front room
| TheSaifurRahman wrote:
| This only works when different sources share similar feature
| distributions and semantic relationships.
|
| The M or B game breaks down when you play with someone who knows
| obscure people you've never heard of. Either you can't recognize
| their references, or your sense of "semantic distance" differs
| from theirs. The solution is to match knowledge levels: experts
| play with experts, generalists with generalists.
|
| The same applies to decoding ancient texts: if ancient
| civilizations focused on completely different concepts than we do
| today, our modern semantic models won't help us understand their
| writing.
| npinsker wrote:
| I've played this game with friends occasionally and -- when
| it's a person -- don't think I've ever completed a game.
| TheSaifurRahman wrote:
| Has there been research on using this to make models smaller? If
| models converge on similar representations, we should be able to
| build more efficient architectures around those core features.
| yorwba wrote:
| It's more likely that such an architecture would be bigger
| rather than smaller. https://arxiv.org/abs/2412.20292
| demonstrated that score-matching diffusion models approximate a
| process that combines patches from different training images.
| To build a model that makes use of this fact, all you need to
| do is look up the right patch in the training data. Of course a
| model the size of its training data would typically be rather
| unwieldy to use. If you want something smaller, we're back to
| approximations created by training the old-fashioned way.
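|
| As a toy caricature of that patch-lookup point (my own sketch, not
| from the paper; the random "training set", image size, and 4x4
| patches are just placeholders): the "model" below is nothing but a
| bank of training patches, and generation is a nearest-neighbor
| lookup per patch.
|
|     import numpy as np
|
|     rng = np.random.default_rng(0)
|     patch = 4                              # 4x4 patches
|
|     # Stand-in for a training set: a few random 16x16 "images".
|     train_images = rng.random((50, 16, 16))
|
|     # The "model" is just a bank of every training patch.
|     bank = np.array([img[i:i+patch, j:j+patch]
|                      for img in train_images
|                      for i in range(0, 16, patch)
|                      for j in range(0, 16, patch)])
|     bank_flat = bank.reshape(len(bank), -1)
|
|     def generate(noisy):
|         # Replace each patch with its nearest training patch, i.e.
|         # the patch-combination process the diffusion model
|         # approximates.
|         out = np.empty_like(noisy)
|         for i in range(0, 16, patch):
|             for j in range(0, 16, patch):
|                 q = noisy[i:i+patch, j:j+patch].reshape(-1)
|                 best = np.argmin(((bank_flat - q) ** 2).sum(axis=1))
|                 out[i:i+patch, j:j+patch] = bank[best]
|         return out
|
|     # Every patch of the "generated" image comes straight from the
|     # training data.
|     sample = generate(rng.random((16, 16)))
|
| Such a lookup table is exactly as large as its training data, which
| is the sense in which the faithful version of the model would be
| bigger, not smaller.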
| giancarlostoro wrote:
| I've been thinking about this a lot. I want to know how small a
| model can be such that letting it browse search engines, or files
| you host locally, is actually a viable way for it to give you
| more informed answers. Is it 2GB? 8GB? Would love to know.
| empath75 wrote:
| This is kind of fascinating because I just tried to play
| Mussolini or bread with ChatGPT and it is absolutely _awful_ at
| it, even with reasoning models.
|
| It just assumes that your answers are going to be reasonably
| bread-like or reasonably Mussolini-like, and doesn't think
| laterally at all.
|
| It just kept asking me about varieties of baked goods.
|
| edit: It did much better after I added some extra explanation --
| that it could be anything, that it may be very unlike either
| choice, and that it shouldn't try to narrow down too quickly.
| fsmv wrote:
| I think an LLM is a bit too high level for this game or maybe
| it just would need a lengthy prompt to explain the game.
|
| Word2vec, used directly, is exactly the right thing to play this
| game with. Those embeddings exist inside an LLM, but the LLM is
| trained to respond like text found online, not to play this game.
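|
| Something like this, assuming a word2vec-format vector file on
| disk (the GoogleNews path is just an example) and that all three
| words are in its vocabulary:
|
|     from gensim.models import KeyedVectors
|
|     # Any word2vec-format file works here; the path is an example.
|     kv = KeyedVectors.load_word2vec_format(
|         "GoogleNews-vectors-negative300.bin", binary=True)
|
|     def closer_to(secret, a, b):
|         # One round of the game: is the secret closer to a or b?
|         if kv.similarity(secret, a) >= kv.similarity(secret, b):
|             return a
|         return b
|
|     print(closer_to("croissant", "Mussolini", "bread"))  # presumably "bread"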
| Xcelerate wrote:
| Edit: I wrote my comment a bit too early before finishing the
| whole article. I'll leave my comment below, but it's actually not
| very closely related to the topic at hand or the author's paper.
|
| I agree with the gist of the article (which IMO is basically that
| universal computation is universal regardless of how you perform
| it), but there are two big issues that prevent this observation
| from helping us in a practical sense:
|
| 1. Not all models are equally _efficient_. We already have many
| methods to perform universal search (e.g., Levin's, Hutter's,
| and Schmidhuber's versions), but they are painfully slow despite
| being optimal in a narrow sense that doesn't extrapolate well to
| real-world performance (see the toy sketch at the end of this
| comment).
|
| 2. Solomonoff induction is only optimal for _infinite_ data
| (i.e., it can be used to create a predictor that asymptotically
| dominates any other algorithmic predictor). As far as I can tell,
| the problem remains totally unsolved for _finite_ data, due to
| the additive constant that results from the question: _which_
| universal model of computation should be applied to finite data?
| You can easily construct a Turing machine that is universal and
| perfectly reproduces the training data, yet nevertheless
| dramatically fails to generalize. No one has made a strong case
| for any specific natural prior over universal Turing machines
| (and if you try to define some measure to quantify the "size" of
| a Turing machine you realize this method starts to fail once the
| number of transition tables becomes large enough to start
| exhibiting redundancy).
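|
| To make point 1 above concrete, here is a toy sketch of
| Levin-style search (the mini instruction set and target are my
| own illustration): in phase t, each program of length l gets
| roughly 2^(t-l) steps, which is optimal in Levin's sense yet
| still enumerates exponentially many programs before the shortest
| solution appears.
|
|     import itertools
|
|     def run(program, max_steps):
|         # Toy machine: letters append themselves, '*' doubles the
|         # output built so far. Returns None if out of budget.
|         out, steps = "", 0
|         for op in program:
|             steps += 1
|             if steps > max_steps:
|                 return None
|             out = out + out if op == "*" else out + op
|         return out
|
|     def levin_search(target, alphabet="ab*", max_phase=12):
|         # Phase t: every program of length l <= t gets a budget of
|         # 2**(t - l) steps, so shorter programs get tried earlier.
|         for phase in range(1, max_phase + 1):
|             for length in range(1, phase + 1):
|                 budget = 2 ** (phase - length)
|                 for prog in itertools.product(alphabet, repeat=length):
|                     if run(prog, budget) == target:
|                         return "".join(prog)
|         return None
|
|     print(levin_search("ab" * 16))  # a short program such as 'ab****'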
| im3w1l wrote:
| Regarding your second point I think there are two cases here
| that should be kept separate. The first is that you are
| teleported into a parallel dimension where literally everything
| works differently from here. In that case I do agree that there
| are several reasonable choices of models of computation. You
| simply have to pick one and hope it wasn't too bad.
|
| But the second case is that you encounter some phenomenon here
| in our ordinary world. And in that case I think you can do way
| better by reasoning about the phenomenon and trying to guess at
| plausible mechanics based on your preexisting knowledge of how
| the world works. In particular, I think guessing that "there is
| some short natural language description of how the phenomenon
| works, based on a language grounded in the corpus of human
| writing" is a very reasonable prior.
| dr_dshiv wrote:
| What about the platonic bits? Any other articles that give more
| details there?
| somethingsome wrote:
| Mmmh I'm deeply skeptical of some parts.
|
| > One explanation for why this game works is that there is only
| one way in which things are related
|
| There is not; this is a completely non-transitive relationship.
|
| On another point: suppose you keep the same vocabulary but
| permute the meanings of the words. The neural network will still
| learn relationships, completely different ones, and its
| representation may converge toward a better compression for that
| set of words, but I'm dubious that this new compression scheme
| will resemble the previous one (?)
|
| I would say that given an optimal encoding of the relationships,
| we can achieve an extreme compression, but not all encodings lead
| to the same compression at the end.
|
| If I add 'bla' between every word in a text, that is easy to
| compress, but now, if I add an increasing sequence of words
| between each word, the meaning is still there, but the
| compression will not be the same, as the network will try to
| generate the words in between.
|
| (thinking out loud)
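|
| As a rough illustration of that compression intuition (zlib as a
| crude stand-in for the network's compressor, which is admittedly
| a big simplification):
|
|     import zlib
|
|     base = "the quick brown fox jumps over the lazy dog "
|     words = (base * 200).split()
|
|     # Variant 1: the same filler word after every word.
|     constant = " ".join(w + " bla" for w in words)
|     # Variant 2: an ever-changing filler (here, a counter) instead.
|     increasing = " ".join(f"{w} filler{i}" for i, w in enumerate(words))
|
|     for name, s in [("constant 'bla'", constant),
|                     ("increasing fillers", increasing)]:
|         ratio = len(zlib.compress(s.encode())) / len(s.encode())
|         print(f"{name}: compressed to {ratio:.1%} of the original")
|
| The constant filler compresses far better, even though both
| variants carry the same underlying meaning.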
| coffeecoders wrote:
| I think "we might decode whale speech or ancient languages" is a
| huge stretch. Context is the most important part of what makes
| language useful.
|
| There are billions of human-written texts, grounded in shared
| experience, and that is what makes our AI good at language. We
| don't have that for a whale.
| klank wrote:
| If a lion could speak, would we understand it?
| eddythompson80 wrote:
| There is nothing really special about speech as a form of
| communication. All animals communicate with each other and
| with other animals. Informational density and, uhhhhh,
| cyclomatic complexity might be different between speech and a
| dance or a grunt or whatever.
| klank wrote:
| I was referencing Wittgenstein's "If a lion could speak, we
| would not understand it." Wittgenstein believed (and I am
| strongly inclined to agree with him) that our ability to
| convey meaning through communication was intrinsically tied
| to (or, rather, sprang forth from) our physical, lived
| experiences.
|
| Thus, to your point: if we assume communication is possible
| because "there's nothing really special about speech", does that
| mean we would be able to understand a lion if the lion could
| speak? Wittgenstein would say probably not. At least not
| initially, and not until we had built shared lived experiences.
| cdrini wrote:
| Hmm I'm not convinced we don't have a lot of shared
| experience. We live on the same planet. We both hunger,
| eat, and drink. We see the sun, the grass, the sky. We
| both have muscles that stretch and compress. We both
| sleep and yawn.
|
| I mean who knows, maybe their perception of these shared
| experiences would be different enough to make
| communication difficult, but still, I think it's
| undeniably shared experience.
| klank wrote:
| That's fair. To me, though, the point of Wittgenstein's lion
| thought experiment was not necessarily that _any_ communication
| would be impossible, but that we could never understand what it
| truly means to be a lion, not just what it means to be an
| animal. But we have no shared lion experiences, nor does a lion
| have human experiences. So would we be able to have
| human-to-lion communication even if we could both speak human
| speech?
|
| I think that's the core question being asked and that's
| the one I have a hard time seeing how it'd work.
| cdrini wrote:
| Hmm, I'm finding the premise a bit confusing, "understand
| what it truly meant to be a lion". I think that's quite
| different than having meaningful communication. One could
| make the same argument for "truly understanding" what it
| means to be someone else.
|
| My thinking is that if something is capable of human-
| style speech, then we'd be able to communicate with them.
| We'd be able to talk about our shared experiences of the
| planet, and, if we're capable of human-style speech,
| likely also talk about more abstract concepts of what it
| means to be a human or lion. And potentially create new
| words for concepts that don't exist in each language.
|
| I think the fact that human speech is capable of abstract
| concepts, not just concrete concepts, means that shared
| experience isn't necessary to have meaningful
| communication? It's a bit handwavy, depends a bit on how
| we're defining "understand" and "communicate".
| klank wrote:
| > I think the fact that human speech is capable of
| abstract concepts, not just concrete concepts, means that
| shared experience isn't necessary to have meaningful
| communication?
|
| I don't follow that line of reasoning. To me, in that
| example, you're still communicating with a human, who
| regardless of culture, or geographic location, still
| shares an immense amount of shared life experiences with
| you.
|
| Or, they're not. For example (an intentionally extreme
| example), I bet we'd have a super hard time talking about
| homotopy type theory with a member of an Amazon rainforest
| tribe. Similarly, I'd bet they have their own abstract
| concepts that they would not be able to easily explain to
| us.
| Isamu wrote:
| If we had a sufficiently large corpus of lion-speech we
| could build an LLM (Lion Language Model) that would
| "understand" as well as any model could.
|
| Which isn't saying much, it still couldn't explain Lion
| Language to us, it could just generate statistically
| plausible examples or recognize examples.
|
| To translate Lion speech you'd need to train a
| transformer on a parallel corpus of Lion to English, the
| existence of which would require that you already
| understand Lion.
| klank wrote:
| And even, assuming the existence of a Lion to English
| corpus, it would only give us Human word approximations.
| We experience how lossy that type of translation is
| already between Human->Human languages. Or sometimes
| between dialects within the same language.
|
| Who knows; we don't really have good insight into how
| this information loss, or disparity, grows. Is it linear?
| Exponential? Presumably there is a threshold beyond which
| we simply have no ability to translate while retaining a
| meaningful amount of the original meaning.
|
| Would we know it when we tried to go over that threshold?
|
| Sorry, I know I'm rambling. But it has always been
| regularly on my mind and it's easy for me to get on a
| roll. All this LLM stuff only kicked it all into
| overdrive.
| cdrini wrote:
| Hmm I don't think we'd need a rosetta stone. In the same
| way LLMs associate via purely contextual usage the
| meaning of words, two separate data sets of lion and
| English, encoded into the same vector space, might pick
| up patterns of contextual usage at a high enough level to
| allow for mapping between the two languages.
|
| For example, given thousands of English sentences with
| the word "sun", the vector embedding encodes the meaning.
| Assuming the lion word for "sun" is used in much the same
| context (near lion words for "hot", "heat", etc), it
| would likely end up in a similar spot near the English
| word for sun. And because of our shared context living on
| earth/being animals, I reckon many words likely will be
| used in similar contexts.
|
| That's my guess though, note I don't know a ton about the
| internals of LLMs.
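|
| A tiny correspondence-free sketch of that intuition (all the
| vectors and "lion words" below are made up): if the two embedding
| spaces really do share relational structure, you can match them
| by comparing distances within each space, with no parallel data
| at all.
|
|     import itertools
|     import numpy as np
|
|     rng = np.random.default_rng(0)
|
|     # Made-up "English" embeddings for a few shared concepts.
|     en_words = ["sun", "hot", "cold", "night", "water", "hunt"]
|     en_vecs = rng.normal(size=(6, 8))
|
|     # Made-up "lion" embeddings: same relational structure, but in
|     # a rotated coordinate system with a scrambled vocabulary.
|     rotation, _ = np.linalg.qr(rng.normal(size=(8, 8)))
|     order = rng.permutation(6)      # hidden ground-truth alignment
|     li_words = [f"roar{i}" for i in range(6)]
|     li_vecs = en_vecs[order] @ rotation
|
|     def dists(X):
|         # Pairwise distances within one space; rotation-invariant.
|         return np.linalg.norm(X[:, None] - X[None, :], axis=-1)
|
|     D_en, D_li = dists(en_vecs), dists(li_vecs)
|
|     # Match vocabularies using only their internal distance
|     # structure. Brute force is fine for 6 words; real
|     # unsupervised methods approximate this at scale.
|     best = min(itertools.permutations(range(6)),
|                key=lambda p: np.abs(D_en - D_li[np.ix_(p, p)]).sum())
|
|     for i, j in enumerate(best):
|         print(en_words[i], "<->", li_words[j])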
| ecocentrik wrote:
| That was a philosophical position on the difficulty of
| understanding alien concepts and language, not a hard
| technological limit.
| klank wrote:
| I'm missing why that distinction matters given the thread
| of conversation.
|
| Would you care to expound?
| kouru225 wrote:
| Knowing lions I bet all they'd talk about is being straight
| up dicks to anyone and everyone around them so yea I think we
| probably could ngl
| UltraSane wrote:
| We should understand common concepts like hungry, tired,
| horny, pain, etc.
| streptomycin wrote:
| _Is it closer to Mussolini or bread? Mussolini._
|
| _Is it closer to Mussolini or David Beckham? Uhh, I guess
| Mussolini. (Ok, they're definitely thinking of a person.)_
|
| That reasoning doesn't follow. Many things besides people would
| have the same answers, for instance any animal that seems more
| like Mussolini than Beckham.
| jxmorris12 wrote:
| Whoops. I hope you can overlook this minor logical error.
| streptomycin wrote:
| Oh yeah it's absolutely an interesting article!
| pjio wrote:
| I believe the joke is about David Beckham not really being
| (perceived as) human, even when compared to personified evil
| Fomite wrote:
| Oswald Mosley
| gerdesj wrote:
| The devil is in the details.
|
| I recently gave the "Veeam Intelligence" a spin.
|
| Veeam is a backup system spanning quite a lot of IT systems with
| a lot of options - it is quite complicated but it is also a
| bounded domain - the app does as the app does. It is very mature
| and has extremely good technical documentation and a massive
| amount of technical information docs (TIDs) and a vibrant and
| very well informed set of web forums, staffed by ... staff and
| even the likes of Anton Gostev -
| https://www.veeam.com/company/management-team.html
|
| Surely they have close to the perfect data set to train on?
|
| I asked a question about moving existing VMware replicas from one
| datastore to another and how to keep my replication jobs working
| correctly. In this field, you may not be familiar with my
| particular requirements but this is not a niche issue.
|
| The "VI" came up with a reasonable sounding answer involving a
| wizard. I hunted around the GUI looking for it (I had actually
| used that wizard a while back). So I asked where it was and was
| given directions. It wasn't there. The wizard was genuine but its
| usage here was a hallucination.
|
| A human might have done the same thing with some half remembered
| knowledge but would soon fix that with the docs or the app
| itself.
|
| I will stick to reading the docs. They are really well written
| and I am reasonably proficient in this field so actually - a
| decent index is all I need to get a job done. I might get some of
| my staff to play with this thing when given a few tasks that they
| are unfamiliar with and see what it comes up with.
|
| I am sure that domain specific LLMs are where it is at but we
| need some sort of efficient "fact checker" system.
| ieie3366 wrote:
| LLMs are brute-force reverse-engineered human brains. Think about
| it. Any written text out there is written by human brains. The
| "function" that outputs this is whatever happens inside the
| brain, and it is insanely complex.
|
| LLM "training" is just brute-forcing the same function into
| existence: "the human brain output X, the LLM output Y; mutate it
| a billion times until X and Y start matching."
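|
| As a toy caricature of that "mutate until X and Y match" loop (a
| character bigram table instead of an LLM, and a tiny made-up
| text):
|
|     import numpy as np
|
|     text = "the cat sat on the mat the cat ate the rat "
|     vocab = sorted(set(text))
|     ix = {c: i for i, c in enumerate(vocab)}
|     V = len(vocab)
|
|     # The "function" here is a table of next-character logits;
|     # training nudges it until its guesses match the human text.
|     logits = np.zeros((V, V))
|     lr = 0.5
|     for _ in range(200):
|         for a, b in zip(text, text[1:]):
|             p = np.exp(logits[ix[a]])
|             p /= p.sum()                     # the model's output (Y)
|             target = np.zeros(V)
|             target[ix[b]] = 1.0              # the human's output (X)
|             logits[ix[a]] -= lr * (p - target)  # shrink the mismatch
|
|     p = np.exp(logits[ix["e"]])
|     p /= p.sum()
|     print(repr(vocab[int(np.argmax(p))]))    # after 'e': ' '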
| tgsovlerkhgsel wrote:
| I've noticed that many of the large, separately developed AIs
| often answer with remarkably similar wording to the same
| question.
| kindkang2024 wrote:
| The Dao can be spoken of, yet what is spoken is not the eternal
| Dao.
|
| So, what is the Dao? Personally, I see it as will -- something we
| humans express through words. Even though we speak different
| languages -- Chinese, Japanese, English... -- behind them all
| lies a similar will.
|
| Large language models learn from word tokens and begin to grasp
| this will -- and in doing so, they become the Dao.
|
| In that sense, I agree: "All AI models might be the same."
| foxes wrote:
| So in the limit the models representation space has one dimension
| per "concept" or something, but making it couple things together
| is what actually makes it useful?
|
| An infinite dimensional model with just one dim per concept would
| be sorta useless, but you need things tied together?
| IAmNotACellist wrote:
| I agree LLMs are converging on a current representation of
| reality based on the collective works of humanity. What we need
| to do is provide AIs with realtime sensory input, simulated
| hormones each with their own half-lives based on metabolic
| conditions and energy usage, a constant thinking loop, and
| discover a synthetic psilocybin that's capable of causing
| creative, cross-neural connections similar to human brains. We
| have the stoned ape theory, we need the stoned AI theory.
___________________________________________________________________
(page generated 2025-07-17 23:00 UTC)