 (DIR) Post #ASeXmy2sRbu1f9Zj72 by ct_bergstrom@fediscience.org
       2023-02-13T23:20:44Z
       
       1 like, 2 repeats
       
       Meta. OpenAI. Google. Your AI chatbot is not *hallucinating*. It's bullshitting. It's bullshitting because that's what you designed it to do. You designed it to generate seemingly authoritative text "with a blatant disregard for truth and logical coherence," i.e., to bullshit.
       
 (DIR) Post #ASeXoEijvpjdfZlGC0 by bikerglen@mastodon.social
       2023-02-13T23:25:24Z
       
       0 likes, 0 repeats
       
       @ct_bergstrom Perfect for writing hollow but good-sounding corporate PR flack pieces. Or the Trader Joe's Fearless Flyer product descriptions.
       
 (DIR) Post #ASeqT6YoyQ8RPSl7nU by moultano@sigmoid.social
       2023-02-13T23:30:12Z
       
       0 likes, 0 repeats
       
       @ct_bergstrom Disagree. They're designed to mimic what a human would write. If they end up bullshitting, it's because the models aren't good enough, not because that's what they're designed to do.
       
 (DIR) Post #ASeqT78cpIm7CVVibY by ct_bergstrom@fediscience.org
       2023-02-13T23:34:46Z
       
       0 likes, 0 repeats
       
       @moultano Humans have an underlying knowledge model. They have beliefs about the world, and choose whether to represent those beliefs accurately or inaccurately using language. LLMs do not have an underlying knowledge model; they don't have a concept of what is true or false in the world. They just string together words they don't "understand" in ways that are likely to seem credible. It's not a matter of making better LLMs; it'll take a fundamentally different type of model.
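
       To make the mechanistic claim above concrete, a minimal sketch of what "stringing together words in ways likely to seem credible" looks like in code: plain autoregressive next-token sampling. The model choice (GPT-2 via the Hugging Face transformers library) and the prompt are illustrative assumptions, not anything from this thread; the point is only that nothing in the loop consults a notion of truth.

       # Minimal sketch: autoregressive sampling from a causal language model.
       # Assumptions for illustration: GPT-2 and the prompt below. Each step
       # samples a plausible next token; no step checks whether the text is true.
       import torch
       from transformers import AutoModelForCausalLM, AutoTokenizer

       tokenizer = AutoTokenizer.from_pretrained("gpt2")
       model = AutoModelForCausalLM.from_pretrained("gpt2")
       model.eval()

       prompt = "The capital of Australia is"
       ids = tokenizer(prompt, return_tensors="pt").input_ids

       with torch.no_grad():
           for _ in range(20):
               logits = model(ids).logits[:, -1, :]               # scores for the next token
               probs = torch.softmax(logits, dim=-1)              # distribution over the vocabulary
               next_id = torch.multinomial(probs, num_samples=1)  # sample a plausible token
               ids = torch.cat([ids, next_id], dim=-1)            # append and continue

       print(tokenizer.decode(ids[0]))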
       
 (DIR) Post #ASeqT7kYYH7H69G0jA by moultano@sigmoid.social
       2023-02-13T23:40:15Z
       
       0 likes, 0 repeats
       
       @ct_bergstrom LLMs represent whether they "believe something to be true" in a way that you can extract unsupervised. I'm not disagreeing that their world model isn't good enough to be used without auxiliary retrieval, but there's some evidence that they have one. https://arxiv.org/abs/2212.03827
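
       The linked paper (Burns et al., "Discovering Latent Knowledge in Language Models Without Supervision", arXiv:2212.03827) trains a probe on hidden states for contrast pairs ("X? Yes" / "X? No") using only consistency and confidence losses, with no truth labels. Below is a minimal sketch of that objective, assuming random toy vectors stand in for a frozen model's hidden states; the names and data are illustrative, not the authors' code.

       # Minimal sketch of the CCS (contrast-consistent search) objective.
       # Toy random vectors stand in for hidden states of statement/negation pairs.
       import torch
       import torch.nn as nn

       class CCSProbe(nn.Module):
           """Linear probe mapping a hidden state to a probability of 'true'."""
           def __init__(self, hidden_dim: int):
               super().__init__()
               self.linear = nn.Linear(hidden_dim, 1)

           def forward(self, h: torch.Tensor) -> torch.Tensor:
               return torch.sigmoid(self.linear(h))

       def ccs_loss(p_pos: torch.Tensor, p_neg: torch.Tensor) -> torch.Tensor:
           # Consistency: a statement and its negation should get complementary
           # probabilities, p(x+) close to 1 - p(x-).
           consistency = (p_pos - (1.0 - p_neg)) ** 2
           # Confidence: rule out the degenerate answer p(x+) = p(x-) = 0.5.
           confidence = torch.minimum(p_pos, p_neg) ** 2
           return (consistency + confidence).mean()

       hidden_dim, n_pairs = 64, 128
       h_pos = torch.randn(n_pairs, hidden_dim)   # stand-ins for "statement is true" states
       h_neg = torch.randn(n_pairs, hidden_dim)   # stand-ins for "statement is false" states

       probe = CCSProbe(hidden_dim)
       opt = torch.optim.Adam(probe.parameters(), lr=1e-2)
       for step in range(100):
           opt.zero_grad()
           loss = ccs_loss(probe(h_pos).squeeze(-1), probe(h_neg).squeeze(-1))
           loss.backward()
           opt.step()
       # On real hidden states, the trained probe's output behaves like a truth
       # score that was never given labels -- the "extract unsupervised" claim.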
       
 (DIR) Post #ASeqT8Bqumx0So1oH2 by not2b@sfba.social
       2023-02-14T02:42:39Z
       
       0 likes, 0 repeats
       
       @moultano @ct_bergstrom They consume language and then produce language. Their "beliefs" can be about the structure of the English language (when generating text in English), like that adjectives that describe color always go after adjectives that describe size: "the little red hen", not "the red little hen". But they don't have a model of the external world.
       
 (DIR) Post #ASeqT8mihiRQJ9HFjs by ct_bergstrom@fediscience.org
       2023-02-14T03:20:44Z
       
       0 likes, 0 repeats
       
       @not2b @moultano Precisely. Their "beliefs" have no anchor point outside the world of text.
       
 (DIR) Post #ASeqT9Cx8BQPcVYCcy by moultano@sigmoid.social
       2023-02-14T03:24:03Z
       
       0 likes, 0 repeats
       
       @ct_bergstrom @not2b Huge fractions of what I know have no anchor point outside of text, like nearly all science and math.
       
 (DIR) Post #ASeqT9Icn5xhu6CjT6 by moultano@sigmoid.social
       2023-02-13T23:51:22Z
       
       0 likes, 0 repeats
       
       @ct_bergstrom Another way of disentangling things: for the question you're asking, does the answer exist on the web? If it does, then the problem can't be with the "design" (i.e., the training regime) but rather with the power of the model.
       
 (DIR) Post #ASeqT9cpZy7oulerxo by TedUnderwood@sigmoid.social
       2023-02-14T03:37:53Z
       
       0 likes, 0 repeats
       
       @moultano @ct_bergstrom @not2b Not a whole lot of anchoring for history either. But there are already multimodal models, like Flamingo. If text really has to be grounded in sense experience, we will presumably see that research path take the lead and produce better textual prediction.
       
 (DIR) Post #ASer44f6zl28EhKVoO by TedUnderwood@sigmoid.social
       2023-02-14T03:44:36Z
       
       0 likes, 0 repeats
       
       @moultano @ct_bergstrom @not2b If multimodal training doesn’t make a model much better at predicting text, then at some point we’ll need to revise our priors and consider the possibility that a functional world model can largely be inferred from text.
       
 (DIR) Post #ASf4y1eseHYUXbEQQC by moultano@sigmoid.social
       2023-02-14T06:20:23Z
       
       0 likes, 0 repeats
       
       @TedUnderwood @ct_bergstrom @not2b I think it's plausible that a multimodal model might eventually benefit, but the bandwidth advantage of text over video is just too great. You'd need enough video frames to have cause and effect, physics, plot, object permanence.
       
 (DIR) Post #ASfCA7HoFX059L9XM0 by Ben_Carver@hcommons.social
       2023-02-14T07:41:00Z
       
       0 likes, 0 repeats
       
       @TedUnderwood @moultano @ct_bergstrom @not2b This connects with Angus Fletcher's article in Narrative ("Why Computer AI Will Never Do What We Imagine It Can"), where he argues that narrative capacity derives from 500 million years of evolutionary practice at "flailing a flagellum or other primitive limb (...) in response to positive and negative reinforcement."
       
 (DIR) Post #ASfUVb4oU0jOnbSyHY by TedUnderwood@sigmoid.social
       2023-02-14T11:06:35Z
       
       0 likes, 0 repeats
       
       @Ben_Carver @moultano @ct_bergstrom @not2b I remember that essay. Surprised but grateful that people are lining up to make falsifiable predictions about this stuff.
       
 (DIR) Post #ASfeuQgjF8slMRgGMC by TedUnderwood@sigmoid.social
       2023-02-14T13:03:08Z
       
       0 likes, 0 repeats
       
       @moultano @ct_bergstrom @not2b I agree. Eventually sense experience will help, but language alone is providing more of a world model than lots of us would have expected. And if that’s true, then the people dismissing predict-the-next-word as inherently just bullshit generation are probably sneering too hastily.
       
 (DIR) Post #ASfjazIVMAheg9tnCy by jef@mathstodon.xyz
       2023-02-14T13:55:37Z
       
       0 likes, 0 repeats
       
       @TedUnderwood @moultano @ct_bergstrom @not2b People who understand the technology of large language models aren't dismissing it as "inherently just bullshit generation", but they are warning that its output smoothly mixes both fact and falsehood with no distinction or care. And they warn that the quantity and impact of this #bullshit could likely surpass that of #politics, #consumerism, and other forms of rampant #disinformation for which we humans have demonstrated we are poorly prepared.