Post ASeFBuuZKqkC3Ms9VA by virtuous_sloth@mstdn.ca
(DIR) Post #ASeEx6lyx6PqUtnKfQ by simon@fedi.simonwillison.net
2023-02-13T20:35:38Z
0 likes, 3 repeats
Wow.. while we were all making fun of Google's Bard demo for making some small mistakes about the James Webb Space Telescope, it turns out the Bing demo was wildly hallucinating made up financial comparisons between Gap and Lululemon! https://dkb.blog/p/bing-ai-cant-be-trusted
(DIR) Post #ASeFBuuZKqkC3Ms9VA by virtuous_sloth@mstdn.ca
2023-02-13T20:37:24Z
0 likes, 0 repeats
@simon Let's test AI in production, the best kind of testing!
(DIR) Post #ASeFPsIDrqdHzDakxk by emmah@wandering.shop
2023-02-13T20:37:29Z
0 likes, 0 repeats
@simon so you mean they are all crap?
(DIR) Post #ASeFg8FEKLAu5pwbFA by Randy_Au@recsys.social
2023-02-13T20:40:43Z
0 likes, 0 repeats
@simon can't wait for this whole situation to be written off as a collective hallucination
(DIR) Post #ASeHKK0Q7Q4K6hkEoi by simon@fedi.simonwillison.net
2023-02-13T21:02:11Z
0 likes, 0 repeats
These are some seriously misleading errors!
> Lululemon’s gross margin is given as “58.7%”, which is a hallucinated value that doesn’t appear in their financial document. The real value is 55.9%.
>
> Lululemon’s operating margin is 19%, not 20.7%.
>
> Lululemon’s diluted earnings per share is $2.00 not $1.65.
>
> Cash and cash equivalents is wrong for Gap (should be $679 million) but correct for Lululemon.
>
> Inventory is wrong for Gap (should be $3.04 billion) but correct for Lululemon.
(DIR) Post #ASeHWYXFvKiye2wAEa by SloanLA@mastodon.social
2023-02-13T21:04:08Z
0 likes, 0 repeats
@simon @mattjhodgkinson It isn’t a small mistake. It’s how these work. There is no verification of anything they produce, breaking expectations of users everywhere.
(DIR) Post #ASeIPPXX4QlCfmDuLI by mattjhodgkinson@scicomm.xyz
2023-02-13T21:12:16Z
0 likes, 0 repeats
@SloanLA @simon There needs to be anchoring in verifiable information built in to make these tools of any use.
(DIR) Post #ASeIPQF8SJder0cjJ2 by simon@fedi.simonwillison.net
2023-02-13T21:14:12Z
0 likes, 0 repeats
@mattjhodgkinson @SloanLA the wild thing here is that's supposed to be how the Bing one works!
It runs regular searches and, according to the leaked prompts at least, instructs the language model to use only those facts in its output, and provide citations
Problem is you can't actually tell a language model to do that - it's still going to predict random made up next tokens, because that's how language models work
(DIR) Post #ASeJoLjGTYmy31ANwu by zhksh@sigmoid.social
2023-02-13T21:29:55Z
0 likes, 0 repeats
@simon @mattjhodgkinson @SloanLA yeah, this will go down in history as a (totally predictable) bs use case. But it does look like LLMs can be used on top of proper search. Check out this recent paper by FAIR: https://arxiv.org/abs/2302.04761
(DIR) Post #ASeKFvH2QgGK6f6zMO by mattjhodgkinson@scicomm.xyz
2023-02-13T21:34:34Z
0 likes, 0 repeats
@simon @SloanLA You can lead an LLM to sources, but you can’t make it think.
(DIR) Post #ASeMu3so2WsGIb2q0m by Tattered@mastodon.social
2023-02-13T22:04:50Z
0 likes, 0 repeats
@simon Anyone who has ever used Bing as a search engine is completely unsurprised. It seems to have a built-in randomiser. There is a reason that it’s allowed in China; like, good luck finding anything on Bing. Bing AI was always going to be psychedelic babbling.
(DIR) Post #ASeZQNVHpDFjrsDgkS by Kaker@infosec.exchange
2023-02-14T00:24:52Z
0 likes, 0 repeats
@simon GPTs have no episodic memory, I guess they'll keep hallucinating. The transformer predicts a vector that is mostly a general idea and the final step is basically the decoder of a VAE so it will generate plausible-sounding stuff from any general idea. The way to improve would be to remember training data which search engines already are kind of doing and transformers are query/key/value based so should not be too long.
(DIR) Post #ASeZj6sTrceTo106aW by polyna@toot.community
2023-02-14T00:28:35Z
0 likes, 0 repeats
@simon No shit, the day Microsoft makes something that doesn’t suck is probably the day they start making vacuum cleaners.
(DIR) Post #ASf2DgXGePIvnhRioy by n1k0@mamot.fr
2023-02-14T05:47:47Z
0 likes, 0 repeats
@simon Clippy sabotaged it
(DIR) Post #ASgltfRhmaWng9CHtw by RellyAB@mastodon.social
2023-02-15T01:54:29Z
0 likes, 0 repeats
@simon The memes about this from the inside have been Rather Good.