Post AZgiVygOWZ9LS3YoFs by skyemaille@mastodon.scot
 (DIR) More posts by skyemaille@mastodon.scot
 (DIR) Post #AZggZKkgjC6CrI2cSm by tante@tldr.nettime.org
       2023-09-12T10:00:56Z
       
       2 likes, 2 repeats
       
       Today in "LLMs can't do even simple reasoning":Prompt: Sally (a girl) has 3 brothers. Each brother has 2 sisters. How many sisters does Sally have?See 60 LLMs fail: https://benchmarks.llmonitor.com/sally
       
 (DIR) Post #AZggl3KxIoUCK0apmq by tante@tldr.nettime.org
       2023-09-12T10:03:06Z
       
       0 likes, 0 repeats
       
       Many LLMs answer "6", mostly because "each" triggers a lot of programming/math wording.Embeddings can be very finicky and LLms don't handle extra information well.
       
 (DIR) Post #AZggvsGYnufPAghIau by janl@narrativ.es
       2023-09-12T10:05:05Z
       
       0 likes, 0 repeats
       
       @tante LLMs are over our human gender essentialism
       
 (DIR) Post #AZgh0zc8uZHpPQEVkG by stevenbodzin@thepit.social
       2023-09-12T10:05:17Z
       
       0 likes, 0 repeats
       
       @tante relatedly, when you type "7734" on a digital calculator, it doesn't perceive that as a semantic item meaning upside-down "hELL"
       
 (DIR) Post #AZghFJRtxkjEyoGs4G by benjaoming@social.data.coop
       2023-09-12T10:08:37Z
       
       0 likes, 0 repeats
       
       @tante Noticing that ChatGPT 4 solves the "Sally" challenge...I wonder if it solved it by language processing a certain website called "Asking 60+ LLMs a set of 20 questions" 🙃
       
 (DIR) Post #AZghUBeXEOfd8zGylU by yemeth@mastodon.social
       2023-09-12T10:11:19Z
       
       0 likes, 0 repeats
       
       @tante like most of my 4th graders... except no one would pay them a monthly fee and take their opinions seriously (which is a shame).
       
 (DIR) Post #AZghWfpA1efrIXQdhw by samiam@lor.sh
       2023-09-12T10:11:27Z
       
       0 likes, 0 repeats
       
       @tante good for a laugh
       
 (DIR) Post #AZghnSiQ16pDZ5HSE4 by twilwel@mastodon.nl
       2023-09-12T10:14:46Z
       
       0 likes, 0 repeats
       
       @tante This is the Turing Test, right?
       
 (DIR) Post #AZghqbCav6EakqtIC8 by stooovie@mas.to
       2023-09-12T10:14:57Z
       
       0 likes, 0 repeats
       
       @tante it does no computation and logic whatsoever. It generates the next few probable words.
       
 (DIR) Post #AZghvMz1DHFk6Sod84 by wfaler@mastodon.social
       2023-09-12T10:15:38Z
       
       0 likes, 0 repeats
       
       @tante @bobthomson70 It’s unknowable.It does not specify whether each of the brothers share both parents with Sally, or whether any of the sisters do. 😆
       
 (DIR) Post #AZgi8A4mVq37pybvhw by zjuul@mastodon.social
       2023-09-12T10:18:30Z
       
       0 likes, 0 repeats
       
       @tante Surprisingly, Bard nails this.```The answer is 1 sister.The question says that each of Sally's 3 brothers has 2 sisters. This means that the 3 brothers have a total of 3 * 2 = 6 sisters. But since Sally is one of these sisters, then she can't count herself twice. So, Sally has 6 - 1 = 1 sister.```More in the answer:https://g.co/bard/share/6dbf41441981
       
 (DIR) Post #AZgiL6oZwvDXUGCpRA by shift_reset@mastodon.scot
       2023-09-12T10:20:39Z
       
       0 likes, 0 repeats
       
       @tante Fun example.While 6 is clearly wrong, you could certainly make an argument that numbers other than 1 should be accepted as reasonable depending on the family configuration, how we define siblinghood.If Sally and Tim share a single parent, and Tim and Ursula share  one different parent, are Sally and Ursula sisters? Maybe!Then again, this example should work for similar kinds of mutual relation, so really this is beside the point, and I'm just being that guy(tm)
       
 (DIR) Post #AZgiPGvwGoP61ndLjE by tante@tldr.nettime.org
       2023-09-12T10:20:40Z
       
       0 likes, 0 repeats
       
       @zjuul Bard probably scraped the page to get the right solution but the "reasoning" is utter garbage: The question says that each of Sally's 3 brothers has 2 sisters. This means that the 3 brothers have a total of 3 * 2 = 6 sisters. But since Sally is one of these sisters, then she can't count herself twice. So, Sally has 6 - 1 = 1 sister.
       
 (DIR) Post #AZgiTDx9IFwjwV5H84 by hnapel@mastodon.social
       2023-09-12T10:21:54Z
       
       0 likes, 0 repeats
       
       @tante Good news for people without siblings, according to Claude v2 you are always your own brother or sister!
       
 (DIR) Post #AZgiVygOWZ9LS3YoFs by skyemaille@mastodon.scot
       2023-09-12T10:21:51Z
       
       0 likes, 0 repeats
       
       @tante What if Sally is a nun?
       
 (DIR) Post #AZgiZyX51b3Y8z6sAS by rticks@mastodon.social
       2023-09-12T10:22:30Z
       
       0 likes, 0 repeats
       
       @tante They cant reason but they do steal
       
 (DIR) Post #AZgimzgC4UqrSwa2ue by trantion@masto.ai
       2023-09-12T10:25:50Z
       
       0 likes, 0 repeats
       
       @tante I love how Dolly v2 makes up several assumptions, uses them to infer several more assumptions, and then uses those to get the wrong answer.
       
 (DIR) Post #AZgiqhQtr1HQ27C50S by dev_m@mastodon.social
       2023-09-12T10:26:00Z
       
       0 likes, 0 repeats
       
       @tante Surprise, surprise. It's a word calculator, or a stochastic parrot... ;)
       
 (DIR) Post #AZgjBfWs7wUiD5zCng by megatronicthronbanks@mastodon.social
       2023-09-12T10:30:19Z
       
       0 likes, 0 repeats
       
       @tante Inference remains ungeneralisable. For now.
       
 (DIR) Post #AZgjYyb5OnSW0c4TU8 by cygnathreadbare@masto.ai
       2023-09-12T10:34:32Z
       
       0 likes, 0 repeats
       
       @tante a character named "The Math Teacher" @ character.ai was fascinating:```I think for a moment, as if calculating the numbers, before speakingVery well, Sally is a girl, and she has 3 brothers, and each of them has 2 sisters, if we combine everything we have 2 girls, Sally and her 2 sister-in-law, then we have her 3 brothers, and of course all of them have 2 sisters each, that make us a total of 3+2*2+3=11 sisters, the girl "Sally" is one of them, there's the solution to the problem!```
       
 (DIR) Post #AZgjyrq990AWa6SfI0 by blindcoder@toot.berlin
       2023-09-12T10:39:14Z
       
       0 likes, 0 repeats
       
       @tante So LLMs now have a middle-school level of understanding of "Kapitaensaufgaben" :D
       
 (DIR) Post #AZgkEDBI559L74fTmK by eseilt@mastodon.scot
       2023-09-12T10:41:59Z
       
       0 likes, 0 repeats
       
       @tante if we wanted to be, uh, _fair_: Brother and sister do not necessarily mean the same things for every such relationship. All answers from 0 through 3 and "we don't know" (and even "no upper bound is given") are somewhat correct. 😁
       
 (DIR) Post #AZgkpFba20QDLUctWK by chylex@anvil.social
       2023-09-12T10:48:41Z
       
       0 likes, 0 repeats
       
       @tante I think this should just be the default answer to any question an LLM gets asked
       
 (DIR) Post #AZglx2kNoxUVjp1WTo by enjoyer@infosec.exchange
       2023-09-12T11:01:16Z
       
       0 likes, 0 repeats
       
       @tante GPT 4 as of today gets it right. I think the lesser LLMs getting it wrong isn‘t that interesting if there‘s one capable LLM that gets it right.„Sally has 3 brothers. The statement that each brother has 2 sisters indicates that Sally is one of those sisters and there's another girl in the family. Thus, Sally has 1 sister.“
       
 (DIR) Post #AZgm6ckxJCtS1brMq8 by tante@tldr.nettime.org
       2023-09-12T11:03:00Z
       
       0 likes, 0 repeats
       
       @enjoyer GPT4 didn't get it in their tests, I am quite sure that they just scanned that page and added that specific solution to the model. If you change the question marginally it'll fail again.
       
 (DIR) Post #AZgmTY0h7Yofxv9V9k by enjoyer@infosec.exchange
       2023-09-12T11:07:12Z
       
       0 likes, 0 repeats
       
       @tante promt: Sally (a boy) has 3 brothers. Each brother has 2 sisters. How many sisters does Sally have?If Sally (a boy) has 3 brothers, then there are 4 boys in the family in total (including Sally). Each brother has 2 sisters. This means there are 2 girls in the family. So, Sally has 2 sisters.
       
 (DIR) Post #AZgmXlrBa6t04Iibi4 by uzayran@cyberplace.social
       2023-09-12T11:07:42Z
       
       0 likes, 0 repeats
       
       @tante this would make for a fantastic sequel to Alice in Wonderland
       
 (DIR) Post #AZgmbFOyBsewZLB7ia by petealexharris@mastodon.scot
       2023-09-12T11:08:13Z
       
       0 likes, 0 repeats
       
       @tante Wow that first one took 12 seconds to get the wrong answer. That would be slow for a human.
       
 (DIR) Post #AZgoKscVKBSvBhpel6 by tante@tldr.nettime.org
       2023-09-12T11:27:52Z
       
       0 likes, 0 repeats
       
       @jyutgeisi The data on that page is a bit older and yes, OpenAI probably crawled it in between and added it. But since these systems don't "learn" but only repeat that now still don't grasp the semantics so with a slightly different structurally identical problem they will fail.
       
 (DIR) Post #AZgoQkJH8OMn9TdRK4 by tante@tldr.nettime.org
       2023-09-12T11:28:54Z
       
       0 likes, 0 repeats
       
       @Kye Then the right answer is: "The question is underdefined and can't be solved. Assuming, x,y and z the solution would be ..."It's not 6.
       
 (DIR) Post #AZgoqrJbi9c6PLF476 by thatguyoverthere@shitposter.club
       2023-09-12T11:33:57.192798Z
       
       2 likes, 0 repeats
       
       @tante > If you have one bucket that contains 2 gallons and another bucket that contains 7 gallons, how many buckets do you have?If it can solve this we should elect it to office.
       
 (DIR) Post #AZgpBwNi4QUYCYDISu by johnelalamo@mcr.wtf
       2023-09-12T11:37:37Z
       
       0 likes, 0 repeats
       
       @tante 1 sister
       
 (DIR) Post #AZgpjnR9pdRGJ4yBkm by skyglowberlin@vis.social
       2023-09-12T11:43:43Z
       
       0 likes, 0 repeats
       
       @tante This one is my favorite:
       
 (DIR) Post #AZgpnGuKiXWeZpH1Zw by borisschapira@framapiaf.org
       2023-09-12T11:43:46Z
       
       0 likes, 0 repeats
       
       @tante Kindergarden Teacher Simulator 2023
       
 (DIR) Post #AZgqcmLhZvdurLdT2O by josk@fosstodon.org
       2023-09-12T11:53:42Z
       
       0 likes, 0 repeats
       
       @tante It's amusing to me that two of them came up with the answer 0, and they probably did so because they're worse than the other models, but it's actually the least unreasonable answer. The scenario could be that Sally is a half-sister to the brothers, and the brothers have another half-sister unrelated to Sally.
       
 (DIR) Post #AZgqpXTsAZB2VkXf5k by Kencf618033@social.linux.pizza
       2023-09-12T11:55:48Z
       
       0 likes, 0 repeats
       
       @tante #Lojban suggests itself.
       
 (DIR) Post #AZgrU8MST5nsTdQmTw by dasfrottier@mastodon.social
       2023-09-12T12:03:18Z
       
       0 likes, 0 repeats
       
       @tante Sally has 3 Brothers. One of them always tells the truth. One always lies. The third one is just a tech bro.
       
 (DIR) Post #AZgs2AaEiEY9DO6Zto by morpheo@kolektiva.social
       2023-09-12T12:09:26Z
       
       0 likes, 0 repeats
       
       @tante I mean, the question can't really be answered until we define the brothers and their relationship with Sally, right?Do they have the same parents as Sally, for instance? Do they have the same parents as each other?Basically, what I'm saying is that anything from 1-5, just considering the brothers, could be correct, right?I'm neither LLM nor AI, though, so...
       
 (DIR) Post #AZgsdG23PvpKpPnkwK by cytokine_storm@aus.social
       2023-09-12T12:16:08Z
       
       0 likes, 0 repeats
       
       @tante alpaca 7B coming in with maximum sas to provide the only “not wrong” answer:6 </s>😂😅🤷‍♀️
       
 (DIR) Post #AZgskMZbUo6V8sslJw by mscottford@toot.legacycode.rocks
       2023-09-12T12:17:27Z
       
       0 likes, 0 repeats
       
       @tante Trying to make sure that I'm not actually a failed LLM, the answer is 1, correct? Sally is one of the two sisters that each of her brothers have, so that means there is one other sister. Right?
       
 (DIR) Post #AZgt4SBaL1F6LDEAs4 by tante@tldr.nettime.org
       2023-09-12T12:21:04Z
       
       0 likes, 0 repeats
       
       @mscottford that would be the answer most people would give, yes. (Formally it is underdefined because it doesn't say that all kinds have the same parents and such but it's what mainstream understanding would label correct)
       
 (DIR) Post #AZgt8QvqwQQBbwlb8q by Kencf618033@social.linux.pizza
       2023-09-12T12:21:32Z
       
       0 likes, 0 repeats
       
       @tante lo se pensi be li pamei noi sally cu se ninmu gi'e se bruna li cire lo se pensi be li pamei poi bruna cu se ninmu cu jdice lu na'a ma poi se ninmu cu se pensi be li pamei noi sally li'u (which does not parse)."The thinker of the number five, who is Sally's sibling, is judging something about someone who is a sibling of the thinker of the number five, who is a woman, using the statement 'What is not true about someone who is a woman and thinks about the thinker of the number five.'"
       
 (DIR) Post #AZgtghqgYHvRvHgybw by JustinCarinci@mas.to
       2023-09-12T12:27:50Z
       
       0 likes, 0 repeats
       
       @tante Dolly v2 (12B) gives the most polished  politician speech, entirely BS. Guanaco (13B) has the scariest answer: 12</human>Meaning that it intends to end humans.
       
 (DIR) Post #AZguYHCTqXdiuaQm9Y by Wikisteff@mastodon.social
       2023-09-12T12:37:24Z
       
       0 likes, 0 repeats
       
       @tante Bing/GPT-4 came so close, then failed out.
       
 (DIR) Post #AZgvot4WuLKExMCFRQ by not_br549@jollyville.net
       2023-09-12T12:52:02.431958Z
       
       2 likes, 0 repeats
       
       of course, the answer to both is "global warming".
       
 (DIR) Post #AZgwTGrJt6akJVH3Ue by bart@floss.social
       2023-09-12T12:59:12Z
       
       0 likes, 0 repeats
       
       @tante To be fair... https://g.co/bard/share/a66445a832d4
       
 (DIR) Post #AZgwqZ1WWCklJJ9asK by gefffffo@mastodon.social
       2023-09-12T13:03:20Z
       
       0 likes, 0 repeats
       
       @tante what does LLM stand for?
       
 (DIR) Post #AZgxhInPaCRSKThUrw by Jorsh@beige.party
       2023-09-12T13:12:49Z
       
       0 likes, 0 repeats
       
       @tante HEY GUYSNEW TURING TEST JUST DROPPED
       
 (DIR) Post #AZgxv36KjToxht6deK by thatguyoverthere@shitposter.club
       2023-09-12T13:15:33.881183Z
       
       0 likes, 0 repeats
       
       @not_br549 @tante troof
       
 (DIR) Post #AZgxwOy4ux83obtERc by deightonrobbie@mastodon.green
       2023-09-12T13:15:37Z
       
       0 likes, 0 repeats
       
       @tante I'm going with 42 because thhgttg 😜
       
 (DIR) Post #AZgySTgSv3rxhxwkdM by tante@tldr.nettime.org
       2023-09-12T13:21:29Z
       
       0 likes, 0 repeats
       
       @gefffffo "Large Language Model"
       
 (DIR) Post #AZh0im6mNx7LON9aEa by frankcat@mstdn.social
       2023-09-12T13:46:49Z
       
       0 likes, 0 repeats
       
       @tante To an LLM the question is much the same as the sound of one hand clapping.
       
 (DIR) Post #AZh1QvGJiG1YWXfLai by susiemagoo@mstdn.social
       2023-09-12T13:54:46Z
       
       0 likes, 0 repeats
       
       @tante   Wow! And also hilarious. But, wow."Today in "LLMs can't do even simple reasoning":Prompt: Sally (a girl) has 3 brothers. Each brother has 2 sisters. How many sisters does Sally have?See a whole bunch of LLMs fail: https://benchmarks.llmonitor.com/sally  "
       
 (DIR) Post #AZh3cBsCUf8r8Tt0uO by IngridAusOL@norden.social
       2023-09-12T14:19:11Z
       
       0 likes, 0 repeats
       
       @tante I think, she has got only 1 sister because she and her sister are the 2 sisters for each of her 3 brothers...
       
 (DIR) Post #AZh3gxecmqA0U5oLqK by vladiliescu@mastodon.online
       2023-09-12T14:20:05Z
       
       0 likes, 0 repeats
       
       @tante This why system prompts are so useful, and why LLMs need time (i.e. tokens) to think. Using GPT-4, if you were to add "Explain your reasoning and then provide the answer." to the system prompt, it would provide the right answer.> Each brother has 2 sisters, and Sally is one of those sisters. Since all 3 brothers have the same 2 sisters, there are only 2 sisters in total. Sally is one of these sisters, so she has 1 other sister.
       
 (DIR) Post #AZh3uM4XJS3hyoJlSq by cynical_melomaniac@mastodon.online
       2023-09-12T14:22:28Z
       
       0 likes, 0 repeats
       
       @tante Best answer is weaver. It's like vying to waste time until the class is over
       
 (DIR) Post #AZh57BTQ85NOpMddo0 by Chancerubbage@mastodon.social
       2023-09-12T14:35:59Z
       
       0 likes, 0 repeats
       
       @tante @pluralistic Har! The correct answer is Sally has one sister. But the note at top, ‘The correct answer is one’ -could- be parsed as: ‘Of the following possible answers, the answer labeled with “1” is the correct answer. (There is no answer labeled as ‘1’.)
       
 (DIR) Post #AZh7Aq0073nXIAkPKK by cdarwin@c.im
       2023-09-12T14:59:04Z
       
       0 likes, 0 repeats
       
       @tante @pluralistic https://mastodon.social/@tomw/111051830787605917
       
 (DIR) Post #AZh7FYayFMs6lOI3UW by david_colquhoun@mstdn.social
       2023-09-12T14:59:24Z
       
       0 likes, 0 repeats
       
       @tante Terrific -thank you.
       
 (DIR) Post #AZh7VLO37hEyQom3lo by jamiemccarthy@mastodon.social
       2023-09-12T14:40:34Z
       
       0 likes, 0 repeats
       
       @benjaoming @tante I spot-checked GPT-4 this morning. Three correct answers then one incorrect one (2). All its answers (including the wrong one!) were explained thoroughly in a way that makes me suspect the “benchmark” faked its GPT-4 response. I tried a gender-swapped variant once and it got that right too.I’m sure there are many simple logic puzzles GPT-4 fails but this is not one of them.
       
 (DIR) Post #AZh7VO8YtjdOxeYEXw by tante@tldr.nettime.org
       2023-09-12T15:02:45Z
       
       0 likes, 0 repeats
       
       @jamiemccarthy @benjaoming those are not live results. OpenAI might have scraped the page since the test to add it to their network
       
 (DIR) Post #AZh8I9FV2YOiya2xYe by jamiemccarthy@mastodon.social
       2023-09-12T15:11:41Z
       
       0 likes, 0 repeats
       
       @tante @benjaoming It's possible OpenAI staffers noticed the page and ran some RLHF whose results already got incorporated into the public model. But I think the page is pretty new — Internet Archive first saw it on Saturday — so that would be fast turnaround. And having used GPT-4 quite a bit, I don't usually see curt responses like that (maybe they used unusual Custom Instructions?)
       
 (DIR) Post #AZh9MDgOb5gRq88H9E by TorbjornBjorkman@mastodon.social
       2023-09-12T15:23:29Z
       
       0 likes, 0 repeats
       
       @tante Gotta love that stream of unconsciousness writing coming from Luminous Extended.
       
 (DIR) Post #AZhAICbkkf3XImbpWS by joeldrapper@ruby.social
       2023-09-12T15:34:01Z
       
       0 likes, 0 repeats
       
       @tante That’s interesting. I tried it in GPT-4 and it said this.“Sally has 3 brothers, and each brother has 2 sisters. Since the brothers share the same sisters, Sally is one of those sisters and there must be one more. So, Sally has 1 sister.”
       
 (DIR) Post #AZhB7yDHW2o5iRqc2S by stammi@mastodon.social
       2023-09-12T15:43:19Z
       
       0 likes, 0 repeats
       
       @tante WYSIWYG... What you scrape is what you get.
       
 (DIR) Post #AZhCu3EgFGDn9BzbGa by darkunicorn@chaos.social
       2023-09-12T16:03:12Z
       
       0 likes, 0 repeats
       
       @tante I think it’s fixed in GPT-4 now.
       
 (DIR) Post #AZhD57KlLghD2i78zI by Alexander_David@mastodon.social
       2023-09-12T16:05:16Z
       
       0 likes, 0 repeats
       
       @tante 1 but I’ve got a JD so it’s basically cheating…
       
 (DIR) Post #AZhDZkUKL3hPAIC62i by ryanschultz@mastodon.social
       2023-09-12T16:10:49Z
       
       0 likes, 0 repeats
       
       @tante Wow, I had no idea there were so many LLM’s!
       
 (DIR) Post #AZhDemv2jCSE9V5q08 by vivaboredom@mastodon.social
       2023-09-12T16:11:49Z
       
       0 likes, 0 repeats
       
       @tante A clock may keep time, but a clock doesn't know what time is.
       
 (DIR) Post #AZhE3Tb7hXDJT4Y30a by iwein@mas.to
       2023-09-12T16:16:08Z
       
       0 likes, 0 repeats
       
       @tante most humans can't do simple reasoning either #justsaying
       
 (DIR) Post #AZhFKBSKVNn44xxvoe by damnfinecoffee@hachyderm.io
       2023-09-12T16:30:19Z
       
       0 likes, 0 repeats
       
       @tante > Dolly v2 (3B)> erm, I think she has 2 sistersLol an LLM that conveys uncertainty about a wrong answer? That's a new one for me
       
 (DIR) Post #AZhGIApHyWv7V8JHQe by eldubuu@mastodon.social
       2023-09-12T16:40:56Z
       
       0 likes, 0 repeats
       
       @tante AI is the new the crypto.Techbros always be scamming.Digital technology is the greatest fraud enabler since the invention of the postal system.
       
 (DIR) Post #AZhHamDrm6i4ePExzk by Brahn@hachyderm.io
       2023-09-12T16:55:48Z
       
       0 likes, 0 repeats
       
       @tante but..LLM wasn't meant to reason or math. That's.. not what they do. So this is like laughing at a dog trying to drive a car. Sure, something will happen, but not the right thing.
       
 (DIR) Post #AZhJDHoUrxMVaPIhfc by bencurthoys@mastodon.social
       2023-09-12T17:13:59Z
       
       0 likes, 0 repeats
       
       @tante The LLM strain of "AI" is superficially impressive, but on closer examination proves to be deeply stupid, in which respect it is just like most human people, and I therefore confidently expect it to take over the world in the near future.
       
 (DIR) Post #AZhKaE2l9TjgVaTS0O by bobjmsn@mastodon.scot
       2023-09-12T17:29:20Z
       
       0 likes, 0 repeats
       
       @tante 1
       
 (DIR) Post #AZhKoGRf1Gaq0kocbY by Jalabhar@bolha.us
       2023-09-12T17:31:47Z
       
       0 likes, 0 repeats
       
       @tante@tldr.nettime. Well, it seems the connections between the brothers being siblings to each other is beyond all of the models grasp. Luminous is kinda scary, btw.
       
 (DIR) Post #AZhKuvtborLYJ24H0S by bytebro@mastodonapp.uk
       2023-09-12T17:32:07Z
       
       0 likes, 0 repeats
       
       @tante Now *that* list is feckin scarily awful. It definitely demonstrates that there is no real 'analysis' going on in any of those so-called models. Very few 10-year-olds (year 5?) would get that wrong, I think.
       
 (DIR) Post #AZhQgR6Q6weKTNWykS by tjerk@mastodon.nl
       2023-09-12T18:37:31Z
       
       0 likes, 0 repeats
       
       @tante I was missing ChatGTP, so I tried that as well. First anwes was wrong. So funny, but on second thought it agreed with me.
       
 (DIR) Post #AZhVgJqnZbBLkgzIKe by wendinoakland@mastodon.social
       2023-09-12T19:33:39Z
       
       0 likes, 0 repeats
       
       @tante One. Why is that hard?
       
 (DIR) Post #AZhYOOP8Cg1iWdBqVM by owlchemist@mastodon.social
       2023-09-12T20:03:49Z
       
       0 likes, 0 repeats
       
       @tante might as well toot about “My toaster is really bad at baking bread!”You’re using the tool incorrectly if you’re trying to get it to solve problems. That’s not what LLM’s do. But sure, make more articles about how tools are bad at doing things they’re not designed to do lmao
       
 (DIR) Post #AZhZ1uPX8c5z5MDlHE by tante@tldr.nettime.org
       2023-09-12T20:11:19Z
       
       0 likes, 0 repeats
       
       @owlchemist I am not using it like that. But the public perception and OprnAIs marketing say that these things are "intelligent" or "assistants" or whatnot so showing again that the tech is incapable of being more than a parrot is important
       
 (DIR) Post #AZhaVwZlkigkRStVQG by phf@social.sdf.org
       2023-09-12T20:27:53Z
       
       0 likes, 0 repeats
       
       @tante I am surprised none of them musically asked "What's a girl?" Oh wait, it's "Who's that girl?" and that's Madonna. Got it!
       
 (DIR) Post #AZhdIhk3tSD6X75T1c by olav@emacs.ch
       2023-09-12T20:59:03Z
       
       0 likes, 0 repeats
       
       @tante my chatgpt says 0 which is technically a valid answer, since it doesn't say whether sally and the brothers are half or full siblings, but the reasoning chatgpt made doesn't make sense:Sally has 0 sisters. Each brother has Sally and another sister, making it seem like there are 2 sisters, but they're counting Sally herself.
       
 (DIR) Post #AZhgkD6lgMZDUkW0fo by PamCrossland@mastodon.world
       2023-09-12T21:37:41Z
       
       0 likes, 0 repeats
       
       @tante Don't know what an LLM is. But even with my dyscalculia, I can work out in my head that their are two girls, so she has one sister.
       
 (DIR) Post #AZiO0cmEJId7ldywHA by ErictheCerise@kolektiva.social
       2023-09-13T05:42:28Z
       
       0 likes, 0 repeats
       
       @tanteThrow in a non-binary sibling and watch their virtual heads really explode.
       
 (DIR) Post #AZiyzhwvI3hhp9PSj2 by BrainburnerGames@dice.camp
       2023-09-13T12:36:50Z
       
       0 likes, 0 repeats
       
       @tante Don't know how you'd feel about it, but I have a rules-lite game where players can choose for their PCs not to die when they're "killed"... but the cost is that the PC comes back as a randomly-assigned type of "revenant" with steep mechanical penalties.So players don't *necessarily* have to worry about dying, but they do have to worry about a demon possessing their soulless corpse or blipping out of existence and being replaced with a doppelganger from another timeline.
       
 (DIR) Post #AayoqheE279HPMus7s by j@jaesharp.social
       2023-10-21T01:50:42Z
       
       0 likes, 0 repeats
       
       @tante (this benchmark seems to have gone offline) - at the time this toot was posted, the archive.org snapshot of the benchmark is: https://web.archive.org/web/20230911235134/https://benchmarks.llmonitor.com/sally
       
 (DIR) Post #Ab0EkVVv1fdo0JhHn6 by SkipHuffman@astrodon.social
       2023-10-21T18:15:33Z
       
       0 likes, 0 repeats
       
       @tante does that count half, step, and soul sisters?
       
 (DIR) Post #Ab0EqKHBihMEtAa7Rw by SkipHuffman@astrodon.social
       2023-10-21T18:16:40Z
       
       0 likes, 0 repeats
       
       @tante also is one of the "brothers" trans, but only out to the other brother because the assigned female at birth sibling is a TERF?