# Marcus on AI

## ChatGPT in Shambles

After two years of massive investment and endless hype, GPT's reliability problems persist

Gary Marcus, Feb 04, 2025

Legendary investor Masayoshi Son assures us that AGI is imminent. Quoting the Wall Street Journal yesterday: "Just a few months ago, [Masayoshi] Son predicted that artificial general intelligence, or AGI, would be achieved within two to three years. 'I now realize that AGI would come much earlier,' he said Monday."

So I decided this morning to give the latest (unpaid) version of ChatGPT a whirl, asking it to make some basic tables, in hopes of putting its newfound skills to work.

Things started off so well that I wondered if it was I who needed to update my AGI timelines. As the system spat out the first few lines of a neatly formatted table on income and population in the United States, apparently responsive to my question, I started to wonder, "Have I been unfair to Generative AI?" The start was undeniably impressive.

[screenshot]

Only later (at the time I was reading this on my phone and not looking carefully) did I notice that more than half the states were left out. That should have been a clue.

From there, things went downhill fast. I next asked for an extra column on population density.

[screenshot]

That last line made me want to live in Alaska! But wait, I wondered, what happened to Florida? (And a couple dozen other states? South Dakota? The Carolinas? Arizona? Washington? More were left out than included.) I inquired:

[screenshot]

As is its tendency, ChatGPT apologized, "sincerely," and made a much longer list (for brevity, not shown here, but follow the link if you care) ... and this time it omitted Alaska. Oops!
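Omitting states is exactly the kind of failure that conventional software catches trivially. A minimal sketch (the truncated table below is hypothetical, standing in for the one ChatGPT produced) of checking a generated table against a canonical list:

```python
# Canonical list of the 50 U.S. states, used to audit a generated table.
US_STATES = {
    "Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado",
    "Connecticut", "Delaware", "Florida", "Georgia", "Hawaii", "Idaho",
    "Illinois", "Indiana", "Iowa", "Kansas", "Kentucky", "Louisiana",
    "Maine", "Maryland", "Massachusetts", "Michigan", "Minnesota",
    "Mississippi", "Missouri", "Montana", "Nebraska", "Nevada",
    "New Hampshire", "New Jersey", "New Mexico", "New York",
    "North Carolina", "North Dakota", "Ohio", "Oklahoma", "Oregon",
    "Pennsylvania", "Rhode Island", "South Carolina", "South Dakota",
    "Tennessee", "Texas", "Utah", "Vermont", "Virginia", "Washington",
    "West Virginia", "Wisconsin", "Wyoming",
}

def missing_states(table_rows):
    """Return, sorted, the states absent from a table's first column."""
    listed = {row[0] for row in table_rows}
    return sorted(US_STATES - listed)

# A hypothetical truncated table like the ones in this post:
rows = [("California", 39_500_000), ("Texas", 30_000_000)]
print(len(missing_states(rows)))  # 48 states missing
```

The point is not that anyone needs this script; it is that completeness here is a one-line set difference, not a hard problem.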
At least it didn't include Canada.

After another verifying query, and another apology ("You're right to double-check! I'll ensure that all 50 U.S. states are included. Let me verify and provide a complete, accurate table"), Alaska was re-included. Hooray!

At this point I moved on to a more serious challenge: Canadian provinces, and wait for it ... vowels.

[screenshot]

Never mind that I didn't ask for territories (the bottom three lines); the vowel counts looked hinky. So I invited ChatGPT to double-check. AGI, here we come!

[screenshot]

Wait, what? Is h a vowel now? (See the answer for Northwest Territories.) And 9 vowels in British Columbia? I live there, and that doesn't sound right. (And, no, I am not even going to go into the o it alleges lives in the name Prince Edward Island.)

Let's see if maybe we can sort some of this out:

[screenshot]

Nope, still wrong.

Luckily I am patient. Third try on my adopted province is a charm!

[screenshot]

Now hear me out. I do honestly believe that if we spent $7 trillion on infrastructure we could fix all ... oh, never mind.

But one last question, in the interest of future improvements:

[screenshot]

In sum, in the space of a few exchanges, over the course of 10 minutes, ChatGPT:

* failed, multiple times, to properly count to 50
* failed, multiple times, to include a full list of all US states
* reported that the letter h could be a vowel, at least when it appeared in the word Northwest
* couldn't count vowels to save its electronic life
* issued numerous corrections that were wrong, never acknowledging uncertainty until after its errors were called out
* "lied" about having a subconscious (in fairness, ChatGPT doesn't really lie; it just spews text that often bears little resemblance to reality, but you get my drift)

The full conversation, including all the prompts I used, can be found here.
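For the record, counting vowels is a one-liner in any programming language, and the deterministic version has no trouble with h. A minimal sketch:

```python
def count_vowels(name: str) -> int:
    """Count occurrences of a, e, i, o, u (case-insensitive).
    h is not a vowel; y is not counted here either."""
    return sum(ch in "aeiou" for ch in name.lower())

for place in ["British Columbia", "Northwest Territories", "Prince Edward Island"]:
    print(place, count_vowels(place))
# British Columbia 6
# Northwest Territories 7
# Prince Edward Island 6
```

Six vowels in British Columbia, not nine, and they are exactly where you'd expect them.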
Against all the constant claims of exponential progress that I see practically every day, ChatGPT still seems like pretty much the same mix of brilliance and stupidity that I wrote about more than two years ago, in "How come GPT can seem so brilliant one minute and so breathtakingly dumb the next?" (December 1, 2022).

By coincidence, Sayash Kapoor, co-author of AI Snake Oil, reported some tests of OpenAI's new Operator agent this morning, pushing the extreme boundaries of intelligence by testing ... expense reports.

[screenshot]

Things didn't go that well there either. Although (impressively) Operator was able to navigate to the right websites and even uploaded some things correctly, the train soon derailed:

[screenshot]

Great summary. As Davis and I have been arguing since 2019, trust is of the essence, and we still aren't there. But honestly, if AI can't do Kapoor's expense reports or my simple tables, is AGI really imminent? Who is kidding whom?

On my Kindle, this is the book I just started, a classic from an earlier century:

[screenshot]

I'll close with one of the quotes I highlighted in the first chapter:

> In reading the history of nations, we find that, like individuals, they have their whims and their peculiarities; their seasons of excitement and recklessness, when they care not what they do. We find that whole communities suddenly fix their minds upon one object, and go mad in its pursuit; that millions of people become simultaneously impressed with one delusion, and run after it, till their attention is caught by some new folly more captivating than the first.

He first published that in 1841. I am not sure how much has changed.
Gary Marcus first started warning of neural-net-induced hallucinations in his 2001 analysis of multilayer perceptrons and their cognitive limitations, The Algebraic Mind.

Marcus on AI is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.