[HN Gopher] Claude for Excel
___________________________________________________________________
Claude for Excel
Author : meetpateltech
Score : 662 points
Date   : 2025-10-27 16:09 UTC (1 day ago)
(HTM) web link (www.claude.com)
(TXT) w3m dump (www.claude.com)
| cube00 wrote:
| [flagged]
| sdsd wrote:
| Okay. But then you could say the same for a human: isn't your
| brain just a cloud of matter and electricity that reacts to
| senses deterministically?
| cube00 wrote:
| > isn't your brain just a cloud of matter and electricity
| that just reacts to senses deterministically?
|
| LLMs are not deterministic.
|
| I'd argue over the short term humans are more deterministic.
| I ask a human the same question multiple times and I get the
| same answer. I ask an LLM and each answer could be very
| different depending on its "temperature".
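The temperature point above can be made concrete: sampling from a softmax at temperature 0 reduces to argmax and is repeatable, while higher temperatures spread probability mass across tokens. A toy sketch (made-up logits, not a real model):

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick a token index from logits; temperature 0 means greedy argmax."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(range(len(logits)), weights=probs)[0]

logits = [2.0, 1.0, 0.5]

# Temperature 0: the same answer on every run.
greedy = {sample_token(logits, 0, random.Random(i)) for i in range(100)}

# Temperature 1: answers vary run to run.
varied = {sample_token(logits, 1.0, random.Random(i)) for i in range(100)}
```

With these logits, the greedy set collapses to a single token while the temperature-1 set contains several.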
| krzyk wrote:
| If you ask a human the same question repeatedly, you'll get
| different answers. I think that by the third time you'll get
| "I already answered that" etc.
| worldsayshi wrote:
| We hardly react to things deterministically.
|
| But I agree with the sentiment. It seems it is more important
| than ever to agree on what it means to understand something.
| qwertox wrote:
| I'm having a bad day today. I'm 100% certain that today I'll
| react completely different to any tiny issue compared to how
| I did yesterday.
| sdsd wrote:
| Right, if you change the input to your function, you get a
| different output. By that logic, the function `(def (add a
| b) (+ a b))` isn't deterministic.
| NDizzle wrote:
| I mean - try clicking the CoPilot button and see what it can
| actually do. Last I checked, it told me it couldn't change any
| of the actual data itself, but it could give you suggestions.
| Low bar for excellence here.
| baal80spam wrote:
| OK then. Groks?
| dang wrote:
| " _Eschew flamebait. Avoid generic tangents._ "
|
| https://news.ycombinator.com/newsguidelines.html
| d--b wrote:
| Ok, they weren't confident enough to let the model actually edit
| the spreadsheet. Phew..
|
| Only a matter of time before someone does it though.
| cube00 wrote:
| When I think how easily I can misclick and stuff up a
| spreadsheet, I can't begin to imagine all the subtle ways LLMs
| will screw them up.
|
| Unlike code, where it's all on display, these formulas are
| hidden in each cell; you won't see the problem unless you click
| on the cell, so you'll have a hard time finding the cause.
| tln wrote:
| I wish Gemini could edit more in Google sheets and docs.
|
| Little stuff like splitting text more intelligently or
| following the formatting seen elsewhere would be very
| satisfying.
| password4321 wrote:
| How well does change tracking work in Excel... how hard would
| it be to review LLM changes?
|
| AFAIK there is no 'git for Excel' to diff and undo, especially
| not built-in (i.e. 'for free' both cost-wise and in settings
| where add-ons/macros aren't allowed security-wise).
|
| My limited experience has been that it is difficult to keep
| LLMs from changing random things besides what they're asked to
| change, which could cause big problems if untrackable in
| Excel.
| NewsaHackO wrote:
| I thought there was track changes on all office products.
| Most Office documents are zip files of XML files and assets,
| so I'd imagine it would be possible to rollback changes.
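That zip-of-XML observation is easy to verify with the standard library alone. A rough sketch: the file name below mirrors the real .xlsx layout, but the XML is deliberately simplified:

```python
import io
import zipfile

def make_workbook(sheet_xml):
    """Build a minimal in-memory zip shaped like an .xlsx container."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr("xl/worksheets/sheet1.xml", sheet_xml)
    return buf

def diff_sheets(old, new, name="xl/worksheets/sheet1.xml"):
    """Return (old_xml, new_xml) if the sheet changed, else None."""
    with zipfile.ZipFile(old) as zo, zipfile.ZipFile(new) as zn:
        a, b = zo.read(name), zn.read(name)
    return None if a == b else (a.decode(), b.decode())

before = make_workbook('<row r="1"><c r="A1"><v>42</v></c></row>')
after = make_workbook('<row r="1"><c r="A1"><v>43</v></c></row>')

change = diff_sheets(before, after)
```

Real change tracking would diff the XML structurally rather than byte-for-byte, but the container really is just a zip you can open and compare.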
| strange_quark wrote:
| Yet more evidence of the bubble burst being imminent. If any of
| these companies really had some almost-AGI system internally,
| they wouldn't be spending any effort making f'ing Excel plugins.
| Or at the very least, they'd be writing their own Excel because
| AI is so amazing at coding, right?
| qsort wrote:
| You wouldn't believe the amount of shit that runs on Excel.
| efields wrote:
| This. I work in Pharma. Excel and faxes.
| powvans wrote:
| Yes. I once interviewed a developer who's previous job was
| maintaining the .NET application that used an Excel sheet as
| the brain for decisions about where to drill for oil on the
| sea floor. No one understood what was in the Excel sheet. It
| was built by a geologist who was long gone. The engineering
| team understood the inputs and outputs. That's all they
| needed to know.
| mwigdahl wrote:
| Years ago when I worked for an engineering consulting
| company we had to work with a similarly complex, opaque
| Excel spreadsheet from General Electric modeling the
| operation of a nuclear power plant in exacting detail.
|
| Same deal there -- the original author was a genius and was
| the only person who knew how it was set up or how it
| worked.
| cube00 wrote:
| I spotted a custom dialog in an Excel spreadsheet in a
| medical context the other day; I was horrified.
| dickersnoodle wrote:
| Sic
| strange_quark wrote:
| I think you're misunderstanding me. This might be something
| somewhat useful, I don't know, and I'm not judging it based
| on that.
|
| What I'm saying is that if you really believed we were 2,
| maybe 3 years tops from AGI or the singularity or whatever
| you would spend 0 effort serving what already seems to be a
| domain that is already served by 3rd parties that are already
| using your models! An excel wrapper for an LLM isn't exactly
| cutting edge AI research.
|
| They're desperate to find something that someone will pay a
| meaningful amount of money for that even remotely justifies
| their valuation and continued investment.
| FergusArgyll wrote:
| A program that can do excel for you _is_ almost AGI
| pton_xd wrote:
| The fine tuning will continue until we reach AGI.
| amlib wrote:
| The fine tuning will continue until we reach the torment
| nexus, at best
| HDThoreaun wrote:
| The current valuations do not require AGI. They require
| products like this that will replace scores of people doing
| computer based grunt work. MSFT is worth $4 trillion off the
| back of enterprise productivity software, the AI labs just need
| some of that money.
| ipaddr wrote:
| You make a great point. Where are all the complex
| applications? They haven't been able to create their own office
| suite or word processor or really anything aside from a
| Halloween matching game in JS. You would think we would have
| some complex application they can point to, but nothing.
| mitjam wrote:
| Excel is living business knowledge stuck in private SharePoint
| sites; tapping into it might kick off a nice data flywheel, not
| to speak of the nice TAM.
| jawns wrote:
| Gemini already has its hooks in Google Sheets, and to be honest,
| I've found it very helpful in constructing semi-complicated Excel
| formulas.
|
| Being able to select a few rows and then use plain language to
| describe what I want done is a time saver, even though I could
| probably muddle through the formulas if I needed to.
| break_the_bank wrote:
| I would recommend trying TabTabTab at https://tabtabtab.ai/
|
| It is an entire agent loop. You can ask it to build a multi
| sheet analysis of your favorite stock and it will. We are
| seeing a lot of early adopters use it for financial modeling,
| research automation, and internal reporting tasks that used to
| take hours.
| dangoodmanUT wrote:
| Gemini integrations into Google Workspace feel like they're
| using Gemini 1.5 Flash; it's so comically bad at understanding
| and generating.
| gumby271 wrote:
| Last time I tried using Gemini in Google Sheets it hallucinated
| a bunch of fake data, then gave me a summary that included all
| that fake data. I'd given it a bunch of transaction data, and
| asked it to group the records into different categories for
| budgeting. When asking it to give the largest values in each
| category, all the values that came back were fake. I'm not sure
| I'd really trust it to touch a spreadsheet after that.
| genrader wrote:
| you should:
|
| - stop using the free plan
| - don't use gemini flash for these tasks
| - learn how to do things over time and know that all ai
| models have improved significantly every few months
| ipaddr wrote:
| Or not use it.
| frankfrank13 wrote:
| I have had the opposite experience. I've never had Gemini give
| me something useful in sheets, and I'm not asking for
| complicated things. Like "group this data by day" or "give me
| p50 and p90"
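For comparison, the p50/p90 request being described is a few lines outside a spreadsheet too; a sketch with made-up latency numbers:

```python
import statistics

latencies = [12, 18, 25, 31, 40, 55, 63, 70, 88, 120]

# p50 is just the median.
p50 = statistics.median(latencies)

# quantiles with n=10 returns the 9 decile cut points; index 8 is p90.
deciles = statistics.quantiles(latencies, n=10, method="inclusive")
p90 = deciles[8]
```

(`method="inclusive"` interpolates within the observed data, the same convention most spreadsheet PERCENTILE functions use.)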
| break_the_bank wrote:
| I forgot to add: you can try TabTabTab without installing
| anything as well.
|
| To see something much more powerful on Google Sheets than
| Gemini for free, you can add "try@tabtabtab.ai" to your sheet,
| and make a comment tagging "try@tabtabtab.ai" and see it in
| action.
|
| If that is too much just go to ttt.new!
| soared wrote:
| It's interesting to me that this page talks a lot about
| "debugging models" etc. I would've expected (from the title) this
| to be going after the average excel user, similar to how chatgpt
| went after everyday people.
|
| I would've expected "make a vlookup or pivot table that tells me
| x" or "make this data look good for a slide deck" to be easier
| problems to solve.
| burkaman wrote:
| I think this is aiming to be Claude Code for people who use
| Excel as a programming environment.
| layer8 wrote:
| The issue is that the average Excel user doesn't quite have the
| skills to validate and double-check the Excel formulas that
| Claude would produce, and to correct them if needed. It would
| be similar to a non-programmer vibe-coding an app. And that's
| really not what you want to happen for professionally used
| Excel sheets.
| soared wrote:
| IMO that is exactly what people want. At my work everyone
| uses LLMs constantly and the trade-off of imperfect
| information is understood. People double-check it, etc., but
| the information search is so much faster; even if it finds the
| right Confluence page but misquotes it, it still sends me the
| link.
|
| For easy spreadsheet stuff (which 80% of average white
| collars workers are doing when using excel) I'd imagine the
| same approach. Try to do what I want, and even if you're half
| wrong the good 50% is still worth it and a better starting
| point.
|
| Vibe coding an app is like vibe coding a "model in excel".
| Sure you could try, but most people just need to vibe code a
| pivot table
| extr wrote:
| I think Anthropic themselves are actually having trouble
| imagining how this could be used. Coders think like coders -
| they are imagining the primary use case being managing large
| Excel sheets that are like big programs. In reality most Excel
| worksheets are more like tiny, one-off programs. More like
| scripts than applications. AI is very very good at scripts.
| burkaman wrote:
| I'm excited to see what national disasters will be caused by
| auto-generated Excel sheets that nobody on the planet
| understands. A few selections from past HN threads to prime your
| imagination:
|
| Thousands of unreported COVID cases:
| https://news.ycombinator.com/item?id=24689247
|
| Thousands of errors in genetics research papers:
| https://news.ycombinator.com/item?id=41540950
|
| Wrong winner announced in national election:
| https://news.ycombinator.com/item?id=36197280
|
| Countries across the world implement counter-productive economic
| austerity programs:
| https://en.wikipedia.org/wiki/Growth_in_a_Time_of_Debt#Metho...
| HPsquared wrote:
| Especially combined with the dynamic array formulas that have
| recently been added (LET, LAMBDA etc). You can have much more
| going on within each cell now. Think whole temporary data
| structures. The "evaluate formula" dialog doesn't quite cut it
| anymore for debugging.
| malthaus wrote:
| from my experience in the corporate world, i'd trust an excel
| generated / checked by an LLM more than i would one that has
| been organically grown over years in a big corporation where
| nobody ever checks or even can check anything because its one
| big growing pile of technical debt people just accept as
| working
| whalesalad wrote:
| I just want Claude inside of Metabase.
| adamfeldman wrote:
| https://www.metabase.com/features/metabot-ai
| asdev wrote:
| George Hotz said there's 5 tiers of AI systems, Tier 1 - Data
| centers, Tier 2 - fabs, Tier 3 - chip makers, Tier 4 - frontier
| labs, Tier 5 - Model wrappers. He said Tier 4 is going to eat all
| the value of Tier 5, and that Tier 5 is worthless. It's looking
| like that's going to be the case
| matsur wrote:
| People were saying the same thing about AWS vs SaaS ("AWS
| wrappers") a decade ago and none of that came to pass. Same
| will be true here.
| tln wrote:
| Claude is a model wrapper, no?
| piperswe wrote:
| Anthropic is a frontier lab, and Claude is a frontier model
| tln wrote:
| Anthropic models are Sonnet / Haiku / Opus
|
| https://docs.claude.com/en/docs/about-claude/models/overview
| piperswe wrote:
| Okay, Claude is a _family_ of frontier models then. IMO
| that's a pedantic distinction in this context.
| extr wrote:
| George Hotz says a lot of things. I think he's directionally
| correct but you could apply this argument to tech as a whole.
| Even outside of AI, there are plenty of niches where domain-
| specific solutions matter quite a bit but are too small for the
| big players to focus on.
| rudedogg wrote:
| Tier 5 requires domain expertise until we reach AGI or
| something very different from the latest LLMs.
|
| I don't think the frontier labs have the bandwidth or domain
| knowledge (or dare I say skills) to do tier 5 tasks well. Even
| their chat UIs leave a lot to be desired and that should be
| their core competency.
| benatkin wrote:
| Interesting. I found a reference to this in a tweet [1], and it
| looks to be from a podcast. While I'm not extremely
| knowledgeable, I'd put it like this: Tier 1 - fabs, Tier 2 -
| chip makers, Tier 3 - data centers, Tier 4 - frontier labs,
| Tier 5 - model wrappers
|
| However I would think more of elite data centers rather than
| commodity data centers. That's because I see Tier 4 being
| deeply involved in their data centers and thinking of buying
| the chips to feed their data centers. I wouldn't be so inclined
| to throw in my opinion immediately if I found an article
| showing this ordering of the tiers, but being a tweet of a
| podcast it might have just been a rough draft.
|
| 1: https://x.com/tbpn/status/1935072881425400016
| mediaman wrote:
| That is a common refrain by people who have no domain expertise
| in anything outside of tech.
|
| Spend a few years in an insurance company, a manufacturing
| plant, or a hospital, and then the assertion that the frontier
| labs will figure it out appears patently absurd. (After all, it
| takes humans years to understand just a part of these
| institutions, and they have good-functioning memory.)
|
| This belief that tier 5 is useless is itself a tell of a
| vulnerability: the LLMs are advancing fastest in domain-
| expertise-free generalized technical knowledge; if you have no
| domain expertise outside of tech, you are most vulnerable to
| their march of capability, and it is those with domain
| expertise who will rely increasingly less on those who have
| nothing to offer but generalized technical knowledge.
| asdev wrote:
| yeah but if Anthropic/OpenAI dedicate resources to gaining
| domain expertise then any tier 5 is dead in the water. For
| example, they recently hired a bunch of finance professionals
| to make specialized models for financial modeling. Any
| startup in that space will be wiped out
| HDThoreaun wrote:
| I dont think the claim is exactly that tier 5 is useless more
| that tier 5 synergizes so well with tier 4 that all the
| popular tier 5 products will eventually be made by the tier 4
| companies.
| mitjam wrote:
| Andrew Ng argued in 2023
| (https://www.youtube.com/watch?v=5p248yoa3oE ) that the
| underlying tiers depend on the app tier's success.
|
| That OpenAI is now apparently striving to become the next big
| app-layer company could hint at George Hotz being right, but
| only if the bets work out. I'm glad that there is competition
| at the frontier-labs tier.
| extr wrote:
| What is with the negativity in these comments? This is a huge,
| huge surface area that touches a large percentage of white collar
| work. Even just basic automation/scaffolding of spreadsheets
| would be a big productivity boost for many employees.
|
| My wife works in insurance operations - everyone she manages from
| the top down lives in Excel. For line employees a large
| percentage of their job is something like "Look at this internal
| system, export the data to excel, combine it with some other
| internal system, do some basic interpretation, verify it, make a
| recommendation". Computer Use + Excel Use isn't there yet...but
| these jobs are going to be the first on the chopping block as
| these integrations mature. No offense to these people but Sonnet
| 4.5 is already at the level where it would be able to replicate
| or beat the level of analysis they typically provide.
| cube00 wrote:
| I don't trust LLMs to do the kind of precise deterministic work
| you need in a spreadsheet.
|
| It's one thing to fudge the language in a report summary, it
| can be subjective, however numbers are not subjective. It's
| widely known LLMs are terrible at even basic maths.
|
| Even Google's own AI summary admits it which I was surprised
| at, marketing won't be happy.
|
| _Yes, it is true that LLMs are often bad at math because they
| don't "understand" it as a logical system but rather process
| it as text, relying on pattern recognition from their training
| data._
| extr wrote:
| Seems like you're very confused about what this work
| typically entails. The job of these employees is not mental
| arithmetic. It's closer to:
|
| - Log in to the internal system that handles customer
| policies
|
| - Find all policies that were bound in the last 30 days
|
| - Log in to the internal system that manages customer
| payments
|
| - Verify that for all policies bound, there exists a
| corresponding payment that roughly matches the premium.
|
| - Flag any divergences above X% for accounting/finance to
| follow up on.
|
| Practically this involves munging a few CSVs, maybe typing in
| a few things, setting up some XLOOKUPs, IF formulas,
| conditional formatting, etc.
|
| Will AI replace the entire job? No...but that's not the goal.
| Does it have to be perfect? Also no...the existing employees
| performing this work are also not perfect, and in fact
| sometimes their accuracy is quite poor.
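The reconciliation described above is simple enough to state as code. A hypothetical sketch; the record shapes and field names are invented for illustration:

```python
def reconcile(policies, payments, tolerance=0.05):
    """Flag policies whose payment diverges from the premium by more than
    `tolerance`, or that have no matching payment at all.

    `policies` and `payments` are hypothetical dicts keyed by policy id."""
    flagged = []
    for pid, premium in policies.items():
        paid = payments.get(pid)
        if paid is None:
            flagged.append((pid, "no payment found"))
        elif abs(paid - premium) / premium > tolerance:
            flagged.append((pid, f"paid {paid}, premium {premium}"))
    return flagged

policies = {"P-100": 1200.0, "P-101": 800.0, "P-102": 450.0}
payments = {"P-100": 1199.0, "P-101": 640.0}  # P-102 was never paid

issues = reconcile(policies, payments)
```

In practice this is a couple of XLOOKUPs and an IF column; the point is that the logic is checkable either way.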
| Ntrails wrote:
| Checking someone else's spreadsheet is a fucking nightmare.
| If your company has extremely good standards it's less
| miserable because at least the formatting etc will be
| consistent...
|
| The one thing LLMs should consistently do is ensure that
| formatting is correct. Which will help greatly in the
| checking process. But no, I generally don't trust them to
| do sensible things with basic formulation. Not a week ago
| GPT 5 got confused whether a plus or a minus was necessary
| in a basic question of "I'm 323 days old, when is my
| birthday?"
| xmprt wrote:
| I think you have a misunderstanding of the types of
| things that LLMs are good at. Yes you're 100% right that
| they can't do math. Yet they're quite proficient at basic
| coding. Most Excel work is similar to basic coding so I
| think this is an area where they might actually be pretty
| well suited.
|
| My concern would be more with how to check the work (ie,
| make sure that the formulas are correct and no columns
| are missed) because Excel hides all that. Unlike code,
| there's no easy way to generate the diff of a spreadsheet
| or rely on Git history. But that's different from the
| concerns that you have.
| collingreen wrote:
| I've built spreadsheet diff tools on Google sheets
| multiple times. As the need grows I think we will see
| diffs, commits, and review tools reach customers.
| break_the_bank wrote:
| hey Collin! I am working on an AI agent on Google Sheets,
| I am curious if any of your designs are out in the
| public. We are trying to re-think what diffs should look
| like and want to make something nicer than what we
| currently have, so I'm curious.
| collingreen wrote:
| Hi! Nothing public nor generic enough to be a good
| building block. I found myself often frustrated by the
| tools that came out of the box but I believe better apis
| could make this slightly easier to solve.
|
| The UX of spreadsheet diffs is a hard one to solve
| because of how weird the calculation loops are and how
| complicated the relationship between fields might be.
|
| I've never tried to solve this for a real end user before
| in a generic way - all my past work here was for internal
| ability to audit changes and rollback catastrophes. I
| took a lot of shortcuts by knowing which cells are input
| data vs various steps of calculations -- maybe part of
| your ux is being able to define that on a sheet by sheet
| basis? Then you could show how different data (same
| formulas) changed outputs or how different formulas (same
| data) did differently?
|
| Spreadsheets are basically weird app platforms at this
| point so you might not be able to create a single
| experience that is both deep and generic. On the other
| hand maybe treating it as an app is the unlock? Get your
| AI to noodle on what the whole thing is for, then show
| diff between before and after stable states (after all
| calculation loops stabilize or are killed) side by side
| with actual diffs of actual formulas? I feel like Id want
| to see a diff as a live final spreadsheet and be able to
| click on changed cells and see up the chain of their
| calculations to the ancestors that were modified.
|
| Fun problem that sounds extremely complicated. Good luck
| distilling it!
| alfalfasprout wrote:
| proficient != near-flawless.
|
| > Most Excel work is similar to basic coding so I think
| this is an area where they might actually be pretty well
| suited.
|
| This is a hot take. One I'm not sure many would agree
| with.
| mguerville wrote:
| Excel work of people who make a living because of their
| excel skills (Bankers, VCs, Finance pros) is truly on the
| spectrum of basic coding. Excel use by others (Strategy,
| HR, etc.) is more like crude UI to manipulate small
| datasets (filter, sort, add, share and collaborate).
| Source: have lived both lives.
| Wowfunhappy wrote:
| > Yes you're 100% right that they can't do math.
|
| The model ought to be calling out to some sort of tool to
| do the math--effectively writing code, which it can do.
| I'm surprised the major LLM frontends aren't always doing
| this by now.
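A sketch of that idea, using the 323-days-old birthday question from upthread: route the date arithmetic to code, where the plus-vs-minus choice is explicit (`today` is pinned here so the result is reproducible):

```python
from datetime import date, timedelta

def birth_date_from_age_in_days(today, age_days):
    """If someone is `age_days` old today, their birth date is today MINUS
    that many days; subtraction, not addition, is exactly the sign choice a
    language model can fumble when it pattern-matches instead of computing."""
    return today - timedelta(days=age_days)

today = date(2025, 10, 27)
born = birth_date_from_age_in_days(today, 323)
```

A tool-calling model would emit this computation and read back the result rather than guessing the date token by token.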
| mapt wrote:
| So do it in basic code, where typing G53
| instead of G$53 doesn't crash a mass transit network
| because somebody's algorithm forgot to order enough fuel
| this month.
| mr_toad wrote:
| > Most Excel work is similar to basic coding
|
| Excel is similar to coding in BASIC, a giant hairy ball
| of tangled wool.
| klausnrooster wrote:
| MS Office Tools menu has a "Spreadsheet Compare"
| application. It is quite good for diffing 2 spreadsheets.
| Of course it cannot catch logic errors, human or ML.
| runarberg wrote:
| > The one thing LLMs should consistently do is ensure
| that formatting is correct.
|
| In JavaScript (and I assume most other programming
| languages) this is the job of static analysis tools (like
| eslint, prettier, typescript, etc.). I'm not aware of any
| LLM-based tools which perform static analysis with as
| good results as the traditional tools. Is static
| analysis not a thing in the spreadsheet world? Are the
| tools which do static analysis on spreadsheets
| subpar, or do they offer some disadvantage not seen in
| other programming languages? And if so, are LLMs any better?
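Nothing as mature as eslint seems to exist for spreadsheets, but a crude lint pass over formula text is easy to imagine. A hypothetical sketch that flags the G53-vs-G$53 class of inconsistency mentioned elsewhere in the thread; the input shape is invented:

```python
import re

def lint_mixed_refs(formulas):
    """Flag cells referenced both with and without $-anchoring.

    `formulas` maps cell address -> formula string (a made-up input shape;
    a real tool would parse these out of the workbook XML)."""
    seen = {}  # normalized ref -> set of raw spellings
    ref_pattern = re.compile(r"\$?[A-Z]{1,3}\$?[0-9]+")
    for formula in formulas.values():
        for raw in ref_pattern.findall(formula):
            seen.setdefault(raw.replace("$", ""), set()).add(raw)
    return sorted(ref for ref, raws in seen.items() if len(raws) > 1)

sheet = {
    "H2": "=G53*B2",
    "H3": "=G$53*B3",   # same cell, inconsistently anchored
    "H4": "=SUM(A1:A9)",
}

warnings = lint_mixed_refs(sheet)
```

A real linter would need a proper formula parser (ranges, sheet names, structured references), but even this level of mechanical checking is the kind of deterministic pass you'd want before, or instead of, trusting an LLM's judgment.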
| eric-burel wrote:
| Just use a normal static analysis tool and shove the
| result to an LLM. I believe Anthropic properly figured
| that agents are the key, in addition to models, contrary
| to OpenAI that is run by a psycho that only believes in
| training the bigger model.
| koliber wrote:
| Maybe LLMs will enable a new type of work in
| spreadsheets. Just like in coding we have PR reviews,
| with an LLM it should be possible to do a spreadsheet
| review. Ask the LLM to try to understand the intent and
| point out places where the spreadsheet deviates from the
| intent. Also ask the LLM to narrate the spreadsheet so it
| can be understood.
| Insanity wrote:
| That first condition "try to understand the intent" is
| where it could go wrong. Maybe it thinks the spreadsheet
| aligns with the intent, but it misunderstood the intent.
|
| LLMs are a lossy validation, and while they work
| sometimes, when they fail they usually do so 'silently'.
| monkeydust wrote:
| Maybe we need some kind of method or framework to develop
| intent. Most things that go wrong in knowledge work
| are down to a lack of common understanding of intent.
| lossolo wrote:
| Last time, I gave claude an invoice and asked it to change
| one item on it, it did so nicely and gave me the new
| invoice. Good thing I noticed it had also changed the bank
| account number..
|
| The more complicated the spreadsheet and the more
| dependencies it has, the greater the room for error. These
| are probabilistic machines. You can use them, I use them
| all the time for different things, but you need to treat
| them like employees you can't even trust to copy a bank
| account number correctly.
| mikeyouse wrote:
| We've tried to gently use them to automate some of our
| report generation and PDF->Invoice workflows and it's a
| nightmare of silent changes and absence of logic.. basic
| things like specifically telling it "debits need to match
| credits" and "balance sheets need to balance" that are
| ignored.
| wholinator2 wrote:
| Yeah, asking llm to edit one specific thing in a large or
| complex document/ codebase is like those repeated "give
| me the exact same image" gifs. It's fundamentally a
| statistical model so the only thing we can be _certain_
| of is that _it's not_. It might get the desired change
| 100% correct but it's only gonna get the entire document
| 99.5%.
| onion2k wrote:
| Something that Claude Sonnet does when you use it to code
| is write scripts to test whether or not something is
| working. If it does that for Excel (e.g. some form of
| verification) it should be fine.
|
| Besides, using AI is an exercise in a "trust but verify"
| approach to getting work done. If you asked a junior to
| do the task you'd check their output. Same goes for AI.
| dpoloncsak wrote:
| Sysadmin of a small company. I get asked pretty often to
| help with a pivot table, vlookup, or just general excel
| functions (and smartsheet, these users LOVE smartsheet)
| toomuchtodo wrote:
| Indeed, in a small enough org, the sysadmin/technologist
| becomes support of last resort for all the things.
| JumpCrisscross wrote:
| > _these users LOVE smartsheet_
|
| I hate smartsheet...
|
| Excel or R. (Or more often, regex followed by pen and
| paper followed by more regex.)
| dpoloncsak wrote:
| They're coming to me for pivot tables....
|
| Handing them regex would be like giving a monkey a
| bazooka
| AvAn12 wrote:
| > "Does it have to be perfect?"
|
| Actually, yes. This kind of management reporting is either
| (1) going to end up in the books and records of the company
| - big trouble if things have to be restated in the future
| or (2) support important decisions by leadership -- who
| will be very much less than happy if analysis turns out to
| have been wrong.
|
| A lot of what ties up the time of business analysts is
| ticking and tying everything to ensure that mistakes are
| not made and that analytics and interpretations are
| consistent from one period to the next. The math and
| queries are simple - the details and correctness are hard.
| 2b3a51 wrote:
| There is another aspect to this kind of activity.
|
| Sometimes there can be an advantage in leading or lagging
| some aspects of internal accounting data for a time
| period. Basically sitting on credits or debits to some
| accounts for a period of weeks. The tacit knowledge to
| know when to sit on a transaction and when to action it
| is generally not written down in formal terms.
|
| I'm not sure how these shenanigans will translate into an
| ai driven system.
| AvAn12 wrote:
| That's the kind of thing that can get a company into a
| lot of trouble with its auditors and shareholders. Not
| that I am offering accounting advice of course. And yeah,
| one cannot "blame" an AI system or try to AI-wash any
| dodgy practices.
| iamacyborg wrote:
| > Sometimes there can be an advantage in leading or
| lagging some aspects of internal accounting data for a
| time period.
|
| This worked famously well for Enron.
| extr wrote:
| Speak for yourself and your own use cases. There are a
| huge diversity of workflows with which to apply
| automation in any medium to large business. They all have
| differing needs. Many Excel workflows I'm personally
| familiar with already incorporate a "human review" step.
| Telling a business leader that they can now jump straight
| to that step, even if it requires 2x human review, with
| AI doing all of the most tedious and low-stakes prework,
| is a clear win.
| Revanche1367 wrote:
| >Speak for yourself and your own use cases
|
| Take your own advice.
| extr wrote:
| I'm taking a much weaker position than the respondent:
| LLMs are useful for many classes of problem that do not
| require zero shot perfect accuracy. They are useful in
| contexts where the cost of building scaffolding around
| them to get their accuracy to an acceptable level is less
| than the cost of hiring humans to do the same work to the
| same degree of accuracy.
|
| This is basic business and engineering 101.
| Barbing wrote:
| >LLMs are useful for many classes of problem that do not
| require zero shot perfect accuracy. They are useful in
| contexts where the cost of building scaffolding around
| them to get their accuracy to an acceptable level is less
| than the cost of hiring humans to do the same work to the
| same degree of accuracy.
|
| Well said. Concise and essentially inarguable, at least
| to the extent it means LLMs are here to stay in the
| business world whether anyone likes it or not (barring
| the unforeseen, e.g. regulation or another pressure).
| jacksnipe wrote:
| Is this not belligerently ignoring the fact that this
| work is already done imperfectly? I can't tell you how
| many serious errors I've caught in just a short time of
| automating the generation of complex spreadsheets from
| financial data. All of them had already been checked by
| multiple analysts, and all of them contained serious
| errors (in different places!)
| harrall wrote:
| There are actually different classes of errors, though:
| errors in the process itself versus errors that
| happen when performing the process.
|
| For example, if I ask you to tabulate orders via a query
| but you forget to include an entire table, that is a
| major error of process, but the query itself is
| consistently error-free.
|
| Reducing error and mistakes is very much modeling where
| error can happen. I never trust an LLM to interpret data
| from a spreadsheet because I cannot verify every
| individual result, but I am willing to ask an LLM to
| write a macro that tabulates the data because I can
| verify the algorithm and the macro result will always be
| consistent.
|
| Using Claude to interpret the data directly for me is
| scary because those kinds of errors are neither
| verifiable nor consistent. At least with the "missing
| table" example, that error may make the analysis
| completely bunk but once it is corrected, it is always
| correct.
| AvAn12 wrote:
| Very much agreed
| AvAn12 wrote:
| No belligerence intended! Yes, processes are faulty today
| even with maker-checker and other QA procedures. To me it
| seems the main value of LLMs in a spreadsheet-heavy
| process is acceleration - which is great! What is harder
| is quality assurance - like the example someone gave
| regarding deciding when and how to include or exclude
| certain tables, date ranges, calc, etc. Properly
| recording expert judgment and then consistently applying
| that judgment over time is key. I'm not sure that is the
| kind of thing LLMs are great at, even ignoring their
| stochastic nature. Let's figure out how to get best use
| out of the new kit - and like everything else, focus on
| achieving continuously improving outcomes.
| next_xibalba wrote:
| The use cases for spreadsheets are much more diverse than
| that. In my experience, spreadsheets are just as often used for
| calculation. Many of them do require high accuracy, rely on
| determinism, and necessitate the understanding of maths
| ranging from basic arithmetic to statistics and engineering
| formulas. Financial models, for example, must be built up
| from ground truth and need to always use the right formulas
| with the right inputs to generate meaningful outputs.
|
| I have personally worked with spreadsheet based financial
| models that use 100k+ rows x dozens of columns and involve
| 1000s of formulas that transform those data into the
| desired outputs. There was very little tolerance for
| mistakes.
|
| That said, humans, working in these use cases, make
| mistakes >0% of the time. The question I often have with
| the incorporation of AI into human workflows is, will we
| eventually come to accept a certain level of error from
| them in the way we do for humans?
| jay_kyburz wrote:
| >Does it have to be perfect? Also no.
|
| Yeah, but it could be perfect, so why are there humans in
| loop at all? That is all just math!
| mrcwinn wrote:
| I couldn't agree more. I get all my perfectly deterministic
| work output from human beings!
| goatlover wrote:
| If only we had created some device that could perform
| deterministic calculations and then wrote software that
| made it easy for humans to use such calculations.
| bryanrasmussen wrote:
| ok but humans are idiots, if only we could make some sort
| of Alternate Idiot, a non-human but every bit as
| generally stupid as humans are! This A.I would be able to
| do every stupid thing humans did with the device that
| performed deterministic calculations only many times
| faster!
| baconbrand wrote:
| Yes and when the AI did that all the stupid humans could
| accept its output without question. This would save the
| humans a lot of work and thought and personal
| responsibility for any mistakes! See also Israel's
| Lavender for an exciting example of this in action.
| laweijfmvo wrote:
| I don't trust humans to do the kind of precise deterministic
| work you need in a spreadsheet!
| baconbrand wrote:
| Right, we shouldn't use humans or LLMs. We should use
| regular deterministic computer programs.
|
| For cases where that is not available, we should use a
| human and never an LLM.
| extr wrote:
| "regular deterministic computer programs" - otherwise
| known as the SUM function in Microsoft Excel
| davidpolberger wrote:
| I like to use Claude Code to write deterministic computer
| programs for me, which then perform the actual work. It
| saves a lot of time.
|
| I had a big backlog of "nice to have scripts" I wanted to
| write for years, but couldn't find the time and energy
| for. A couple of months after I started using Claude
| Code, most of them exist.
| baconbrand wrote:
| That's great and the only legitimate use case here. I
| suspect Microsoft will not try to limit customers to just
| writing scripts and will instead allow and perhaps even
| encourage them to let the AI go ham on a bunch of raw
| data with no intermediary code that could be reviewed.
|
| Just a suspicion.
| doug_durham wrote:
| Sure, but this isn't requiring that the LLM do any math. The
| LLM is writing formulas and code to do the math. They are
| very good at that. And like any automated system you need to
| review the work.
| causal wrote:
| Exactly, and if it can be done in a way that helps users
| better understand their own spreadsheets (which are often
| extremely complex codebases in a single file!) then this
| could be a huge use case for Claude.
| bg24 wrote:
| "I don't trust LLMs to do the kind of precise deterministic
| work" => I think the LLM is not doing the precise
| arithmetic. It is the agent, with lots of knowledge
| (skills) and tools. Precise deterministic work is done by
| tools (deterministic code). Skills bring domain knowledge
| and how to sequence a task. The agent executes it. The LLM
| predicts the next token.
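A toy sketch of that division of labor. The "model" here is hard-coded to stand in for a token prediction (tool name, prompt, and numbers are all invented); the point is that the arithmetic itself is done by ordinary deterministic code, not by the LLM.

```python
def sum_column(values):
    """The deterministic tool: exact arithmetic, same answer every run."""
    return sum(values)

TOOLS = {"sum_column": sum_column}

def fake_model(prompt):
    # A real LLM would emit a structure like this as text/JSON.
    return {"tool": "sum_column", "args": [[1200, 99, 3]]}

def agent(prompt):
    call = fake_model(prompt)                  # LLM: pick tool + args
    return TOOLS[call["tool"]](*call["args"])  # tool: exact answer

print(agent("Total the Amount column"))  # 1302
```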
| zarmin wrote:
| >I don't trust LLMs to do the kind of precise deterministic
| work
|
| not just in a spreadsheet, any kind of deterministic work at
| all.
|
| find me a reliable way around this. i don't think there is
| one. mcp/functions are a band aid and not consistent enough
| when precision is important.
|
| after almost three years of using LLMs, i have not found a
| single case where i didn't have to review its output, which
| takes as long or longer than doing it by hand.
|
| ML/AI is not my domain, so my knowledge is not deep nor
| technical. this is just my experience. do we need a new
| architecture to solve these problems?
| baconbrand wrote:
| ML/AI is not my domain but you don't have to get all that
| technical to understand that LLMs run on probability. We
| need a new architecture to solve these problems.
| chpatrick wrote:
| They're not great at arithmetic but at abstract mathematics
| and numerical coding they're pretty good actually.
| mhh__ wrote:
| If LLMs can replace mathematica for me when I'm doing affine
| yield curve calculations they can do a DCF for some banker
| idiots
| sdeframond wrote:
| > I don't trust LLMs to do the kind of precise deterministic
| work you need in a spreadsheet.
|
| Rightly so! But LLMs can still make you faster. Just don't
| expect _too much_ from it.
| mbreese wrote:
| I don't see the issue so much as the deterministic precision
| of an LLM, but the lack of observability of spreadsheets.
| Just looking at two different spreadsheets, it's impossible
| to see what changes were made. It's not like programming
| where you can run a `git diff` to see what changes an LLM
| agent made to a source code file. Or even a word processing
| document where the text changes are clear.
|
| Spreadsheets work because the user sees the results of
| complex interconnected values and calculations. For the user,
| that complexity is hidden away and left in the background.
| The user just sees the results.
|
| This would be a nightmare for most users to validate what
| changes an LLM made to a spreadsheet. There could be
| fundamental changes to a formula that could easily be hidden.
|
| For me, that the concern with spreadsheets and LLMs - which
| is just as much a concern with spreadsheets themselves. Try
| collaborating with someone on a spreadsheet for modeling and
| you'll know how frustrating it can be to try and figure out
| what changes were made.
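One hedged workaround sketch for the observability problem: flatten each version of a sheet to stable "cell: formula" lines, then diff them like source code. The cell dicts below are stand-ins for values read from two saved workbook versions (e.g. via a library such as openpyxl).

```python
import difflib

# Two versions of the "same" sheet; only A3's formula was subtly edited.
v1 = {"A1": "100", "A2": "200", "A3": "=SUM(A1:A2)"}
v2 = {"A1": "100", "A2": "200", "A3": "=SUM(A1:A1)"}

def flatten(cells):
    # One "address: formula" line per cell, sorted for a stable diff.
    return [f"{addr}: {cells[addr]}" for addr in sorted(cells)]

diff = list(difflib.unified_diff(flatten(v1), flatten(v2),
                                 "before.xlsx", "after.xlsx",
                                 lineterm=""))
print("\n".join(diff))
```

The hidden formula change now shows up as an ordinary `-`/`+` pair, the way a `git diff` would surface it in source code.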
| informal007 wrote:
| You might trust them when the precision is extremely high
| and others agree with that.
|
| High precision is possible because they can achieve it
| through multiple cross-validations.
| prisonguard wrote:
| ChatGPT is actively being used as a calculator.
| game_the0ry wrote:
| _> I don 't trust LLMs to do the kind of precise
| deterministic work you need in a spreadsheet._
|
| I was thinking along the same lines, but I could not
| articulate as well as you did.
|
| Spreadsheet work is deterministic; LLM output is
| probabilistic. The two should be distinguished.
|
| Still, it's a productivity boost, which is always good.
| Kiro wrote:
| Most real-world spreadsheets I've worked with were fragile
| and sloppy, not precise and deterministic. Programmers always
| get shocked when they realize how many important things are
| built on extremely messy spreadsheets, and that people simply
| accept it. They'd rather just spend human hours correcting
| discrepancies than try to build something maintainable.
| bonoboTP wrote:
| Usually this is very hard because the tasks and the job
| often subtly shift in somewhat unpredictable and
| unforeseen ways, and there is no neat clean abstraction
| that you can just implement as an application. Too
| heterogeneous, too messy, too many exceptions. If you
| develop some clean elegant solution, next week there will
| be something that your shiny app doesn't allow and they'd
| have to submit a feature request or whatever.
|
| In Excel, it's possible to just ad hoc adjust things and
| make it up as you go. It's not clean but very adaptable and
| flexible.
| MangoCoffee wrote:
| LLMs are just a tool, though. Humans still have to verify
| them, like with every other tool out there.
| A4ET8a8uTh0_v2 wrote:
| Eh, yes. In theory. In practice, and this is what I have
| experienced personally, bosses seem to think that you now
| have interns, so you should be able to do 5x the output.
| Guess what that means: no verification, or a rubber stamp.
| brookst wrote:
| Do you trust humans to be precise and deterministic, or even
| to be especially good at math?
|
| This is talking about applying LLMs to formula creation and
| references, which they are actually pretty good at.
| Definitely not about replacing the spreadsheet's calculation
| engine.
| amrocha wrote:
| I trust humans not to be able to shoot the company in the
| foot without even realizing it.
|
| Why are we suddenly ok with giving every underpaid and
| exploited employee a footgun and expecting them to be
| responsible with it???
| onion2k wrote:
| _It 's widely known LLMs are terrible at even basic maths._
|
| Claude for Excel isn't doing maths. It's doing Excel. If the
| llm is bad at maths then teaching it to use a tool that's
| good at maths seems sensible.
| pavel_lishin wrote:
| My concern is that my insurance company will reject a claim, or
| worse, because of something an LLM did to a spreadsheet.
|
| Now, granted, that can also happen because Alex fat-fingered
| something in a cell, but that's something that's much easier to
| track down and reverse.
| manquer wrote:
| They are already doing that with AI, rejecting claims at
| higher rates than before.
|
| Privatized insurance will always find a way to pay out
| less if it can get away with it. It is just the nature of
| having the trifecta of profit motive, socialized risk,
| and light regulation.
| philipallstar wrote:
| > It is just the nature of having the trifecta of profit
| motive, socialized risk and light regulation.
|
| It's the nature of everything. They agree to pay you for
| something. It's nothing specific to "profit motive" in the
| sense you mean it.
| manquer wrote:
| I should have been clearer - profit maximization above
| all else, as long as it is mostly legal. Neither profit
| nor profit maximization at all costs is the nature of
| everything.
|
| There are many other entity types - unions[1],
| cooperatives, public sector companies, quasi-government
| entities, PBCs, non-profits - that all offer insurance
| and can occasionally do it well.
|
| We even have some in the US, and don't consider them
| communism - like the FDIC, or things like Social
| Security and unemployment insurance.
|
| At some level, aren't government and taxation themselves
| nothing but insurance? We agree to pay taxes to mitigate
| a variety of risks, from foreign invasion down to smaller
| things like getting robbed on the street.
|
| [1] Historically, worker collectives or unions self-
| organized to socialize the risks of major work-ending
| injuries or death.
|
| Armies from ancient to modern have operated on this
| insurance - the two ingredients that made them not
| mercenaries: a form of long-term insurance benefit
| (education, pension, land, etc.) for themselves or their
| family members in the event of death, and sovereign
| immunity for their actions.
| JumpCrisscross wrote:
| > _They are already doing that with AI, rejecting claims
| at higher rates than before_
|
| Source?
| nartho wrote:
| Haven't risk based models been a thing for the last 15-20
| years ?
| keernan wrote:
| >>They are already doing that with AI, rejecting claims
| at higher rates than before.
|
| That's a feature, not a bug.
| elpakal wrote:
| This is a great application of this quote. Insurance
| providers have 0 incentive to make their AI "good" at
| processing claims, in fact it's easy to see how "bad" AI
| can lead to a justification to deny more claims.
| bonoboTP wrote:
| The question is how you define good. They surely want the
| AI to be good in the sense that it rejects all claims
| that they think they can get away with rejecting. But it
| should not reject those where rejection likely results in
| litigation and losing and having to pay damages.
| jimbokun wrote:
| Couldn't they accomplish the same thing by rejecting a
| certain percentage of claims totally at random?
| manquer wrote:
| That would be illegal though; the goal is to do this
| legally, after all.
|
| We also have to remember all claims aren't equal, i.e.
| some claims end up being way costlier than others. You
| can achieve similar % margin outcomes by putting up a ton
| of friction: preconditions, multiple appeals processes,
| prior authorization for prior authorization, reviews by
| administrative doctors who have no expertise in the field
| being reviewed and don't have to disclose their identity,
| and so on.
|
| While the U.S. system is the most extreme, or evolved, it
| is not unique; it is what you get when you privatize
| insurance. Any country with private insurance has some
| lighter version of this and is on the same journey.
|
| Not that public health systems or insurance a la the NHS
| in the UK, or Germany's, work well - they are
| underfunded and mismanaged, with waits of months to see
| a specialist, and so on.
|
| We have to choose our poison - unless you are rich, of
| course; then the U.S. system is by far the best. People
| travel to the U.S. to get the kind of care that is not
| possible anywhere else.
| jimbokun wrote:
| Why does saying "AI did it" make it legal, if the outcome
| is the same?
| nxobject wrote:
| > While the U.S. system is the most extreme, or evolved,
| it is not unique; it is what you get when you privatize
| insurance. Any country with private insurance has some
| lighter version of this and is on the same journey.
|
| I disagree with the statement that healthcare insurance
| is predominantly privatized in the US: Medicare and
| Medicaid, at least in 2023, outspent private plans for
| healthcare spending by about ~10% [1]; this is before
| accounting for government subsidies for private plans.
| And boy, does America have a very unique relationship
| with these programs.
|
| https://www.healthsystemtracker.org/chart-collection/u-s-
| spe...
| manquer wrote:
| It is more nuanced. For example, Medicare Advantage (Part
| C) is paid for with Medicare money, but it is profitable
| private operators who provide the plans and service it -
| a fast-growing part of Medicare.
|
| John Oliver had an excellent segment coincidentally
| yesterday on this topic.
|
| While the government pays for it, it is not managed or
| run by the government - so how do we classify the
| program, as public or private?
| jimbokun wrote:
| That's a great and thorough analysis!
|
| My take away is that as public health costs are
| overtaking private insurance and at the same time doing a
| better job controlling costs per enrollee, it makes more
| and more sense just to have the government insure
| everyone.
|
| I can't see what argument the private insurers have in
| their favor.
| smithkl42 wrote:
| If you think that insurance companies have "light
| regulation", I shudder to think of what "heavy regulation"
| would look like. (Source: I'm the CTO at an insurance
| company.)
| lotsofpulp wrote:
| They have too much regulation, and too little auditing
| (at least in the managed healthcare business).
| nxobject wrote:
| I agree, _and_ I can see where it comes from (at least at
| the state level). The cycle is: bad trend happens that
| has deep root causes (let 's say PE buying rural
| hospitals because of reduced Medicaid/Medicare
| reimbursements); legislators (rightfully) say "this
| shouldn't happen", but don't have the ability to address
| the deep root causes so they simply regulate healthcare
| M&As - now you have a bandaid on a problem that's going
| to pop up elsewhere.
| lotsofpulp wrote:
| I mean even in the simple stuff, like denying payment for
| healthcare that should have been covered. CMS will come
| by and audit a handful of cases, out of millions, every
| few years.
|
| So obviously the company that prioritizes accuracy of
| coverage decisions by spending money on extra labor to
| audit itself is wasting money. Which means insureds have
| to waste more time getting the payment for healthcare
| they need.
| manquer wrote:
| By "light" I did not mean the quantity of paperwork you
| have to do, but rather whether you are allowed to do the
| things you want to do as a company.
|
| More compliance or reporting requirements usually tend to
| favor the larger existing players who can afford them,
| and they are also used to make life difficult for the end
| user and to reject more claims.
|
| It is the kind of thing that keeps you and me busy, but
| major investors don't care about it at all; the cost of
| compliance, or the lack of it, is no more than a rounding
| error on the balance sheet, and the fines or penalties
| are puny and laughable.
|
| The enormous profits year on year for decades now, and
| the amount of consolidation allowed in the industry, show
| that the industry is able to do pretty much what it
| wants. That is what I meant by light regulation.
| smithkl42 wrote:
| I'm not sure we're looking at the same industry. Overall,
| insurance company profit margins are in the single
| digits, usually low single digits - and in many segments,
| they're frequently not profitable at all. To take one
| example, 2024 was the first profitable year for
| homeowners insurance companies since 2019, and even then,
| the segment's entire profit margin was 0.3% (not 3% -
| 0.3%).
|
| https://riskandinsurance.com/us-pc-insurance-industry-
| posts-...
| bonoboTP wrote:
| It's an accounting 101 thing to use all tricks in the
| book to reduce the reported profit, to avoid paying taxes
| on that profit.
| zetazzed wrote:
| The total profit of ALL US health insurance companies
| added together was $9bln in 2024:
| https://content.naic.org/sites/default/files/2024-annual-
| hea.... This is a profit margin of 0.8% down from 2.2% in
| the previous year.
|
| Meta alone made $62bln in 2024:
| https://investor.atmeta.com/investor-news/press-release-
| deta...
|
| So it's weird to see folks on a tech site talking about
| how enormous all the profits are in health insurance, and
| citations with numbers would be helpful to the
| discussion.
|
| I worked in insurance-related tech for some time, and the
| providers (hospitals, large physician groups) and
| employers who actually pay for insurance have significant
| market power in most regions, limiting what insurers can
| charge.
| wombatpm wrote:
| Wait until a company has to restate earnings because of a bug
| in a Claudified Excel spreadsheet.
| doctorpangloss wrote:
| > What is with the negativity in these comments?
|
| Some people - normal people - understand the difference between
| the holistic experience of a mathematically informed opinion
| and an actual model.
|
| It's just that normal people always wanted the holistic
| experience of an answer. Hardly anyone wants a right answer.
| They have an answer in their heads, and they want a defensible
| journey to that answer. That is the purpose of Excel in 95% of
| places it is used.
|
| Lately people have been calling this "sycophancy." This was
| always the problem. Sycophancy is the product.
|
| Claude Excel is leaning deeply into this garbage.
| extr wrote:
| It seems to me the answer is more so "People on HN are so
| far removed from the real use cases for this kind of
| automation they simply have no idea what they're talking
| about".
| genrader wrote:
| This is so correct it hurts
| intended wrote:
| I used to live in excel.
|
| The issue isn't in creating a new monstrosity in excel.
|
| The issue is the poor SoB who has to spelunk through the damn
| thing to figure out what it does.
|
| Excel is the sweet spot of just enough to be useful, capable
| enough to be extensible, yet gated enough to ensure everyone
| doesn't auto run foreign macros (or whatever horror is more
| appropriate).
|
| In the simplest terms - it's not excel, it's the business
| logic. If an excel file works, it's because there's someone who
| "gets" it in the firm.
| extr wrote:
| I used to live in Excel too. I've trudged through plenty of
| awful worksheets. The output I've seen from AI is actually
| more neatly organized than most of what I used to receive in
| outlook. Most of that wasn't hyper-sophisticated cap table
| analyses. It was analysis from a Jr Analyst or line employee
| trying to combine a few different data sources to get some
| signal on how XYZ function of the business was performing. AI
| automation is perfectly suitable for this.
| intended wrote:
| How?
|
| Neat formatting didn't save any model from having the wrong
| formula pasted in.
|
| Being neat was never a substitute for being well rested, or
| sufficiently caffeinated.
|
| Have you seen how AI functions in the hands of someone who
| isn't a domain expert? I've used it for things I had no
| idea about, like Astro+ web dev. User ignorance was
| magnified spectacularly.
|
| This is going to have Jr Analysts dumping well formatted
| junk in email boxes within a month.
| gedy wrote:
| It's actually really cool. I will say that "spreadsheets"
| remain a bandaid over dysfunctional UIs, processes, etc.,
| and engineering spends a lot of time enabling these
| bandaids vs someone just saying "I need to see number X"
| and not "BI analytics data in a realtime spreadsheet!",
| etc.
| gadders wrote:
| Yeah, this could be a pretty big deal. Not everyone is an excel
| expert, but nearly everyone finds themselves having to work
| with data in excel at some time or other.
| hbarka wrote:
| What does scaffolding of spreadsheets mean? I see the term
| scaffolding frequently in the context of AI-related articles
| and not familiar with this method and I'm hesitant to ask an
| LLM.
| Rudybega wrote:
| Scaffolding typically just refers to a larger state machine
| style control flow governing an agent's behavior and the
| suite of external tools it has access to.
| behnamoh wrote:
| > How teams use Claude for Excel
|
| Who are these teams that can get value from Anthropic? One MCP
| and my context window is used up and Claude tells me to start a
| new chat.
| fragmede wrote:
| MCPs and context window sizing, putting the engineering into
| prompt engineering.
| BuildItBusk wrote:
| I have to admit that my first thought was "April's fool". But
| you are right. It makes a lot of sense (if they can get it to
| work well). Not only is Excel the world's biggest "programming
| language". It's probably also one of the most unintuitive ways
| to program.
| adastra22 wrote:
| Why unintuitive?
| baq wrote:
| If you exclude macros with IO it's actually the most popular
| purely functional programming language (no quotes) on the
| planet by far.
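The claim is easy to illustrate: a typical Excel formula is a pure function of its input cells, with no state or IO. A made-up example, written both ways:

```python
# Excel:  C1 = IF(A1 > B1, A1 - B1, 0)
# The same computation as an ordinary pure function: the output
# depends only on the inputs, which is what makes the "purely
# functional" description reasonable.

def c1(a1, b1):
    return a1 - b1 if a1 > b1 else 0

print(c1(10, 4), c1(3, 7))  # 6 0
```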
| tokai wrote:
| What's with claiming negativity when most of the comments here
| are positive?
| bartvk wrote:
| I have to remember this one. Waltz into the room and
| proclaim, why is everyone so negative? It's great because x,
| y and z. It looks pretty great.
| protonbob wrote:
| > but these jobs are going to be the first on the chopping
| block as these integrations mature.
|
| Perhaps this is part of the negativity? This is a bad thing for
| the middle class.
| jpadkins wrote:
| in the short run. In the long run, productivity gains
| benefit* all of us (in a functional market economy).
|
| *material benefit. In terms of spirit and purpose, the older
| I get the more I think maybe the Amish are on to something.
| Work gives our lives purpose, and the closer the work is to
| our core needs, the better it feels. Labor saving so that
| most of us are just entertaining each other on social
| networks may lead to a worse society (but hey, our material
| needs are met!)
| informal007 wrote:
| Agree with you, but it cannot be stopped. The development
| of technology always makes wealth distribution more
| centralized.
| bartvk wrote:
| I kind of get what you're saying but can you explain your
| reasoning or provide a source?
| Workaccount2 wrote:
| I think excel is a dead end. LLM agents will probably greatly
| prefer SQL, sqlite, and Python instead of bulky made-for-
| regular-folks excel.
|
| Versatility and efficiency explode while human usability tanks,
| but who cares at that point?
| informal007 wrote:
| Databases might be the future, but viable solutions on
| Excel are the evidence that proves it works.
| informal007 wrote:
| This will push the development of open source models.
|
| People think first of data privacy; local deployment of
| open source models is the first choice for them.
| threetonesun wrote:
| Probably because many people here are software developers, and
| wrapping spreadsheets in deterministic logic and a consistent
| UI covers... most software use cases.
| Scubabear68 wrote:
| Having wrangled many spreadsheets personally, and worked with
| CFOs who use them to run small-ish businesses, and all the way
| up to one of top 3 brokerage houses world-wide using them to
| model complex fixed income instruments... this is a disaster
| waiting to happen.
|
| Spreadsheet UI is already a nightmare. The formula editing and
| relationship visioning is not there at all. Mistakes are
| rampant in spreadsheets, even my own carefully curated ones.
|
| Claude is not going to improve this. It is going to make it
| far, far worse with subtle and not so subtle hallucinations
| happening left and right.
|
| The key is really this - all LLMs that I know of rely on
| entropy and randomness to emulate human creativity. This works
| pretty well for pretty pictures and creating fan fiction or
| emulating someone's voice.
|
| It is not a basis for getting correct spreadsheets that show
| what you want to show. I don't want my spreadsheet correctness
| to start from a random seed. I want it to spring from first
| principles.
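The "temperature" and "random seed" being alluded to can be sketched in a few lines (the scores and three-token vocabulary are made up; this is not any particular model's implementation): token scores become probabilities and are sampled, not simply argmax'd, which is exactly why the same prompt can yield different output.

```python
import math, random

def sample(scores, temperature, rng):
    # Temperature-scaled softmax, then a random draw over tokens.
    weights = [math.exp(s / temperature) for s in scores]
    total = sum(weights)
    probs = [w / total for w in weights]
    return rng.choices(range(len(scores)), weights=probs)[0]

scores = [2.0, 1.5, 0.3]  # hypothetical scores for 3 candidate tokens

# Greedy decoding is deterministic: always the highest-scoring token.
greedy = max(range(len(scores)), key=lambda i: scores[i])

# Sampling is only reproducible if you pin the seed.
print(greedy, sample(scores, 1.0, random.Random(42)))
```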
| noosphr wrote:
| My first job out of uni was building a spreadsheet infra as
| code version control system after a Windows update made an
| eight-year-old spreadsheet go haywire and lose $10m in an
| afternoon.
|
| Spreadsheets are already a disaster.
| daveguy wrote:
| > Spreadsheets are already a disaster.
|
| Yeah, that's what OP said. Now add a bunch of random
| hallucinations hidden inside formulas inside cells.
|
| If they really have a good spreadsheet solution they've
| either fixed the spreadsheet UI issues or the LLM
| hallucination issues or both. My guess is neither.
| sally_glance wrote:
| Compared to what? Granted, Excel incidents are probably
| underreported and might produce "silent" consequential
| losses. But compared to that, for enterprise or custom
| software in general we have pretty scary estimates of the
| damages. Like Y2K (between 300-600bn) and the UK Post
| Office Horizon scandal (~1bn).
| array_key_first wrote:
| Excel spreadsheets ARE custom software, with custom
| requirements, calculations, and algorithms. They're just
| not typically written by programmers, have no version
| control or rollback abilities, are not audited, are not
| debuggable, and are typically not run through QA or QC.
| iambateman wrote:
| If I could teach managers one lesson, it would be this
| one.
| jackcviers3 wrote:
| I'll add to this - if you work on a software project to
| port an excel spreadsheet to real software that has all
| those properties, if the spreadsheet is sophisticated
| enough to warrant the process, the creators won't be able
| to remember enough details about how they created it to
| tell you the requirements necessary to produce the
| software. You may do all the calculations right, and
| because they've always had a rounding error that they've
| worked around somewhere else, your software shows
| calculations that have driven business decisions for
| decades were always wrong, and the business will insist
| that the new software is wrong instead of owning some
| mistake. It's never pretty, and it always governs
| something extremely important.
| calgoo wrote:
| Now, if we could give that excel file to an llm and it
| creates a design document that explains everything it
| does, then that would be a great use of an LLM.
| pjmlp wrote:
| Thing is, they are also the common workaround solution
| for savvy office workers that don't want to wait for
| IT department if it exists, or some outsourced
| consultancy, to finally deliver something that only does
| half the job they need.
|
| So far no one has managed to deliver an alternative to
| spreadsheets that fixes this issue; it doesn't matter if we
| can do much better in Python, Java, C# whatever, if it is
| always over budget and only covers half of the work.
|
| I know, I have taken part in such a project, and it ran
| over budget because there was always that little workflow
| super easy to do in Excel and they would refuse to adopt
| the tool if it didn't cover that workflow as well.
| gpderetta wrote:
| exactly. And Claude and other code assistants are more of
| the same, allowing non-programmers[1] to write code for
| their needs. And that's a good thing overall.
|
| [1] well, people that don't consider themselves
| programmers.
| sally_glance wrote:
| Agreed. The tradition has been continued by workflow
| engines, low code tools, platforms like Salesforce and
| lately AI-builders. The issue is generally not that these
| are bad, but because they don't _feel_ like software
| development everyone is comfortable skipping steps of the
| development process.
|
| To be fair, I've seen shops which actually apply good
| engineering practices to Excel sheets too. Just
| definitely not a majority...
| pjmlp wrote:
| Sometimes it isn't that folks are confortable skipping
| steps, rather they aren't even available.
|
| As so happens in the LLM age, I have been recently having
| to deal with such tools, and oh boy Smalltalk based image
| development in the 1990's with Smalltalk/V is so much
| better in regards to engineering practices than those
| "modern" tools.
|
| I cannot test code, if I want to backup to some version
| control system, I have to manually export/import a
| gigantic JSON file that represents the low-code workflow
| logic, no proper debugging tools, and so many other
| things I could rant about.
|
| But I guess this is the future, AI agents based workflow
| engines calling into SaaS products, deployed in a MACH
| architecture. Great buzzword bingo, right?
| p4ul wrote:
| It's interesting that you mention disaster; there is at
| least one annual conference dedicated to "spreadsheet risk
| management".[1]
|
| [1] https://eusprig.org/
| anitil wrote:
| I know you probably can't share the details, but if you can
| I (and I'm sure all of us) would love to hear them
| MattGaiser wrote:
| > Mistakes are rampant in spreadsheets
|
| To me, the case for LLMs is strongest not because LLMs are so
| unusually accurate and awesome, but because if human
| performance were put on trial in aggregate, it would be found
| wanting.
|
| Humans already do a mediocre job of spreadsheets, so I don't
| think it is a given that Claude will make more mistakes than
| humans do.
| lionkor wrote:
| But isn't this only fine as long as someone who knows what
| they are doing has oversight and can fix issues when they
| arise and Claude gets stuck?
|
| Once we all forget how to write SUM(A:A), will we just
| invent a new kind of spreadsheet once Claude gets stuck?
|
| Or in other words; what's the end game here? LLMs clearly
| cannot be left alone to do anything properly, so what's the
| end game of making people not learn anything anymore?
| solumunus wrote:
| Well, the end game with AI is AGI, of course. But
| realistically the best case scenario with LLMs is having
| fewer people with the required knowledge, leveraging
| LLMs to massively enhance productivity.
|
| We're already there to some degree. It is hard to put a
| number on my productivity gain, but as a small business
| owner with a growing software company it's clear to me
| already that I can reduce developer hiring going forward.
|
| When I read the skeptics I just have to conclude that
| they're either poor at context building and/or work on
| messy, inconsistent and poorly documented projects.
|
| My sense is that many weaker developers who can't learn
| these tools simply won't compete in the new environment.
| Those who can build well designed and documented projects
| with deep context easy for LLMs to digest will thrive.
|
| I assume all of this applies to spreadsheets.
| dns_snek wrote:
| Why isn't there a single study that would back up your
| observations? The only study with a representative
| experimental design that I know about is the METR study
| and it showed the opposite. Every study citing
| significant productivity improvements that I've seen is
| either:
|
| - relying on self-assessments from developers about how
| much time they think they saved, or
|
| - using useless metrics like lines of code produced or
| PRs opened, or
|
| - timing developers on toy programming assignments like
| implementing a basic HTTP server that aren't
| representative of the real world.
|
| Why is it that any time I ask people to provide examples
| of high quality software projects that were predominantly
| LLM-generated (with video evidence to document the
| process and allow us to judge the velocity), nobody ever
| answers the call? Would you like to change that?
|
| My sense is that weaker developers and especially weaker
| leaders are easily impressed and fascinated by
| substandard results :)
| nosianu wrote:
| Okay, and now you give those mediocre humans a tool that
| is both great and terrible. The problem is, unless they
| know their way around very well, they won't know which is
| which.
|
| Since my company uses Excel a lot, and I know the basics
| but don't want to become an expert, I use LLMs to ask
| intermediate questions, too hard to answer with the few
| formulas I know, not too hard for a short solution path.
|
| I have great success and definitely like what I can get
| with the Excel/LLM combo. But if my colleagues used it the
| same way, they would not get my good results, which is not
| their fault; they are not IT people but specialists, e.g.
| in logistics. The best use of LLMs is if you could already do
| the job without them, but it saves you time to ask them and
| then check if the result is actually acceptable.
|
| Sometimes I abandon the LLM session, because sometimes, and
| it's not always easy to predict, fixing the broken result
| would take more effort than just doing it the old way
| myself.
|
| A big problem is that the LLMs are so darn confident and
| always present a result. For example, I point it to a
| problem, it "thinks", and then it gives me new code,
| very confidently summarizing what the problem was
| (correctly) and assuring me it has now fixed the problem
| for sure. Only when I actually try it, the result has
| gotten worse than before. At that point I never try to get
| back to a working
| solution by continuing to try to "talk" to the AI, I just
| delete that session and do another, non-AI approach.
|
| But non-experts, and people who are very busy and just want
| to get some result to forward to someone waiting for it as
| quickly as possible will be tempted to accept the nice
| looking and confidently presented "solution" as-is. And you
| may not find a problem until half a year later somebody
| finds that prepayments, pro forma bills and the final
| invoices don't quite match in hard to follow ways.
|
| Not that these things don't happen now already, but adding
| a tool with erratic results might increase problems,
| depending on actual implementation of the process. Which
| most likely won't be well thought out, many will just cram
| in the new tool and think it works when it doesn't implode
| right away, and the first results, produced when people
| still pay a lot of attention and are careful, all look
| good.
|
| I am in awe of the accomplishments of this new tool, but it
| is way overhyped IMHO, still far too unpolished and random.
| Forcing all kinds of processes and people to use it is not
| a good match, I think.
| ryandrake wrote:
| This is a great point. LLMs make good developers better,
| but they make bad developers even worse. LLMs multiply
| instead of add value. So if you're a good developer, who
| is careful, pays attention, watches out for trouble, and
| is constantly reviewing and steering, the LLM is
| multiplying by a positive number and will make you
| better. However, if you're a mediocre/bad developer, who
| is not careful, who lacks attention to detail, and just
| barely gets things to compile / run, then the LLM is
| multiplying by a negative number and will make your
| output even worse.
| extr wrote:
| Is this just a feeling you have or is this downstream of
| actual use cases you've applied AI to observed and measured
| reliability on?
| lionkor wrote:
| Not OP but using LLMs in any professional setting, like
| programming, editing or writing technical specifications,
| OP is correct.
|
| Without extensive prompting and injecting my own knowledge
| and experience, LLMs generate absolutely unusable garbage
| (on average). Anyone who disagrees very likely is not someone
| who would produce good quality work by themselves (on
| average). That's not a clever quip; that's a very sad
| reality. SO MANY people cannot be bothered to learn
| anything if they can help it.
| extr wrote:
| I would completely disagree. I use LLMs daily for coding.
| They are quite far from AGI and it does not appear they
| are replacing Senior or Staff Engineers any time soon.
| But they are incredible machines that are perfectly
| capable of performing some economically valuable tasks in
| a fraction of the time it would have taken a human. If
| you deny this your head is in the sand.
| lionkor wrote:
| Capable, yeah, but not reliable, that's my point. They
| can one shot fantastic code, or they can one shot the
| code I then have to review and pull my hair out over for
| a week, because it's such crap (and the person who pushed
| it is my boss, for example, so I can't just tell him to
| try again).
|
| That's not consistent.
| wahnfrieden wrote:
| You can ask your boss to submit PRs using Codex's "try 5
| variations of the same task and select the one you like
| most", though.
| zxor wrote:
| Surely at that point they could write the code themselves
| faster than they can review 5 PRs.
|
| Producing more slop for someone else to work through is
| not the solution you think it is.
| extr wrote:
| Have you never used one to hunt down an obscure bug and
| found the answer quicker than you likely would have
| yourself?
| lionkor wrote:
| Actually, yeah, a couple of times, but that was a rubber-
| ducky approach; the AI said something utterly stupid, but
| while trying to explain things, I figured it out. I don't
| think an LLM has solved any difficult problem for me
| before. However, I think I'm likely an outlier because I
| do solve most issues myself anyways.
| chrisweekly wrote:
| Why do you frame the options as "one shot... or... one
| shot"?
| lionkor wrote:
| Because lazy people will use it like that, and we are all
| inherently lazy
| dns_snek wrote:
| It's not much better with planning either. The amount of
| time I spent planning, clarifying requirements, hand-
| holding implementation details always offset any
| potential savings.
| visarga wrote:
| The triad of LLM dependencies, in my view: initiation of
| tasks, experience-based feedback, and a consequence sink.
| They can do none of these; they all connect to the outer
| context, which sits with the user, not the model.
|
| You know what? This is also not unlike hiring a human:
| they need the hiring party to tell them what to do, give
| feedback, and assume the outcomes.
|
| It's all about context, which is non-fungible and
| distributed, and related not to intelligence but to the
| reasons we need intelligence in the first place.
| KronisLV wrote:
| > Anyone who disagrees very likely is not someone who
| would produce good quality work by themselves (on
| average).
|
| So for those producing slop and not knowing any better
| (or not caring), AI just improved the speed at which they
| work! Sounds like a great investment for them!
|
| For many mastering any given craft might not be the goal,
| but rather just pushing stuff out the door and paying
| bills. A case of mismatched incentives, one might say.
| mbesto wrote:
| Not the parent poster, but this is pretty much the
| foundation of LLMs. They are by their nature probabilistic,
| not deterministic. This is precisely what the parent is
| referring to.
| extr wrote:
| All processes in reality, everywhere, are probabilistic.
| The entire reason "engineering" is not the same as
| theoretical mathematics is about managing these
| probabilities to an acceptable level for the task you're
| trying to perform. You are getting a "probabilistic"
| output from a human too. Human beings are not
| guaranteeing theoretically optimal excel output when they
| send their boss Final_Final_v2.xlsx. You are using your
| mental model of their capabilities to inform how much you
| trust the result.
|
| Building a process to get a similar confidence in LLM
| output is part of the game.
| jbs789 wrote:
| Yup. It becomes clearer to me when I think about the
| existing validators. Can these be improved, for sure.
|
| It's when people leap to the multi-year endgame and, in
| their effort to monetise, build overconfidence in the
| product that I see the inherent conflict.
|
| It's going to be a slog... the detailed implementations.
| And if anyone is a bit more realistic about managing
| expectations I think Anthropic is doing it a little
| better.
| mbesto wrote:
| > All processes in reality, everywhere, are probabilistic.
|
| If we want to go in philosophy then sure, you're correct,
| but this not what we're saying.
|
| For example, an LLM is capable (and it's highly plausible
| for it to do so) of creating a reference to a non-
| existent source. Humans generally don't do that when
| their goal is clear and aligned (hence deterministic).
|
| > Building a process to get a similar confidence in LLM
| output is part of the game.
|
| Which is precisely my point. LLMs are supposed to be
| _better_ than humans. We're (currently) shoehorning the
| technology.
| extr wrote:
| > Humans generally don't do that when their goal is clear
| and aligned (hence deterministic).
|
| Look at the language you're using here. Humans
| "generally" make less of these kinds of errors.
| "Generally". That is literally an assessment of
| likelihood. It is completely possible for me to hire
| someone so stupid that they create a reference to a non-
| existent source. It's completely possible for my high IQ
| genius employee who is correct 99.99% of the time to have
| an off-day and accidentally fat finger something. It
| happens. Perhaps it happens at 1/100th of the rate that
| an LLM would do it. But that is simply an input to the
| model of the process or system I'm trying to build that I
| need to account for.
| spookie wrote:
| When humans make mistakes repeatedly in their job they
| get fired.
| Scubabear68 wrote:
| I have to disagree. There are many areas where things are
| extremely deterministic, regulated financial services
| being one of those areas. As one example of zillions,
| look at something like Bond Math. All of it is very well
| defined, all the way down to what calendar model you will
| use (30/360 or what have you), rounding, etc. It's
| all extremely well defined specifically so you can get
| apple to apple comparisons in the market place.
|
| The same applies to my checkbook, and many other areas of
| either calculating actuals or where future state is well
| defined by a model.
|
| That said, there _can_ be a statistical aspect to any
| spreadsheet model. Obviously. But not all spreadsheets
| are statistical, and therein lies the rub. If an LLM
| wants to hallucinate a 9,000 day yearly calendar because
| it confuses our notion of a year with one of the outer
| planets, that falls well within probability, but not
| within determinism following well-defined rules.
|
| The other side of the issue is LLMs trained on the
| Internet. What are the chances that Claude or whatever is
| going to make a change based on a widely prevalent but
| incorrect spreadsheet it found on some random corner of
| the Internet? Do I want Claude breaking my well-honed
| spreadsheet because Floyd in Nebraska counted sheep wrong
| in a spreadsheet he uploaded and forgot about 5 years
| ago, and Claude found it relevant?
| sothatsit wrote:
| I don't think tools like Claude are there yet, but I already
| trust GPT-5 Pro to be more diligent about catching bugs in
| software than me, even when I am trying to be very careful. I
| expect even just using these tools to help review existing
| Excel spreadsheets could lead to a significant boost in
| quality if software is any guide (and Excel spreadsheets seem
| even worse than software when it comes to errors).
|
| That said, Claude is still quite behind GPT-5 in its ability
| to review code, and so I'm not sure how much to expect from
| Sonnet 4.5 in this new domain. OpenAI could probably do
| better.
| admdly wrote:
| > That said, Claude is still quite behind GPT-5 in its
| ability to review code, and so I'm not sure how much to
| expect from Sonnet 4.5 in this new domain. OpenAI could
| probably do better.
|
| It's always interesting to see others opinions as it's
| still so variable and "vibe" based. Personally, for my use,
| the idea that any GPT-5 model is superior to Claude just
| doesn't resonate - and I use both regularly for similar
| tasks.
| sothatsit wrote:
| I also find the subjective nature of these models
| interesting, but in this case the difference in my
| experiences between Sonnet 4.5 and GPT-5 Codex, and
| especially GPT-5 Pro, for code review is pretty stark.
| GPT-5 is consistently much better at hard logic problems,
| which code review often involves.
|
| I have had GPT-5 point out dozens of complex bugs to me.
| Often in these cases I will try to see if other models
| can spot the same problems, and Gemini has occasionally
| but the Claude models never have (using Opus 4, 4.1, and
| Sonnet 4.5). These are bugs like complex race conditions
| or deadlocks that involve complex interactions between
| different parts of the codebase. GPT-5 and Gemini can
| spot these types of bugs with a decent accuracy, while
| I've never had Claude point out a bug like this.
|
| If you haven't tried it, I would try the codex /review
| feature and compare its results to asking Sonnet to do a
| review. For me, the difference is very clear for code
| review. For actual coding tasks, both models are much
| more varied, but for code review I've never had an
| instance where Claude pointed out a serious bug that
| GPT-5 missed. And I use these tools for code review all
| the time.
| bcrosby95 wrote:
| I've noticed something similar. I've been working on some
| concurrency libraries for Elixir, and Claude constantly
| gets things wrong, but GPT-5 can recognize the techniques
| I'm using and the tradeoffs.
| meowface wrote:
| Try the TypeScript codex CLI with the gpt-5-codex model
| with reasoning always set to high, or GPT-5 Pro with max
| reasoning. Both are currently undeniably better than
| Claude Opus 4.1 or Sonnet 4.5 (max reasoning or
| otherwise) for all code-related tasks. Much slower but
| more reliable and more intelligent.
|
| I've been a Claude Code fanboy for many months but OpenAI
| simply won this leg of the race, for now.
| typpilol wrote:
| Same. I switched from Sonnet 4 to Codex when it came out.
| Went back to try Sonnet 4.5, and it really hates to work
| for longer than like 5 minutes at a time.
|
| Codex, meanwhile, seems to be smarter and plugs away at a
| massive todo list for like 2 hours.
| scoot wrote:
| Or you could, you know, read the article before commenting to
| see the limited scope of this integration?
|
| Anyway, Google has already integrated Gemini into Sheets, and
| recently added direct spreadsheet editing capability, so your
| comment was disproven before you even wrote it.
| silenced_trope wrote:
| > The key is really this - all LLMs that I know of rely on
| entropy and randomness to emulate human creativity. This
| works pretty well for pretty pictures and creating fan
| fiction or emulating someone's voice.
|
| I think you need to turn down the temperature a little bit.
| This could be a beneficial change.
| scosman wrote:
| > all LLMs that I know of rely on entropy and randomness to
| emulate human creativity
|
| Those are tuneable parameters. Turn down the temperature and
| top_p if you don't want the creativity.
|
| > Claude is not going to improve this.
|
| We can measure models vs humans and figure this out.
|
| To your own point, humans already make "rampant" mistakes.
| With models, we can scale inference time compute to catch and
| eliminate mistakes, for example: run 6x independent
| validators using different methodologies.
|
| One-shot financial models are a bad idea, but properly
| designed systems can probably match or beat humans pretty
| quickly.
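One way to read "run 6x independent validators" above is a quorum scheme: accept an answer only when several independent checks agree on it, and escalate to a human otherwise. A minimal sketch, assuming the validators are plain callables standing in for separate model calls (a real system would give each one a different prompt or methodology):

```python
from collections import Counter

def majority_answer(validators, question, quorum=4):
    """Accept an answer only if at least `quorum` of the
    independent validators agree on it; return None to
    signal the result needs human review. Each validator
    here is a plain callable; in a real system each would
    be a separate model call with its own methodology."""
    votes = Counter(v(question) for v in validators)
    answer, count = votes.most_common(1)[0]
    return answer if count >= quorum else None

# Six stand-in "validators" (a real system would call an LLM).
validators = [lambda q: "4"] * 5 + [lambda q: "5"]
print(majority_answer(validators, "What is 2+2?"))  # prints "4"
```

The quorum threshold is the knob that trades throughput against the error rate the thread is arguing about: raise it and more answers get kicked to a human.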
| th0ma5 wrote:
| > Turn down the temperature and top_p if you don't want the
| creativity.
|
| This also reduces accuracy in real terms. The randomness is
| used to jump out of local minima.
| scosman wrote:
| That's at training time, not inference time. And
| temp/top_p aren't used to escape local minima, methods
| like SGD batch sampling, Adam, dropout, LR decay, and
| other techniques do that.
| hansmayer wrote:
| > Those are tuneable parameters. Turn down the temperature
| and top_p if you don't want the creativity.
|
| Ah yes, we'll tell Mary from the Payroll she could just
| tune them parameters if there is more than "like 2%" error
| in her spreadsheets
| scosman wrote:
| No one said it was a user setting. The person building
| the spreadsheet agent system would tune the hyper-
| parameters with a series of eval sets.
| sally_glance wrote:
| Having AI create the spreadsheet you want is totally
| possible, just like generating bash scripts works well. But
| to get good results, there needs to be some documentation
| describing all the hidden relationships and nasty workarounds
| first.
|
| Don't try to make LLMs generate results or numbers, that's
| bound to fail in any case. But they're okay to generate a
| starting point for automations (like Excel sheets with lots
| of formulas and macros), given they get access to the same
| context we have in our heads.
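To make the "starting point for automations" idea above concrete: an LLM-generated sheet skeleton is ultimately just data cells plus formula strings, which Excel evaluates when the file is opened. A hypothetical sketch using only the standard library (the column layout and formulas are invented for illustration):

```python
import csv
import io

# Hypothetical monthly summary: literal values plus Excel
# formula strings. Excel treats cells starting with "=" as
# formulas when it opens the CSV.
rows = [
    ["Month", "Revenue", "Costs", "Profit"],
    ["Jan", 1000, 400, "=B2-C2"],
    ["Feb", 1200, 500, "=B3-C3"],
    ["Total", "=SUM(B2:B3)", "=SUM(C2:C3)", "=SUM(D2:D3)"],
]

buf = io.StringIO()
csv.writer(buf).writerows(rows)
print(buf.getvalue())
```

The human still owns the review step: the generated formulas are a scaffold to check against the hidden relationships the comment mentions, not a finished model.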
| bnug wrote:
| I like this take. There seems to be an over-focus on 'one-
| shot' results, but I've found that even the free tools are
| a significant productivity booster when you focus on
| generating smaller pieces of code that you can verify.
| Maybe I'm behind the power curve since I'm not leveraging
| the full capability of the advanced LLM's, but if the
| argument is disaster is right around the corner due to
| potential hallucinations, I think we should consider that
| you still have to check your work for mission critical
| systems. That said, I don't really build mission critical
| systems - I just work in Aerospace Engineering and like
| building small time saving scripts / macros for other
| engineers to use. For this use, free LLMs even have been
| huge for me. Maybe I'm in a very small minority, but I do
| use Excel & Python nearly every day.
| mountainriver wrote:
| You can do it cursor style
| hoistbypetard wrote:
| IMO people tend to over-trust both AI and Excel. Maybe this
| will recalibrate that after it leads to a catastrophic
| business failure or two.
| phatfish wrote:
| You would hope so. But how many companies have actually
| changed their IT policy of outsourcing everything to Tata
| Consultancy Services (or similar) where a sweaty office in
| Mumbai full of people who don't give a shit run critical
| infrastructure?
|
| Jaguar Land Rover had production stopped for over a month,
| I think, with a 100+ million impact on their business
| (including a trail of smaller suppliers put near
| bankruptcy). I'd bet
| Tata are still there and embedded even further in 5 years.
|
| If AI provides some day-to-day running cost reduction that
| looks good on quarterly financial statements it will be
| fully embraced, despite the odd "act of god".
| gpderetta wrote:
| to be clear, tata owns JLR.
| phatfish wrote:
| Indeed, that slipped my mind. However, the Marks and
| Spencer hack was also their fault. Just searching on it
| now, it seems there is a ray of hope, although I have a
| feeling the response won't be a well-trained
| onshore/internal IT department. It will be another
| offshore outsourcing jaunt, but with better compensation
| for incompetent staff on the outsourcer's side.
|
| "Marks & Spencer Cuts Ties With Tata Consultancy Services
| Amid £300m Cyber Attack Fallout" (ibtimes.co.uk)
| jbs789 wrote:
| I tend to agree that dropping the tool as it is into
| untrained hands is going to be catastrophic.
|
| I've had similar professional experiences as you and have
| been experimenting with Claude Code. I've found I really need
| to know what I'm doing and the detail in order to make
| effective (safe) use out of it. And that's been a learning
| curve.
|
| The one area I hope/think it's closest to (given comments
| above) is potentially as a "checker" or validator.
|
| But even then I'd consider the extent to which it leaks data,
| steers me the wrong way, or misses something.
|
| The other case may be mocking up a simple financial model for
| a test / to bounce ideas around. But without very detailed
| manual review (as a mitigating check), I wouldn't trust it.
|
| So yeah... that's the experience of someone who maybe bridges
| these worlds somewhat... And I think many out there see the
| tough (detailed) road ahead, while these companies are racing
| to monetize.
| stocksinsmocks wrote:
| My take is more optimistic. This could be an off ramp to stop
| putting critical business workflows in spreadsheets. If
| people start to learn that general purpose programming
| languages are actually easier than Excel (and with LLMs,
| there is no barrier), then maybe more robust workflows and
| automation will be the norm.
|
| I think the world would be a lot better off if excel weren't
| in it. For example, I work at a business with 50K+ employees
| where project management is done in a hellish spreadsheet
| literally one guy in Australia understands. Data entry errors
| can be anywhere and are incomprehensible. 3 or 4 versions are
| floating around to support old projects. A CRUD app with a
| web front end would solve it all. Yet it persists because
| Excel is erroneously seen as accessible whereas Rails,
| Django, or literally anything else is witchcraft.
| player1234 wrote:
| There was never a barrier to automating your office work
| with python unless you are a moron.
|
| Who fooled the world into thinking that scripting some
| known workflow of yours is fucking rocket science? It
| should be a requirement to even enter the fucking office
| building.
| xbmcuser wrote:
| In my opinion, the biggest use case for spreadsheets with
| LLMs is to ask them to build Python scripts to do whatever
| manipulations you want to do with the data. Once people
| learn to do this, workplace productivity will increase
| greatly. I have been using LLMs for years now to write
| Python scripts that automate different repeatable tasks.
| Want a PDF of this data to be overlaid on this file?
| Create a Python script with an LLM. Want the data exported
| out of this to be formatted and tallied? Create a script
| for that.
| calgoo wrote:
| Yesterday I had to pass a bunch of data to finance, as the
| person who usually did so had left the company. They wanted
| me to basically group by a few columns, so instead of
| spending an hour on this in Excel, I created 3 rows of fake
| data and gave it to the LLM, which created a Python script
| that I ran against the dataset. After manual verification
| of the results, it could be submitted to finance.
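The kind of script described above is small enough to sketch with only the standard library. This is a guess at its shape, with hypothetical column names; the point is that the logic is simple enough to verify by hand against a few fake rows before running it on the real dataset:

```python
import csv
import io
from collections import defaultdict

# Stand-in for the real export: a couple of fake rows with a
# pair of columns to group by and a numeric column to total.
data = io.StringIO(
    "department,cost_center,amount\n"
    "logistics,EU,100.50\n"
    "logistics,EU,49.50\n"
    "finance,US,200.00\n"
)

# Group by (department, cost_center) and sum the amounts.
totals = defaultdict(float)
for row in csv.DictReader(data):
    totals[(row["department"], row["cost_center"])] += float(row["amount"])

for (dept, cc), amount in sorted(totals.items()):
    print(f"{dept},{cc},{amount:.2f}")
```

Running it on the three fake rows first, as the comment describes, is exactly the manual verification step: the grouped totals are easy to check by eye before the script touches the real data.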
| jb1991 wrote:
| Congrats? But you are not likely a typical user.
| brabel wrote:
| That's exactly how it should be done if accuracy is
| important.
| xbmcuser wrote:
| Yeah I am not a programmer just more tech literate than
| most as I have always been fascinated by tech. I think
| people are missing the forest for the trees when it comes
| to LLMs. I have been using them to create simple bash,
| bat, and Python scripts which I would not have been able
| to put together before, even with weeks of googling. I say
| that because I used to try that unsuccessfully, but my
| success rate has gone through the roof with LLMs.
|
| Now I just ask an LLM to create the scripts and explain
| all the steps. If it is a complex script I would also ask
| it to add logging to the script so that I can feed the
| log back to the LLM and explain what is going wrong which
| allowed for a lot faster fixes. In the early days, the LLM
| and I would go around in circles until I hit the token
| limits and had to start from scratch again.
| player1234 wrote:
| Learn Python; the subscription for that knowledge won't
| be jacked up to $2,000/month when the VC money dries up.
| player1234 wrote:
| Just learn Python; what are you, a child?
| PatronBernard wrote:
| How will people without Python knowledge know that the
| script is 100% correct? You can say "Well they shouldn't
| use it for mission critical stuff" or "Yeah that's not a
| use case, it could be useful for qualitative analysis"
| etc., but you bet they will use it for everything. People
| use ChatGPT as a search engine and a therapist, which tells
| us enough.
| 010101010101 wrote:
| If you have a mechanism that can prove arbitrary program
| correctness with 100% accuracy you're sitting on
| something more valuable than LLMs.
| tonyhart7 wrote:
| so human powered LLM user ??
| freedomben wrote:
| For sure, I've never seen a human write a bug or make a
| mistake in programming
| tonyhart7 wrote:
| that's why we create LLM for that
| player1234 wrote:
| Basic python knowledge should be a requirement for any
| office job.
|
| Spending trillions on LLMs to automate what can be done
| with good old reliable scripting is absurd. We haven't
| automated shit yet.
| hansmayer wrote:
| Yeah, it's like that commercial for OpenAI (or was it
| Gemini?) where the guy says he lets the tool work on his
| complex financial spreadsheets, goes for a walk with his
| dog, and comes back to find it done with "like 98%
| accuracy". I cannot
| imagine what the 2% margin of error looks like for a company
| that moves around hundreds of billions of dollars...
| lacker wrote:
| It's like the negativity whenever a post talks about hiring or
| firing. A lot of people are afraid that they are going to lose
| their jobs to AI.
| pluc wrote:
| Anthropic now has all your company's data, and all you saved
| was the cost of one human minus however much they charge for
| this. The good news is it can't have your data _again_! So
| starting from the 163rd-165th person you fire, you start to see
| a good return, and all you've sacrificed is exactitude,
| precision, judgement, customer service and a little bit of
| public perception!
| mapt wrote:
| The vast majority of people in business and science are using
| spreadsheets for complex algorithmic things they weren't really
| designed for, and we find a metric fuckton of errors in the
| sheets when you actually bother auditing them, mistakes
| which are not at all obvious without troubleshooting by...
| manually checking each and every cell & cell relation, peering
| through parentheses, following references. It's a nightmare to
| troubleshoot.
|
| LLMs specialize in making up plausible things with a minimum of
| human effort, but their downside is that they're very good at
| making up plausible things which are covertly erroneous. It's a
| nightmare to troubleshoot.
|
| There is already an abject inability to provision the labor to
| verify Excel reasoning when it's composed by humans.
|
| I'm dead certain that Claude will be able to produce plausibly
| correct spreadsheets. How important is accuracy to you? How
| life-critical is the end result? What are your odds, with the
| current auditing workflow?
|
| Okay! Now! Half of the users just got laid off because
| management thinks Claude is Good Enough. How about now?
| practice9 wrote:
| LLMs are getting quite good at reviewing the results and
| implementations, though
| lionkor wrote:
| Not really, they're only as good as their context and they
| do miss and forget important things. It doesn't matter how
| often, because they do, and they will tell you with 100%
| confidence and with every synonym of "sure" that they
| caught it all. That's the issue.
| sothatsit wrote:
| I am very confident that these tools are better than the
| median programmer at code review now. They are certainly
| much more diligent. An actually useful standard to
| compare them to is human review, and for technical
| problems, they definitely pass it. That said, they're
| still not great at giving design feedback.
|
| But GPT-5 Pro, and to a certain extent GPT-5 Codex, can
| spot complex bugs like race conditions, or subtly
| incorrect logic like memory misuse in C, remarkably well.
| It is a shame GPT-5 Pro is locked behind a $200/month
| subscription, which means most people do not understand
| just how good the frontier models are at this type of
| task now.
| rchaud wrote:
| I'd say the vast majority of Excel users in business are
| working off of a CSV sent from their database/ERP team or
| exported from a self-serve analytics tool and using pivot
| tables to do the heavy lifting, where it's nearly impossible
| to get something wrong. Investment banks and trading desks
| are different, and usually have an in-house IT team building
| custom extensions into Excel or training staff to use bespoke
| software. That's still a very small minority of Excel users.
| atleastoptimal wrote:
| HN has a base of strong anti-AI bias, I assume is partially
| motivated by insecurity over being replaced, losing their jobs
| or having missed the boat on the AI.
| extr wrote:
| Based on the comments here, it's surprising that anything
| in society works at all. I didn't realize the bar was
| "everything perfect every time, perfectly flexible and
| adaptable". What a joy some of these folks must be to work
| with, answering every new technology with endless reasons why
| it's worthless and will never work.
| jay_kyburz wrote:
| I think perhaps you underestimate how antithetical the
| current batch of LLM AIs is to what most programmers
| strive for every day, and what we want from our tools.
| It's not about losing our jobs, it's about "correctness"
| (or, as said below, determinism).
|
| In a lot of jobs, particularly in creative industries, or
| marketing, media and writing, the definition of a job well
| done is a fairly grey area. I think AI will be mostly
| disruptive in these areas.
|
| But in programming there is a hard minimum of quality.
| Given a set of inputs, does the program return the correct
| answer or not? When you ask it what 2+2 is, do you get 4?
|
| When you ask AI anything, it might be right 50% of the
| time, or 70% of the time, but you can't blindly trust the
| answer. A lot of us just find that not very useful.
| Aeolun wrote:
| Most of the time when using AI, I have a lot more than one
| shot to ensure everything is correct.
| ytoawwhra92 wrote:
| > But in programming there is a hard minimum of quality.
| Given a set of inputs, does the program return the
| correct answer or not? When you ask it what 2+2, do you
| get 4?
|
| Whether something works or not matters less than whether
| someone will pay for it.
| extr wrote:
| I am a SWE myself and use LLMs to write ~100% of my code.
| That does not mean I fire and forget multiplexed codex
| instances. Many times I step through and approve every
| edit. Even if it was nothing but a glorified stenographer
| - there are substantial time savings in being able to
| prototype and validate ideas quickly.
| MattGaiser wrote:
| HN has an obsession with quality too, which has merit, but is
| often economically irrelevant.
|
| When US-East-1 failed, lots of people talked about how the
| lesson was cloud agnosticism and multi cloud architecture.
| The practical economic lesson for most is that if US-East-1
| fails, nobody will get mad at you. Cloud failure is viewed as
| an act of god.
| hypeatei wrote:
| > HN has a base of strong anti-AI bias
|
| Quite the opposite, actually. You can always find five
| stories on the front page about some AI product or feature.
| Meanwhile, you have people like yourself who convince
| themselves that any pushback is done by people who just don't
| see the true value of it yet and that they're about to miss
| out!! Some kind of attempt at spreading FOMO, I guess.
| lionkor wrote:
| I use AI every day. Without oversight, it does not work well.
|
| If it doesn't work well, I will do it myself, because I care
| that things are done well.
|
| None of this is me being scared of being replaced; quite the
| opposite. I'm one of the last generations of programmers who
| learned how to program and can debug and fix the mess your
| LLM leaves behind when you forgot to add "make sure it's a
| clean design and works" to the prompt.
|
| Okay, that's maybe hyperbole, but sadly only a little bit.
| LLMs make me better at my job, they don't replace me.
| sothatsit wrote:
| I really don't think this is accurate. I think the median
| opinion here is to be suspicious of claims made about AI, and
| I don't think that's necessarily a bad thing. But I also
| regularly see posts talking about AI positively (e.g.
| simonw), or talking about it negatively. I think this is a
| good thing, it is nice to have a diversity of opinions on a
| technology. It's a feature, not a bug.
| crote wrote:
| > HN has a base of strong anti-AI bias
|
| If anything, HN has a _pro-AI_ bias. I don't know of _any_
| other medium where discussions about AI consistently get this
| much frontpage time, this amount of discussion, and this many
| people reporting positive experiences with it. It's
| definitely true that HN isn't the raging pro-AI hypetrain it
| was two years ago, but that shouldn't be mistaken for "strong
| anti-AI bias".
|
| Outside of HN I am seeing, _at best_, an ambivalent
| reaction: plenty of people are interested, almost everyone
| has tried it, very few people genuinely like it. They are
| happy to use it when it is convenient, but couldn't care
| less if it disappeared tomorrow.
|
| There's also a small but vocal group which absolutely _hates_
| AI and will actively boycott any creative-related company
| stupid enough to admit to using it, but that crowd doesn't
| really seem to hang out on HN.
| impjohn wrote:
| >but couldn't care less if it disappeared tomorrow.
|
| Wonder how true that is. Some things incorporate themselves
| into your life so subtly that you only become aware of them
| when they're totally switched off.
| sph wrote:
| > There's also a small but vocal group which absolutely
| hates AI and will actively boycott any creative-related
| company stupid enough to admit to using it, but that crowd
| doesn't really seem to hang out on HN.
|
| I do, but I certainly feel in the minority in here.
| mr_toad wrote:
| > HN has a base of strong anti-AI bias
|
| HN constantly points out the flaws, gaps, and failings of AI.
| But the same is true of any technology discussed on HN. You
| could describe HN as having an anti-technology bias, because
| HN complains about the failings of tech all day every day.
| StarterPro wrote:
| Anti-AI bias is motivated by the waste of natural resources
| due to a handful of non-technical douchebag tech bros.
|
| Everything isn't about money; I know that status and power
| are all you AI narcissists dream about. But you'll never be
| Bill Gates, nor will you be Elon Musk.
|
| Once AI has gone the way of "Web3", "NFTs", "blockchain", "3D
| TVs", etc., you'll find a new grift to latch your life
| savings onto.
| A4ET8a8uTh0_v2 wrote:
| It is bad in a very specific sense, but I did not see any
| other comments express the bad parts instead of focusing
| merely on the accuracy part (which is an issue, but not the
| issue):
|
| - this opens up a ridiculous flood of data that would
| otherwise be semi-private to the one company providing this
| service
|
| - this works well on small data sets, but will choke on ones
| it needs to divvy up into chunks, inviting interesting (and
| as yet unknown) errors
|
| There is a real benefit to being able to 'talk to data', but
| anyone who has seen corporate culture up close and personal
| knows exactly where it will end.
|
| edit: and I'm saying all this as a person who actually likes
| LLMs.
| mceoin wrote:
| I second this. Spreadsheets are the primary tool used for 15%
| of the U.S. economy. Productivity improvements will affect
| hundreds of millions of users globally. Each increment in
| progress is a massive time save and value add.
|
| The criticisms broadly fall between "spreadsheets are bad" and
| "AI will cause more trouble than it solves".
|
| This release is a dot in a trend towards everyone having a
| Goldman-Sachs level analyst at their disposal 24/7. This is a
| huge deal for the average person or business. Our expectation
| (disclaimer: I work in this space) is that spreadsheet
| intelligence will soon be a solved problem. The "harder"
| problem is the instruction set and human <> machine prompting.
|
| For the "spreadsheets are bad" crowd -- sure, they have
| problems, but users have spoken and they are the preferred
| interface for analysis, project management and lightweight
| database work globally. All solutions to "the spreadsheet
| problem" come with their own UX and usability tradeoffs, so
| it's a balance.
|
| Congrats to the Claude team and looking forward to the next
| release!
| bonoboTP wrote:
| > Each increment in progress is a massive time save and value
| add.
|
| Based on the history of digitalization of businesses from the
| 1980s onwards, the spreadsheets will just balloon in number
| and size and there will be more rules and more procedures and
| more forms and reports to file until the efficiency gains are
| neutralized (or almost neutralized).
| mceoin wrote:
| We'll hit a new plateau somewhere, for sure. Still, I'm
| glad I'm not doing my spreadsheets on paper so net win so
| far!
| trollbridge wrote:
| The biggest problem with spreadsheets is that they tend to be
| accounts for the accumulation of technical debt, a kind of
| debt that AI tools are not yet very good at retiring, but
| very good at making additional withdrawals from.
| burnte wrote:
| > What is with the negativity in these comments?
|
| A lot of us have seen the effects of AI tools in the hands of
| people who don't understand how or why to use the tools. I've
| already seen AI use/misuse get two people fired. One was a
| line-of-business employee who relied on output without ever
| checking it, got herself into a pretty deep hole in 3 weeks.
| Another was a C suite person who tried to run an AI tool
| development project and wasted double their salary in 3 months,
| nothing to show for it but the bill, fired.
|
| In both cases the person did not understand the limits of the
| tools and kept replacing facts with their desires and their own
| misunderstanding of AI. The C suite person even tried to tell a
| vendor they were wrong about their own product because "I found
| out from AI".
|
| AI right now is fireworks. It's great when you know how to use
| it, but if you half-ass it you'll blow your fingers off very
| easily.
| liqilin1567 wrote:
| Yeah, the danger lies not in AI itself, but in inexperienced
| users treating it as a magic solution.
| topaz0 wrote:
| It's a bit much to blame the user for this when the product
| is crafted specifically to give the impression of being
| magical. Not to mention the marketing and media.
| II2II wrote:
| > but these jobs are going to be the first on the chopping
| block as these integrations mature.
|
| I'm not even sure that has to be true anymore. From my
| admittedly superficial impression of the page, this appears to
| be a tool for building tools. There are plenty of organizations
| that are resource-constrained, that are doing things the way
| they have always done things in Excel, simply because they
| cannot allocate someone to modify what is already in place to
| better suit their current needs. For them, this is more of a
| quality-of-life and quality-of-output improvement. This is not
| like traditional software development, where organizations are
| far more likely to purchase a product or service to do a job
| (and where the vendors of those products and services are going
| to do their best to eliminate developers).
| giancarlostoro wrote:
| Honestly, as a dev I hate Excel; it's a whole mess I don't
| understand. I will gladly use Claude for Excel. It will
| understand the business needs behind the data better than I,
| a mere developer just trying to get back to regular developer
| work.
| nelox wrote:
| Indeed. Take Health New Zealand as an example; it managed its
| entire NZD$28 billion budget (USD$16B) in a single Excel
| spreadsheet.
|
| https://www.theregister.com/2025/03/10/nz_health_excel_sprea...
|
| [edit: Added link]
| singleshot_ wrote:
| Can't speak for everyone, but the reason I'm negative in the
| context of this idea is that it's a stupid idea.
| timpieces wrote:
| Yes it's surprising to see so much cynicism for something that
| has a real possibility of making so many people so much more
| productive. My mental model of the average excel user is of
| someone who doesn't care about excel, but cares about their
| business. If Claude can help them use excel and learn about
| their business faster, then this should make the world more
| productive and we all get richer. Claude can make mistakes, but
| it's not clear to me why people think that the ratio of results
| to mistakes will get worse here. I think there are many
| possible reasons why this could not work out, but many of the
| comments here just seem like unfounded cynicism.
| meesles wrote:
| My theory: a lot of software we build is the supposed solve
| for a 'crappy spreadsheet'. a) That isn't much of a moat; b)
| you're watching the generalization of software happen in real
| time.
| impjohn wrote:
| Crappy spreadsheet is just the codification of business
| processes. Those are inherently messy and there's lots of
| assumptions, lots of edge cases. That's why spreadsheets tend
| towards crappy on a long enough timeline. It's a
| fundamentally messy problem.
|
| Spreadsheets are an abstraction over a messy reality, lossy.
| They were already generalizing reality.
|
| Now we generalize the generalization. It is this lossiness
| that people on HN are worried about with AI.
| fragmede wrote:
| > What is with the negativity in these comments?
|
| > these jobs are going to be the first on the chopping block as
| these integrations mature.
|
| Those two things are maybe related? So many of my friends don't
| enjoy the same privileges as I do, and have a more tenuous
| connection to being gainfully employed.
| eviks wrote:
| > offense to these people but Sonnet 4.5 is already at the
| level where it would be able to replicate or beat the level of
| analysis they typically provide.
|
| No offense, but this is pure fantasy. The level of analysis
| they typically provide doesn't suffer from the same high
| baseline rate of completely made-up numbers as your favorite
| LLM.
| rekabis wrote:
| > Even just basic automation/scaffolding of spreadsheets would
| be a big productivity boost for many employees.
|
| When most of it is wild hallucinations? Not really.
|
| For many employees leveraging Excel for manipulating important
| data, it could cripple careers.
|
| For spreadsheets that influence financial decisions or touch
| PPI/PII, it could lead to regulatory disasters and even
| bankruptcies.
|
| Purge hallucinations from LLMs, _then_ let them touch the
| important shite. Doing it in the reverse order is just
| begging for a FAFO apocalypse.
| UltraSane wrote:
| You would be far better off using an LLM to replace a complex
| spreadsheet with a Python script and SQLite.
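[Editor's note] As a rough illustration of that suggestion, here is a minimal sketch of the spreadsheet-to-script move using only the Python standard library. The table name, columns, and figures are invented; the point is that a pivot-table-style aggregation becomes one SQL query.

```python
import csv
import io
import sqlite3

# Invented ledger data that might otherwise live in a spreadsheet tab.
LEDGER_CSV = """item,region,price
widget,north,30
gadget,south,40
widget,south,20
"""

# Load the CSV rows into an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ledger (item TEXT, region TEXT, price REAL)")
rows = list(csv.DictReader(io.StringIO(LEDGER_CSV)))
conn.executemany("INSERT INTO ledger VALUES (:item, :region, :price)", rows)

# One query replaces a pivot table: total price per item.
totals = dict(conn.execute("SELECT item, SUM(price) FROM ledger GROUP BY item"))
print(totals)
```

Unlike a workbook, the query is versionable, diffable, and gives the same answer every run.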
| 3uler wrote:
| The lady doth protest too much. People see every AI
| limitation crystal clear, but have zero self-awareness of
| their own fallibility.
| vincnetas wrote:
| Non-reproducability is the biggest issue here. You deliver a
| report in 5 minutes to CFO, he comes back after lunch, gives
| you updated data to adjust a bit of a report and 5 minutes
| later gets a new report that has some non related to update
| number changed and asks why? what do you do?
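[Editor's note] One mechanical safeguard for exactly this failure mode is to diff the workbook's cell values before and after the AI pass and flag any change outside the cells the update was supposed to touch. A minimal sketch; the cell references and values here are invented:

```python
def diff_cells(before: dict, after: dict, expected: set) -> dict:
    """Return cells whose value changed outside the expected set.

    `before`/`after` map cell references (e.g. "B7") to values;
    `expected` lists the cells the update was supposed to touch.
    """
    changed = {
        ref: (before.get(ref), after.get(ref))
        for ref in set(before) | set(after)
        if before.get(ref) != after.get(ref)
    }
    # Anything changed but not expected is a red flag to investigate.
    return {ref: vals for ref, vals in changed.items() if ref not in expected}


before = {"B2": 100, "B3": 200, "B7": 300}
after = {"B2": 120, "B3": 200, "B7": 999}  # B7 changed unexpectedly
print(diff_cells(before, after, expected={"B2"}))
```

An empty result means the edit stayed in its lane; a non-empty one is the CFO's question, caught before lunch.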
| atwrk wrote:
| Because people will be deeply affected by this, and not in the
| positive way. We already had this with copilot:
| https://i.imgur.com/nguIAsv.jpeg
|
| Just as with copilot, this combines LLM's inability to
| repeatably do math correctly with peoples' overassurance in
| LLM's capabilities.
| hoppp wrote:
| I don't like to use excel so if I ever have to touch it I will
| use AI.
| lizardking wrote:
| First time at HN?
| ferguess_k wrote:
| > No offense to these people but Sonnet 4.5 is already at the
| level where it would be able to replicate or beat the level of
| analysis they typically provide.
|
| If this is true, then why would your wife be happy about it?
| I find that really hard to understand. Do you prefer your
| wife to be jobless while her employer happily cuts costs
| without impacting productivity? Even if it _just_ replaces
| the line workers, do you think your wife's job is going to be
| safe?
|
| I don't get it.
| slightwinder wrote:
| > What is with the negativity in these comments?
|
| Excel and AIs are huge clusterfucks on their own, where
| insane errors happen for various reasons. Combine them, and
| maybe we will see improvement, but surely we will also see
| catastrophic outcomes which could ruin not only the lives of
| ordinary people but whole companies and countries, as has
| already happened before...
| martinald wrote:
| This is going to be massive if it works as well as I suspect it
| might.
|
| I think many software engineers overlook how many companies have
| huge (billion dollar) processes run through Excel.
|
| It's much less about 'greenfield' new excel sheets and much more
| about fixing/improving existing ones. If it works as well as
| Claude Code works for code, then it will get pretty crazy
| adoption I suspect (unless Microsoft beats them to it).
| thewebguyd wrote:
| > This is going to be massive if it works as well as I suspect
| it might.
|
| Until Microsoft does its anti-competitive thing and finds a
| way to break this in the file format, because this is exactly
| what Copilot in Excel does.
|
| That said, Copilot in Excel is pretty much hot garbage still so
| anything will be better than that.
| NotMichaelBay wrote:
| What do you mean, what is copilot in excel doing exactly?
| lm28469 wrote:
| > I think many software engineers overlook how many companies
| have huge (billion dollar) processes run through Excel.
|
| So they can fire the two dudes that take care of it, lose 15
| years of in house knowledge to save 200k a year and cry in a
| few months when their magic tool shits the bed ?
|
| Massive win indeed
| bsenftner wrote:
| If the company is half baked, those "two dudes" will become
| indispensable beyond belief. They are the ones that
| understand how Excel works far deeper, and paired with Claude
| for Excel they become far far more valuable.
| Balgair wrote:
| At my org it's more that these AI tools finally allow the
| employees to get through things at all. The deadlines are
| getting met for the first time, maybe ever. We can at last
| get to the projects that will make the company money
| instead of chasing ghosts from 2021. The burn-down charts
| are warm now.
| brookst wrote:
| You think it's better for the company to have "two dudes"
| that are completely indispensable and whose work will be
| completely useless if they die / leave?
|
| I think you're making an argument _for_ LLMs, not against.
| lm28469 wrote:
| These two dudes can train the next generation, you know,
| like we've been doing since humans have existed... instead
| of relying on some centralised point of failure thousands of
| km away which might or might not break your company whenever
| they decide to update something.
|
| You're one of the people who saw nothing wrong with moving
| all our industries to Asia, right? "It's cheaper so it's
| obviously better", if you don't think about any of the
| externalities and long-term consequences, sure...
| blitzar wrote:
| Management have been executing this genius plan for decades
| without Ai.
| warthog wrote:
| Tough day to be an AI Excel add-in startup
| jonathanstrange wrote:
| That seems to be true for any startup that offers a wrapper to
| existing AIs rather than an AI on their own. The lucky ones
| might be bought but many if not most of them will perish trying
| to compete with companies that actually create AI models and
| companies large enough to integrate their own wrappers.
| warthog wrote:
| Actually just wrote about this:
| https://aimode.substack.com/p/openai-is-below-above-and-
| arou...
|
| not sure if it's binary like that, but as startups we will
| probably be left collecting the scraps, indeed.
| 8note wrote:
| It's a great time for your AI Excel add-in to start getting
| acquired by a Claude competitor, though.
| NotMichaelBay wrote:
| Not OpenAI, though, because they already gave $14M to an AI
| Excel add-in startup (Endex)
| mitjam wrote:
| Ask Rosie is actually shutting down right now:
| https://www.askrosie.ai/
|
| I would love to learn more about their challenges as I have
| been working on an Excel AI add-in for quite some time and have
| followed Ask Rosie from almost their start.
|
| That they have now gone through the whole cycle worries me
| that I'm too slow as a solo founder building on the side in
| these fast-paced times.
| intended wrote:
| As an inveterate Excel lover, I can just sense the blinding pain
| wafting off the legions of accountants, associates, seniors, and
| tech people who keep the machine spirits placated.
|
| lies, damn lies, statistics, and then Excel deciding cell data
| types.
| garyclarke27 wrote:
| I guess Claude maybe useful for finding errors in large Excel
| Workbooks. May also help beginners to learn the more complex
| Excel functions (which are still pretty easy). But if you are
| proficient at building Excel models I don't see any benefit.
| Excel already has a superb very efficient UI for entering
| formulas, ranges, tables, data sources etc I'm sceptical that a
| different UI especially a text based one can improve on this.
| proteal wrote:
| I understand the sentiment about a skilled user not needing
| this, but I think having a little buddy that I can use to
| offload some menial tasks would be helpful for me to iterate
| through my models more efficiently; even if the AI is not
| perfect. As a highly skilled excel user, I admit the software
| has terrible ergonomics. It would be a productivity boon for me
| if an AI can help me stay focused on model design vs model
| implementation.
| intended wrote:
| For some reason, I find that these tools are TERRIBLE at
| helping someone learn. I suspect because turning one on
| results in turning the problem-solving part of one's brain
| off.
|
| It's obviously not the same experience for everyone. (If you
| are one of those energized while working in a chat window,
| you might be in a minority, given what we see from the
| ongoing massacre of brains in education.)
|
| Paraphrasing something I read here: "people don't use ChatGPT
| to learn more, they use it to study less".
|
| Maybe some folk would be better off.
| mattas wrote:
| I'm not excited about having LLMs generate spreadsheets or
| formulas. But, I think LLMs could be particularly useful in
| helping me find inconsistent formulas or errors that are
| challenging to identify. Especially in larger, complex
| spreadsheets touched by multiple people over the course of
| months.
| thesuitonym wrote:
| For once in my life, I actually had a delightful interaction
| with an LLM last week. I was changing some text in an Excel
| sheet in a very programmatic way that could easily have been
| done with the regex functions in Excel. But I'm not really
| great with regex, and it was only 15 or so cells, so I was
| content to just do it manually. After three or four cells,
| Copilot figured out what I was doing and suggested the rest
| of the changes for me.
|
| This is what I want AI to do, not generate wrong answers and
| hallucinate girlfriends.
| klausnrooster wrote:
| Thanks for reminding me to check if the REGEXEXTRACT,
| REGEXREPLACE, and REGEXTEST functions had landed for me yet.
| They have! Good, because sometime in 2027 the library
| providing RegEx in VBA will be yanked.
| https://youtu.be/pGH9LdgkJio
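[Editor's note] The kind of mechanical cell rewrite described in the anecdote above is exactly what a regex expresses. As an invented example of such a transformation (normalizing "Last, First" names to "First Last"), in Python it is one pattern and one substitution:

```python
import re

# Invented cell values to normalize from "Last, First" to "First Last".
cells = ["Smith, Jane", "Doe, John", "Kahlo, Frida"]

# Capture the two name parts, then swap them in the replacement.
pattern = re.compile(r"^\s*(\S+),\s*(\S+)\s*$")
fixed = [pattern.sub(r"\2 \1", cell) for cell in cells]
print(fixed)
```

Excel's REGEXREPLACE takes the same pattern and replacement, so the transformation carries over directly once those functions are available.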
| bambax wrote:
| One approach is to produce read-only data in BI tools: users
| are free to export anything they want and make their own
| spreadsheets, but those are for their own use only. Reference
| data is produced every day by a central, controlled process and
| cannot in any circumstance be modified by the end user.
|
| I have implemented this a couple of times and not only does it
| work well, it tends to be fairly well accepted. People need
| spreadsheets to work on them, but generally they kind of hate
| sending those around via email. Having a reference source of
| data is welcomed.
| gedy wrote:
| Cool but now companies POs will be like "you must add the Excel
| export for all the user data!" and when asked why, will basically
| be "so I can do this roundabout query of data for some number in
| a spreadsheet using AI (instead of just putting the number or
| chart directly in the product with a simple db call)"
| racl101 wrote:
| This could be huge! Very exciting!
| michaelmarkell wrote:
| IMO, a real solution here has to be hybrid, not full LLM, because
| these sheets can be massive and have very complicated structures.
| You want to be able to use the LLM to identify / map column
| headers, while using non-LLM tool calling to run Excel operations
| like SUMIFs or VLOOKUPs. One of the most important traits in
| these systems is consistency with slight variation in file
| layout, as so much Excel work involves consolidating /
| reconciling between reports made on a quarterly basis or produced
| by a variety of sources, with different reporting structures.
|
| Disclosure: My company builds ingestion pipelines for large
| multi-tab Excel files, PDFs, and CSVs.
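[Editor's note] The hybrid split described above can be sketched in a few lines. The alias table, column names, and figures are invented: in a real system the LLM would supply the header mapping, and everything after that point is deterministic arithmetic.

```python
# Stand-in for the LLM's job: map messy header names to a canonical
# meaning. Only this lookup would come from the model.
HEADER_ALIASES = {"price": "price", "unit price ($)": "price", "amt": "price"}


def resolve_price_column(headers: list) -> str:
    """Pick the header that maps to the canonical 'price' column."""
    for header in headers:
        if HEADER_ALIASES.get(header.strip().lower()) == "price":
            return header
    raise KeyError("no price-like column found")


def total(rows: list) -> float:
    """Deterministic aggregate over the resolved column, no LLM involved."""
    column = resolve_price_column(list(rows[0]))
    return sum(float(row[column]) for row in rows)


rows = [{"Item": "abc", "Unit Price ($)": "30"},
        {"Item": "cde", "Unit Price ($)": "40"}]
print(total(rows))
```

The model's fuzziness is confined to name resolution, where a mistake is visible and auditable; the arithmetic itself cannot hallucinate.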
| dcre wrote:
| That's exactly what they're doing.
|
| https://www.anthropic.com/news/advancing-claude-for-financia...
| levocardia wrote:
| "This won't work because (something obvious that engineers at
| Anthropic clearly thought of already)"
| michaelmarkell wrote:
| Not really. Take for example:
|
| item, date, price
|
| abc, 01/01/2023, $30
|
| cde, 02/01/2023, $40
|
| ... 100k rows ...
|
| subtotal. $1000
|
| def, 03/01/2023, $20
|
| "Hey Claude, what's the total from this file? > grep for
| headers > "Ah, I see column 3 is the price value" >
| SUM(C2:C) -> $2020 > "Great! I found your total!"
|
| If you can find me an example of tech that can solve this
| at scale on large, diverse Excel formats, then I'll
| concede, but I haven't found something actually trustworthy
| for important data sets
| stevenhuang wrote:
| That's a basic tool call that current models already can
| do well. All the sql query generation LLMs can do this
| for example.
| sunnybeetroot wrote:
| So more or less like what AI has been doing for the last couple
| of years when it comes to writing code?
| pdyc wrote:
| I have just launched a product (easyanalytica.com) to create
| dashboards from spreadsheets, and Excel is on my to-do list of
| formats to be supported. However, I'm having second thoughts.
| Although, from the description, it seems like it would be
| more helpful on the modeling side than the presentation side.
| I guess I'll have to wait until it's publicly available.
| sunnybeetroot wrote:
| Why second thoughts?
| pdyc wrote:
| Everyone will use Claude if they support it, so why would
| they use my product? I will have to find some other angle to
| differentiate.
| causal wrote:
| Seems everyone is speculating features instead of just reading
| TFA which does in fact list features:
|
| - Get answers about any cell in seconds: Navigate complex models
| instantly. Ask Claude about specific formulas, entire worksheets,
| or calculation flows across tabs. Every explanation includes
| cell-level citations so you can verify the logic.
|
| - Test scenarios without breaking formulas: Update assumptions
| across your entire model while preserving all dependencies. Test
| different scenarios quickly--Claude highlights every change with
| explanations for full transparency.
|
| - Debug and fix errors: Trace #REF!, #VALUE!, and circular
| reference errors to their source in seconds. Claude explains what
| went wrong and how to fix it without disrupting the rest of your
| model.
|
| - Build models or fill existing templates: Create draft financial
| models from scratch based on your requirements. Or populate
| existing templates with fresh data while maintaining all formulas
| and structure.
| Balgair wrote:
| If this can reliably deal with the REF, VALUE, and NA problems,
| it'll be worth it for that alone.
|
| Oh and deal with dates before 1900.
|
| Excel is a gift from God if you stay in its lane. If you ever
| so slightly deviate, not even the Devil can help you.
|
| But maybe, juuuuust maybe, AI can?
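[Editor's note] The pre-1900 complaint is real: Excel's default 1900 date system cannot represent earlier dates, and it deliberately preserves Lotus 1-2-3's bug of treating 1900 as a leap year, so serial number 60 maps to the nonexistent February 29, 1900. A sketch of the conversion that accounts for the phantom day:

```python
from datetime import date, timedelta


def excel_serial_to_date(serial: int) -> date:
    """Convert an Excel 1900-system serial number to a date.

    Excel's 1900 system wrongly treats 1900 as a leap year, so
    serial 60 is the nonexistent Feb 29, 1900, and every serial
    above 60 must be shifted back by one day.
    """
    if serial < 1:
        raise ValueError("1900-system serials start at 1 (Jan 1, 1900)")
    if serial == 60:
        raise ValueError("serial 60 is the phantom Feb 29, 1900")
    offset = serial if serial < 60 else serial - 1
    return date(1899, 12, 31) + timedelta(days=offset)


print(excel_serial_to_date(1))   # 1900-01-01
print(excel_serial_to_date(61))  # 1900-03-01
```

Dates before January 1, 1900 simply have no serial at all, which is why genealogy and historical datasets end up storing dates as text and losing all date arithmetic.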
| libraryatnight wrote:
| "not even Devil can help you.
|
| But maybe, juuuuust maybe, AI can?"
|
| Bold assumption that the devil and AI aren't aligned ;)
| lavishlibra0810 wrote:
| The greatest trick the devil ever pulled was convincing the
| world he didn't exist
| ACCount37 wrote:
| Nah, the greatest trick the devil ever pulled was
| convincing the world that Machine Learning is a
| legitimate field of study, and not just thinly veiled
| demon summoning.
| globular-toast wrote:
| I feel similarly about MS Word. It can actually produce
| decent documents _if_ you learn how to use it, in
| particular if you use styles consistently and never, ever
| touch the bold, italic, colours etc. (outside of defining
| said styles, although the defaults are probably all most
| people need). Unfortunately I think the appeal of Word is
| that you don't have to learn this and it will just do what
| you want. Is AI the panacea that will both do what you want
| _and_ give you the right answers every time?
| beefnugs wrote:
| Also, people complaining about AI inaccuracy are just
| technical people who like precision. The vast majority of the
| world is people who don't give a damn about accuracy or even
| correctness. They just want to appear not completely useless
| to people who could potentially affect their salary.
| lionkor wrote:
| "just" technical people who like precision are the reason we
| are here, typing this, and why lots of parts of our world is
| pretty cool and comfortable. I wouldn't say that's useless
| and "just" some people when it clearly is generating
| unmistakable value
| Yizahi wrote:
| I can pretty reliably guess that approximately 100% of all
| companies in the world use Excel tables for financial data
| and for processes. Ok, that was a joke. It's actually 99.99%
| of all companies. One would think that financial data,
| inventory, and stuff like that should be damn precise. No?
| fragmede wrote:
| How precise do they really need to be? If there's 3 of a
| widget on the shelf in the factory, and the factory uses
| 1000 per day, is it crucial to know that there's 3 of them,
| and not 0 or 50? Either way, the factory ain't running
| today or tomorrow or until more of those things come in.
| Similarly, what's $3 missing from an internal spreadsheet
| when the company costs $5,000 an hour to operate (or $10
| million a year)? Obviously errors accumulate, so the books
| need to be reconciled, but all that stuff only needs to be
| sufficiently directionally accurate, with enough precision.
| If precision is free, then sure, but if a good enough job
| is cheaper? We all make that call every day.
| Yizahi wrote:
| If you have 2000 hectares of land you need to buy the
| exact amount of seeds to sow them. If you buy less you
| are losing money, if you buy more it is useless and you
| are losing money. If you have trucks or other machinery
| in the company you need to report exact amount of fuel
| needed/used, or either they won't run or you lose money
| on machinery missing fuel. If you need to tax a company,
| it is pretty important if there are 100 tons of steel
| used or 1000 tons. Or if the company has 5 factories to
| be taxed or 15. Etc.
|
| You are anthropomorphizing LLM programs. You assume that
| if a number in a spreadsheet is big, then the program can
| somehow understand that it is a big number, and that if
| it makes an error it will be a small-order error like a
| human would make. Human process: "hmm, here is a
| calculation where we divide our imports by the number of
| subsidiaries, let me estimate this in my head; ok, looks
| like 7320" (the actual correct answer was 7340, but the
| human made a small, typical mistake in the math). An LLM
| arrives at each particular character in a row through
| probabilities and randomization. So it may be 7340, or it
| may be 8745632, or 1320, or whatever. There is a comment
| at the top here, from another user, who queried an LLM to
| change a value in a document and it did so correctly. But
| at the same time it replaced a bank account number with a
| different bank account number. Because to the LLM it is
| the same: sixteen digits in a field, or another sixteen
| digits in the field, it is all the same. Because it is
| not AI and doesn't "understand" what it does.
| fragmede wrote:
| If you have 2000 hectares of land, there is no way you're
| buying the exact right amount of seeds. You overbuy seeds
| by as little as you can, but seeds get loaded via tractor
| bucket, which is fairly messy. You're going to lose a
| decent amount of seeds. Thus, a pound or kilo of seeds, or
| < 1% in the scheme of things, isn't even going to be
| noticed, much less cause the demise of your farm.
|
| For fuel, similarly, you're going to lose milliliters to
| evaporation on a hot day, so being off by a few ml isn't
| material.
|
| If you tax a company, fine, sure, the company is going to
| want it to be right, but 1 or two tons in a 10,000 ton
| order is again, < 1%. There is some threshold below which
| precision is extra unnecessary work, though if you have
| problems with thieves and corruption, you're going to
| want additional precision that isn't necessary elsewhere.
|
| As to where in my comment I'm anthropomorphizing LLMs,
| you're going to have to point out where I did that, as
| the word LLM doesn't appear anywhere in my comment. It
| feels like you're projecting claims my comment does not
| make; it is LLM-neutral and merely points out that 100%
| exact precision doesn't come without a cost.
| serf wrote:
| Anthropic is in a weird place for me right now. They're growing
| fast, creating little projects that I'd love to try, but their
| customer service was so bad for me as a Max subscriber that I
| set an ethical boundary for myself to avoid their services
| until it appears that they care about their customers at all.
|
| I keep searching for a sign, but everyone I talk to has horror
| stories. It sucks as a technologist that just wants to play with
| the thing; oh well.
| cmrdporcupine wrote:
| Best way to think of it is this: Right now you are not the
| customer. Investors are.
|
| The money people pay in monthly fees to Anthropic for even the
| top Max sub likely doesn't come close to covering the energy &
| infrastructure costs of running the system.
|
| You can prove this to yourself by just trying to cost out what
| it takes to build the hardware capable of running a model of
| this size at this speed and running it locally. It's tens of
| thousands of dollars just to build the hardware, not even
| considering the energy bills.
|
| So I imagine the goal right now is to pull in a mass audience
| and prove the model, to get people hooked, to get management
| and talent at software firms pushing these tools.
|
| And I guess there's some in management and the investment
| community that thinks this will come with huge labour cost
| reductions but I think they may be dreaming.
|
| ... And then.. I guess... jack the price up? Or wait for
| Moore's Law?
|
| So it's not a surprise to me they're not jumping to try and
| service individual subscribers who are paying probably a
| fraction of what it costs them to the run the service.
|
| I dunno, I got sick of paying the price for Max and I now use
| the Claude Code tool but redirect it to DeepSeek's API and use
| their (inferior but still tolerable) model via API. It's
| probably 1/4 the cost for about 3/4 the product. It's actually
| amazing how much of the intelligence is built into the tool
| itself instead of just the model. It's often incredibly hard
| to tell the difference between DeepSeek output and what I got
| from Sonnet 4 or Sonnet 4.5.
| kridsdale1 wrote:
| You are bang on.
|
| Every AI company right now (except Google, Meta, and Microsoft)
| has their valuations based on the expectation of a future
| monopoly on AGI. None of their business models today or in
| the foreseeable horizon are even positive let alone world-
| dominating. The continued funding rounds are all apparently
| based on expectation of becoming the sole player.
|
| The continuing advancement of open source / open weights
| models keeps me from being a believer.
|
| I've placed my bet and feel secure where it is.
| Wowfunhappy wrote:
| I've been playing around with local LLMs in Ollama, just for
| fun. I have an RTX 4080 Super, a Ryzen 5950X with 32 threads,
| and 64 GB of system memory. A very good computer, but
| decidedly consumer-level hardware.
|
| I have primarily been using the 120b gpt-oss model. It's
| definitely worse than Claude and GPT-5, but not by, like, an
| order of magnitude or anything. It's also clearly better than
| ChatGPT was when it first came out. Text generates a bit
| slowly, but it's perfectly usable.
|
| So it doesn't seem so unreasonable to me that costs could
| come down in a few years?
| cmrdporcupine wrote:
| It's possible. Systems like the AMD AI Max 395+ with 128GB
| RAM thing get close to being able to run good coding models
| at reasonable speeds from what I hear. But, no, I'm given
| to understand they couldn't run e.e. the DeepSeek 3.2 model
| full size because there simply isn't enough GPU RAM still.
|
| To build out a system that can, I'd imagine you're looking
| at what... $20k, $30k? And then that's a machine that is
| basically _for one customer_ -- meanwhile a Claude Code Max
| or Codex Pro is $200 USD a month.
|
| The math doesn't add up.
|
| And once it _does_ add up, and these models can be
| reasonably run on lower-end hardware... then the moat
| ceases to exist and there'll be dozens of providers. So
| the valuation of e.g. Anthropic makes little sense to me.
|
| Like I said, I'm using the Claude Code tool/front-end
| pointing at the pay-per-use DeepSeek platform API; it
| costs a fraction of what Anthropic is charging, and feels
| to me like the quality is about 80% there... So ...
| Wowfunhappy wrote:
| > But, no, I'm given to understand they couldn't run e.g.
| the DeepSeek 3.2 model full size because there simply
| isn't enough GPU RAM still.
|
| My RTX 4080 only has 16 GB of VRAM, and gpt-oss 120b is
| 4x that size. It looks like Ollama is actually running
| ~80% of the model off of the CPU. I was made to believe
| this would be unbearably slow, but it's really not, at
| least with my CPU.
|
| I can't run the full sized DeepSeek model because I don't
| have enough system memory. That would be relatively easy
| to rectify.
|
| > And once it does add up, and these models can be
| reasonable run on lower end hardware... then the moat
| ceases to exist and there'll be dozens of providers.
|
| This is a good point and perhaps the bigger problem.
| consumer451 wrote:
| > I keep searching for a sign, but everyone I talk to has
| horror stories. It sucks as a technologist that just wants to
| play with the thing; oh well.
|
| The reason that Claude Code doesn't have an IDE is because ~"we
| think the IDE will be obsolete in a year, so it seemed like a
| waste of time to create one."
|
| Noam Shazeer said on a Dwarkesh podcast that he stopped
| cleaning his garage, because a robot will be able to do it very
| soon.
|
| If you are operating under the beliefs these folks have, then
| things like IDEs, cleaning up, and customer service are stupid
| annoyances that will become obsolete very soon.
|
| _To be clear, I have huge respect for everyone mentioned
| above, especially Noam._
| chairmansteve wrote:
| "Noam Shazeer said on a Dwarkesh podcast that he stopped
| cleaning his garage, because a robot will be able to do it
| very soon".
|
| How much is the robot going to cost in a year? 100k? 200k?
| Not mass market pricing for sure.
|
| Meanwhile, today he could pay someone $1000 to clean his
| garage.
| consumer451 wrote:
| I would do it for free, just to answer the question of what
| does a genius of his caliber have in his garage? Probably
| the same stuff most people do, but it would still be
| interesting.
|
| I don't think the point was about having a clean space, it
| was in response to a question along the lines of: when do
| you think we will achieve AGI?
| y-curious wrote:
| Trust me, I'm a genius of his caliber. Want to clean my
| garage? You free next week?
| Thrymr wrote:
| > Noam Shazeer said on a Dwarkesh podcast that he stopped
| cleaning his garage, because a robot will be able to do it
| very soon.
|
| We all come up with excuses for why we haven't done a chore,
| but some of us need to sound a bit more plausible to other
| members of the household than that.
|
| It would get about the same reaction as "I'm not going to
| wash the dishes tonight, the rapture is tomorrow."
| consumer451 wrote:
| I want to make it very clear that this was a lighthearted
| response from Noam to the "AGI timeline" question.
|
| Noam does not do a lot of interviews, and I really hope
| that stuff like my dumb comment does not prevent him from
| doing more in the future. We could all learn a lot from
| him. I am not sure that everyone understands everything
| that this man has given us.
| redhale wrote:
| What happened? I'm a Max subscriber and I'd like to know what
| to look out for!
| informal007 wrote:
| Bad customer service comes from low priority. I think
| Anthropic prioritizes new growth over feedback from a small
| number of customers; that's why they publish new products
| and features so frequently. There are so many potential
| opportunities for them to focus on.
| Yizahi wrote:
| Customer service at B2C companies can only go downhill or stay
| level. See Google, Apple, Microsoft etc. At B2B it maaaybe can
| improve, but only when a ten times bigger customer strongarms a
| company into doing it.
| empiko wrote:
| There is this homogenization happening in AI. No matter what
| their original mission was, all the AI companies are now
| building AI-powered gimmicks hoping to stumble upon something
| profitable. The investors are waiting...
| vjvjvjvjghv wrote:
| Hope it's better than what MS is currently shipping as AI.
| Every time I try to do something, the response is "sorry, I
| can't do this".
| smithkl42 wrote:
| Copilot is getting better - I'm getting fewer of those than I
| used to - but it's still significantly more stupid than other
| agents, even when in theory it's using the same model.
| throawayonthe wrote:
| R.I.P. global economy
| fudged71 wrote:
| Interesting their X post mentions "pre-built Agent Skills" but
| it's not on the webpage. I wonder if they will give you the
| ability to edit/add/delete Skills, that would be phenomenal.
|
| Edit: found it on their other blog post
| https://www.anthropic.com/news/advancing-claude-for-financia...
| luccasiau wrote:
| You can add and customize skills in claude.ai and other
| surfaces
| Havoc wrote:
| They can try, but doubt anyone serious will adopt it.
|
| Tried integrating ChatGPT into my finance job to see how far I
| could get. Mega yikes... millions of dollars of hallucinated
| mistakes.
|
| Worse, you don't have the same tight feedback loop you've got
| in programming that'll tell you when something is wrong:
| compile errors, unit tests, etc. You basically need to walk through
| everything it did to figure out what's real and what's
| hallucinations. Basically fails silently. If they roll that out
| at scale in the financial system...interesting times ahead.
|
| Still presumably there is something around spreadsheets it'll be
| able to do - the spreadsheet equivalent of boilerplate code
| whatever that may be
| AppleBananaPie wrote:
| I'm bad with spreadsheets, so maybe this is trivial, but
| having an LLM tell me how to connect my sheet to whatever
| data I'm using at the moment (it comes up with a link, a SQL
| query, or both) has let me quickly pull in data where I'd
| normally eyeball it and move on, or at worst do it partially
| manually if it's really important.
|
| It's like one off scripts in a sense? I'm not doing complex
| formulas I just need to know how I can pull data into a sheet
| and then I'll bucketize or graph it myself.
|
| Again probably because I'm not the most adept user but it has
| definitely been a positive use case for me.
|
| I suspect my use case is pretty boilerplatey :)
| Havoc wrote:
| Good to know that it works well for that.
|
| >I'm not doing complex formulas
|
| Neither am I frankly. Finance stuff can get conceptually
| complicated even with simple addition & multiplication
| though. e.g. I deal with a lot of offshore stuff, so the
| average spreadsheet is a mix of currencies, jurisdictions and
| companies that are interlinked. I could probably talk you
| through it high level in an hour with a pen & paper, but the
| LLMs just can't see the forest for all the trees in the raw
| sheet.
| Culonavirus wrote:
| AI slop eaters will still eat it up and ask for seconds. Pigs
| in oats seeing dollar signs.
| humanfromearth9 wrote:
| This could be invaluable for reverse engineering complex
| workbooks with multiple data sources and hundreds or thousands of
| formulas.
| pumnikol wrote:
| If it has a concept of data sources and can digest them, sure.
| Anecdotally, most issues with Excel at my job are caused by
| data sources being renamed, moved or reformatted, by broken
| logins, or by insufficient access rights.
| keernan wrote:
| If AI turns out to be the powerhouse it is claimed to be, its
| impact will be corporations replacing their dependence on
| 'Excel projects' created by self-taught assistants to
| department managers.
| travisgriggs wrote:
| As I was reading through the post, and the comments here, and
| pondering my own many hours with these tools, I was suddenly
| reminded of one of my favorite studio C sketches: An Unfortunate
| Fortune
|
| https://www.youtube.com/watch?v=SF-psoWdSpo
|
| Curious, if others see the connection. :D
| davidpolberger wrote:
| I'm a co-founder of Calcapp, an app builder for formula-driven
| apps using Excel-like formulas. I spent a couple of days using
| Claude Code to build 20 new templates for us, and I was blown
| away. It was able to one-shot most apps, generating competent,
| intricate apps from having looked at a sample JSON file I put
| together. I briefly told it about extensions we had made to Excel
| functions (including lambdas for FILTER, named sort type enums
| for XMATCH, etc), and it picked those up immediately.
|
| At one point, it generated a verbose formula and mentioned,
| off-handedly, that it would have been prettier had Calcapp
| supported LET. "It does!", I replied, "and as an extension,
| you can use := instead of , to separate names and values!"
| It promptly rewrote it using our extended syntax, producing
| a sleek formula.
|
| These templates were for various verticals, like real estate,
| financial planning and retail, and I would have been hard-pressed
| to produce them without Claude's domain knowledge. And I did it
| in a weekend! Well, "we" did it in a weekend.
|
| So this development doesn't really surprise me. I'm sure that
| Claude will be right at home in Excel, and I have already thought
| about how great it would be if Claude Code found a permanent home
| in our app designer. I'm concerned about the cost, though, so I'm
| holding off for now. But it does seem unfair that I get to use
| Claude to write apps with Calcapp, while our customers don't get
| that privilege.
|
| (I wrote more about integrating Claude Code here:
| https://news.ycombinator.com/item?id=45662229)
| unshavedyak wrote:
| Dumb question, but is this Claude for Excel the.. app? The
| webapp? Does it work on Google sheets? etc
|
| There are quite a few spreadsheet apps out there, just curious
| what their implementation is or how it's implemented to work with
| multiple apps.
|
| I always find Excel (and the Office ecosystem) confusing heh.
| p_ing wrote:
| Modern Excel add-ins work in desktop Windows, macOS, and the
| web. They're just a bit of XML that Excel reads to call
| whatever web endpoint is defined in the XML.
| rahimnathwani wrote:
| How is this different from the existing Claude skill, that uses a
| prompt and pandas to edit an Excel file?
|
| https://github.com/anthropics/skills/blob/main/document-skil...
| shooker435 wrote:
| This isn't built for Excel users who use Github and Claude
| Skills, it's built for Excel users who would run away from Git
| commands.
| rahimnathwani wrote:
| The Claude skill I linked to is built into the Claude desktop
| client. You just attach an Excel file to your chat and ask
| away.
|
| I linked to the skill prompt just to more clearly explain the
| approach that's currently available to all Claude users.
|
| It requires zero familiarity with git or command line.
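A rough illustration of the approach behind that skill: the prompt yields a short pandas script that edits the attached workbook. The file name, column names, and requested edit below are all hypothetical.

```python
# Sketch of prompt-driven workbook editing via pandas.
# "sales.xlsx" and its columns are made-up examples.
import pandas as pd

# Stand-in for the user's attached file.
pd.DataFrame({"region": ["EU", "US"], "revenue": [100, 250]}).to_excel(
    "sales.xlsx", index=False)

# e.g. the user asks: "add a column with revenue doubled"
df = pd.read_excel("sales.xlsx")
df["doubled"] = df["revenue"] * 2
df.to_excel("sales.xlsx", index=False)

print(pd.read_excel("sales.xlsx")["doubled"].tolist())  # [200, 500]
```

The point is that the model never touches binary xlsx directly; it writes code against a well-tested library and the file round-trips through it.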
| mamonster wrote:
| On the one hand, most financial companies have a lot of processes
| in Excel that could be made better with something like Claude.
|
| On the other hand: banking secrecy laws + customer-identifying
| data + AI tool = no bueno.
| grim_io wrote:
| If this works well and reliably, it might not kill
| programming as such, but it might put a lot of small
| businesses that do custom software for other small
| businesses out of work.
|
| The HN bubble might not realize the implications.
| surume wrote:
| Checkmate, Altman
| kaspermarstal wrote:
| So cool, I hope they pull it off. So many people use Excel.
| Although, I always thought the power of AI in Excel would come
| from the ability to use AI _as_ a formula. For example,
| =PROMPT("Classify user feedback as positive, neutral or
| negative", A1). This would enable normal people (non-programmers)
| to fire off thousands of prompts at once and automate workflows
| like programmers do (disclaimer: I am the author of Cellm that
| does exactly this). Combined with Excel's built-in functions for
| deterministic work, Claude could really kill the whole copy-
| pasting data in and out of chat windows for bulk-processing data.
| starik36 wrote:
| I can't wait until someone does this, then autofills 50k rows
| down, then gets a $50k bill for all the tokens.
|
| Reminds me of when our CIO insisted on moving to the cloud
| (back when AWS was just getting started) and then was super
| pissed when he got a $60k bill because no one knew to shutdown
| their VMs when leaving for the day.
| kaspermarstal wrote:
| If someone is processing 50k rows, that means they found real
| value and the UX is working. That's the whole point.
|
| Also, 50k rows wouldn't cost $50k. More like $100 with Sonnet
| 4.5 pricing and typical numbers of input/output tokens.
| Imagine the time needed to go through 50k rows manually; the
| math doesn't really work for a horror story.
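The arithmetic behind that estimate is easy to check. The per-row token counts below are assumptions; $3 and $15 per million input/output tokens are Sonnet 4.5's published list prices.

```python
# Back-of-envelope check of the "~$100, not $50k" claim, using
# assumed per-row token counts and Sonnet 4.5 list pricing
# ($3 / million input tokens, $15 / million output tokens).
rows = 50_000
input_tokens_per_row = 500    # short prompt + one cell of data (assumption)
output_tokens_per_row = 50    # a one-word-ish classification (assumption)

input_cost = rows * input_tokens_per_row / 1_000_000 * 3.00
output_cost = rows * output_tokens_per_row / 1_000_000 * 15.00
total = input_cost + output_cost
print(f"${total:.2f}")  # $112.50 under these assumptions
```

Even tripling the assumed token counts keeps the bill in the hundreds of dollars, not tens of thousands.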
| NotMichaelBay wrote:
| You may already be aware but Microsoft recently released a
| COPILOT() function that does this:
| https://support.microsoft.com/en-us/office/copilot-function-...
| kaspermarstal wrote:
| Thanks, appreciate it. Indeed, and Anthropic did something
| similar for Google Sheets a year ago. I am dying to know why
| they decided this should not be part of their Excel effort.
| They obviously put a lot of work and thought into Claude for
| Excel, so it must be intentional.
|
| Anyone from Anthropic here who would like to elaborate?
| btown wrote:
| From the signup form mentioning Private Equity / Venture Capital,
| Hedge Fund, Investment Banking... this seems squarely aimed at
| financial modeling. Which is really, really cool.
|
| I've worked alongside sell-side investment bankers in a prior
| startup, and so much of the work is in taking a messy set of
| statements from a company, understanding the underlying
| assumptions, and building, and rebuilding, and rebuilding,
| 3-statement models that not only adhere to standard conventions
| (perhaps best introed by
| https://www.wallstreetprep.com/knowledge/build-integrated-3-... )
| but also are highly customized for different assumptions that can
| range from seasonality to sensitivity to creative deal
| structures.
|
| It is quite common for people to pull many, many all-nighters to
| try to tweak these models in response to a senior banker or a
| client having an idea! And one might argue there are way too many
| similar-looking numbers to keep a human banker from
| "hallucinating," much less an LLM.
|
| But fundamentally, a 3-statement model and all its build-sheets
| are a dependency graph with loosely connected human-readable
| labels, and that means you can write tools that let an LLM crawl
| that dependency graph in a reliable and semantically meaningful
| way. And _that_ lets you build really cool things, really fast.
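A toy sketch of that dependency-graph idea: cell references can be pulled out of formula text with a regex, which is enough to let a tool (or an LLM tool-call) walk a cell's precedents. The miniature "sheet" and its cell contents below are invented for illustration.

```python
# Toy dependency-graph walk over spreadsheet formulas: extract
# cell references from each formula and recurse through
# precedents. The mini "sheet" stands in for a real model.
import re

sheet = {
    "B1": "=B2+B3",      # toy "summary" cell
    "B2": "=SUM(C1:C2)",
    "B3": "-40",
    "C1": "100",
    "C2": "25",
}

REF = re.compile(r"\b([A-Z]{1,3}[0-9]+)\b")

def precedents(cell: str) -> list[str]:
    """Direct cell references used by a cell's formula (ranges expanded naively)."""
    formula = sheet.get(cell, "")
    if not formula.startswith("="):
        return []
    return [r for r in REF.findall(formula) if r in sheet]

def trace(cell: str, depth: int = 0) -> None:
    """Print the precedent tree rooted at a cell."""
    print("  " * depth + f"{cell}: {sheet[cell]}")
    for ref in precedents(cell):
        trace(ref, depth + 1)

trace("B1")
```

A real implementation would parse ranges and cross-sheet references properly, but the graph structure is the same, and it is exactly the kind of deterministic tool an LLM can query instead of "reading" the whole grid.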
|
| I'm of the opinion that giving small companies the ability to
| present their finances to investors, the same way Fortune 500
| companies hire _armies_ of bankers to do, is vital to a healthy
| economy, and to giving Main Street the best possible chance to
| succeed and grow. This is a massive step in the right direction.
| JonChesterfield wrote:
| Presenting your finances to investors via a tool designed for
| generation of plausible looking data is fraud.
| ceh123 wrote:
| Presenting false data to investors is fraud, doesn't matter
| how it was generated. In fact, humans are quite good at
| "generating plausible looking data", doesn't mean human
| generated spreadsheets are fraud.
|
| On the other hand, presenting truthful data to investors is
| distinctly not fraud, and this again does not depend on the
| generation method.
| alfalfasprout wrote:
| If humans "generate plausible looking data" despite any
| processes to ensure data quality they've likely engaged in
| willful fraud.
|
| An LLM doing so needn't even be willful from the author's
| part. We're going to see issues with forecasts/slide decks
| full of inaccuracies that are hard to review.
| ceh123 wrote:
| I think my main point is just because an LLM can lie,
| doesn't necessarily mean an LLM generated slide is fraud.
| It could very easily be correct and verified/certified by
| the accountant and not fraud. Just cuz the text was
| generated first by an LLM doesn't mean fraud.
|
| That being said, oh for sure this will lead to more
| incidental fraud (and deliberate fraud) and I'm sure it
| already has. Would be curious to see the prevalence of
| em-dash's in 10k's over the years.
| lionkor wrote:
| > doesn't matter how it was generated
|
| is there precedent for this supposed ruling?
| ceh123 wrote:
| US v Simon 1969, see [0] for a review.
|
| Establishes that accountants who certify financials are
| liable if they are incorrect. In particular, if they have
| a reason to believe they might not be accurate and they
| certify anyway they are liable. And at this stage of
| development it's pretty clear that you need to double
| check LLM generated numbers.
|
| Obviously no clue if this would hold up with today's
| court, but I also wasn't making a legal statement before.
| I'm not a lawyer and I'm not trying to pretend to be one.
|
| [0] https://scholarship.law.stjohns.edu/cgi/viewcontent.c
| gi?arti...
| lionkor wrote:
| Fascinating thank you for the link
| Kydlaw wrote:
| You might have accidentally described what accounting is.
| btown wrote:
| Completely understand the sentiment, but it doesn't apply
| here, because what's being generated are formulas!
|
| Standardized 3-statement models in Excel are designed to be
| auditable, with or without AI, because (to only slightly
| simplify) every cell is either a blue input (which must come
| from standard exports of the company's accounting books,
| other auditable inventory/CRM/etc. data, or a visible
| hardcoded constant), or a black formula that cannot have
| hardcoded values, and must be simple.
|
| If every buyer can audit, with tools like this, that the
| formulas match the verbal semantics of the model, there's
| even less incentive than there is now to fudge the formula
| level. (And with Wall Street conventions, there's nowhere to
| hide a prompt injection, because you're supposed to keep
| every formula to only a few characters, and use breakout
| "build" rows that can themselves be visually audited.)
|
| And sure, you could conceivably use any AI tool to generate a
| plausible list of numbers at the input level, but that was
| equally easy, and equally dependent on context to be
| fraudulent or not, ever since that famous Excel 1990 elevator
| commercial: https://www.youtube.com/watch?v=kOO31qFmi9A&t=61s
|
| At the end of the day, the difference between "they want to
| see this growth, let's fudge it" and "they want to see this
| growth, let's calculate the exact metrics we need to hit to
| make that happen, and be transparent about how that's
| feasible" has always been a matter of trust, not technology.
|
| Tech like this means that people who want to do things the
| right way can do it as quickly as people who wanted to play
| loose with the numbers, and that's an equalizer that's on the
| right side of history.
| ed_elliott_asc wrote:
| I use excel but not for financial modelling, I'll use it
| mainecoder wrote:
| Yeah now tell the Auditors that the financial spreadsheet we have
| here has AI touching it left and right. "I did not cook the books
| I promise it is the AI that made our financials seem better than
| they actually are trust me bro!", said Joe from Accounting.
| JonChesterfield wrote:
| The thing really missing from multi-megabyte Excel sheets of
| business-critical carnage was a non-deterministic rewrite tool.
| It'll interact excitingly with the industry standard of no
| automated testing whatsoever.
|
| I 100% believe generative AI can change a spreadsheet. Turn the
| xlsx into text, mutate that, turn it back into an xlsx, throw it
| away if it didn't parse at all. The result will look pretty
| similar to the original too, since spreadsheets are great at
| showing immediately local context and nothing else.
|
| Also, we've done a pretty good job of training people that
| chatgpt works great, so there's good reason for them to expect
| claude for excel to work great too.
|
| I'd really like the results of this to be considered negligence
| with non-survivable fines for the reckless stupidity, but more
| likely, it'll be seen as an act of god. Like all the other broken
| shit in the IT world.
| patife wrote:
| Damn, Rows is at least 3x better.
| supermalvo wrote:
| 100%
| gwbas1c wrote:
| I wonder if this will be more/less useful than what we have with
| AI in software development.
|
| There's a lot less to understand than a whole codebase.
|
| I don't do spreadsheets very often, but I can empathize with
| tracking down "Trace #REF!, #VALUE!, and circular reference
| errors to their source in seconds." I once hit something like
| that, and I found it a lot harder to trace than a typical
| compiler error.
| wonderwonder wrote:
| Been working with Claude Code lately and been pretty impressed.
| If this works as well could be a nice add on. Its probably a
| smart market to enter as Excel is essentially everywhere.
|
| Just like Claude Code allows 1 dev to potentially do the work of
| 2 or 3, I could see this allowing 1 accountant or operations
| person to do the work of 2 or 3. Financial savings but human cost
| NumberCruncher wrote:
| At first glance this seems to be a very bad idea. But re-
| reading this:
|
| > Get answers about any cell in seconds: Navigate complex models
| instantly. Ask Claude about specific formulas, entire worksheets,
| or calculation flows across tabs. Every explanation includes
| cell-level citations so you can verify the logic.
|
| this might just be an excellent tool for refactoring Excel sheets
| into something more robust and maintainable. And making a bunch
| of suits redundant.
| lionkor wrote:
| There's already a language for this, or multiple, that isn't
| English. Not having to use this language is NOT going to make
| anything better.
|
| It will, however, make people resort more quickly to "I guess
| it's just not possible if Claude can't figure it out".
| teddyh wrote:
| "Copilot in Excel is a global financial crisis waiting to
| happen."
|
| -- Zack Korman,
| <https://x.com/ZackKorman/status/1974828240679166396>
| ada1981 wrote:
| Can we get it in Sheets?
| frankacter wrote:
| For Sheets, Gemini already exists natively.
|
| Alternatively, Perplexity Comet browser (or OpenAI Atlas) would
| presumably provide sidebar functionality to act within your
| spreadsheets.
| alex43578 wrote:
| On a related note, has anyone found a good local LLM option for
| working with Excel files?
|
| Here's my use case: I have a set of responses from a survey and
| want to perform sentiment analysis on them, classify them, etc.
| Ideally, I'd like to feed them one at a time to a local LLM with
| a prompt like: "Classify this survey response as positive,
| negative, or off-topic...etc".
|
| If I dump the whole spreadsheet into ChatGPT, I found that
| because of the context window, it can get "lazy"; while with a
| local LLM, I could just literally prompt it one row at a time to
| accomplish my goal, even if it takes a little longer in terms of
| GPU and wall-clock time.
|
| However, I can't find _anything_ that works off the shelf like
| this. It seems like a prime use case for local models.
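A minimal sketch of that one-row-at-a-time loop against a local model, assuming a default Ollama server at localhost:11434 and a hypothetical model name; only the prompt-building part runs without a server.

```python
# Row-at-a-time classification against a local model via Ollama's
# HTTP API, so no single request ever overflows the context window.
# The endpoint is Ollama's default; the model name is an assumption.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_prompt(text: str) -> str:
    return ("Classify this survey response as positive, negative, "
            f"or off-topic. Reply with one word.\n\n{text}")

def classify(text: str, model: str = "gpt-oss:20b") -> str:
    """POST one row's prompt to the local model and return its label."""
    payload = {"model": model, "prompt": build_prompt(text), "stream": False}
    req = urllib.request.Request(
        OLLAMA_URL, data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"].strip()

# One row at a time, e.g.:
#   labels = [classify(row) for row in survey_rows]
print(build_prompt("Great product!"))
```

Reading the rows out of the spreadsheet (csv module, pandas, etc.) and writing labels back is the easy half; the loop above is the part no off-the-shelf tool seems to package up.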
| santadays wrote:
| Don't know about Excel, but for Google Sheets you can ask
| ChatGPT to write you an Apps Script custom function, e.g.
| CALL_OPENAI. Then you can pass variables into it:
| =CALL_OPENAI("Classify this survey response as positive,
| negative, or off-topic: "&A1)
| thisguy47 wrote:
| Sheets also has an `AI` formula now that you can use to
| invoke Gemini models directly.
| santadays wrote:
| When I tried the Gemini/AI formula it didn't work very
| well. gpt-5 mini or nano are cheap and generally do what
| you want if you are asking something straightforward about
| a piece of content you give them. You can also give them a
| JSON schema to make the results more deterministic.
| sexy_seedbox wrote:
| Cellm + Ollama?
|
| https://docs.getcellm.com/models/local-models
| alex43578 wrote:
| That looks like a great fit! Not sure how I missed it, but I
| appreciate the link.
| dosnem wrote:
| Anyone understand how this could work? My mental model for llm is
| predictive text but here how can it understand cell A1 which has
| a string is the "header" for all values under it? How does it
| learn to understand table data like that?
| bonsai_spool wrote:
| > Anyone understand how this could work? My mental model for
| llm is predictive text but here how can it understand cell A1
| which has a string is the "header" for all values under it? How
| does it learn to understand table data like that?
|
| I imagine it uses the new Agent Skills features
|
| https://www.anthropic.com/news/skills
| brookst wrote:
| LLMs already understand table data. "Predictive text" is
| somewhat true but so reductive that it leads to that kind of
| misconception.
|
| HN is going to mangle this but here's a quick table:
|
|       | Type of Horse | Average Height | Typical Color   |
|       |---------------|----------------|-----------------|
|       | Arabian       | 15 hh          | Bay, Gray       |
|       | Thoroughbred  | 16 hh          | Chestnut, Bay   |
|       | Clydesdale    | 17.5 hh        | Bay with White  |
|       | Shetland Pony | 10.5 hh        | Black, Chestnut |
|
| And after a prompt "pivot the table so rows are colors":
|
|       | Typical Color  | Type of Horse                     | Average Height        |
|       |----------------|-----------------------------------|-----------------------|
|       | Bay            | Arabian, Thoroughbred, Clydesdale | 15 hh, 16 hh, 17.5 hh |
|       | Gray           | Arabian                           | 15 hh                 |
|       | Chestnut       | Thoroughbred, Shetland Pony       | 16 hh, 10.5 hh        |
|       | Bay with White | Clydesdale                        | 17.5 hh               |
|       | Black          | Shetland Pony                     | 10.5 hh               |
| flowingfocus wrote:
| Version control and meaningful diffs for .xlsx will be in high
| demand in a few months
| andyferris wrote:
| Honestly those things are well past due - if this tips the
| scales then I hope we can all benefit.
| jprd wrote:
| If Claude is going to work on the underpinning technology of
| every business in the Capitalist world, We should let Claude
| loose on the COBOL code out there too, I can't imagine anything
| going wrong.
| StarterPro wrote:
| HA!
|
| I've worked at MULTIPLE million-dollar firms whose entire
| business relies on 10 Excel workbooks created 30 years ago by
| a person who has either passed away or retired.
|
| Give AI to users who aren't intimately familiar with their
| source material, and you're asking for trouble.
|
| The undo function has a history limit.
|
| The real issue is: at what point are we going to stop chasing
| efficiency and profit at the expense of humanity?
|
| Claude and OpenAI are built on stretched truths, stolen
| creativity and what-if statements.
| voidmain0001 wrote:
| From their FAQ "Claude doesn't have advanced Excel capabilities
| including pivot tables, conditional formatting, data validation,
| data tables, macros, and VBA. We're actively working on these
| features."
| 6thbit wrote:
| Last week OpenAI hired ex-investment bankers to train a model
| to build financial models; now Anthropic is coming for Excel.
|
| Sounds like there's some sort of AI race for finance people
| and businesses?
| SteveLauC wrote:
| I really hope that all these kinds of integrations:
|
| * Claude for Chrome
| * Gemini for Chrome
| * ChatGPT Atlas
| * ...
|
| will be built on top of the ACP protocol, so that these "AI
| extensions" to everything can become standardized
| xouse wrote:
| I'm decent at excel, but not amazing. I've tried again and again
| to use LLMs including Claude to solve specific, small, well
| defined problems in excel with a 0% success rate. My experience
| so far has been if I can't do it LLMs can't either.
|
| If LLMs are a 6/10 right now at basic coding then they're a 3/10
| at excel from my experience.
| NotMichaelBay wrote:
| What kinds of problems in Excel are you trying to solve? Just
| curious as I'm also building an AI Excel addin, as a side
| project. :)
| klausnrooster wrote:
| I'd like to see it compete in the Financial Modeling World Cup,
| say in Las Vegas this December. https://excel-esports.com
| sherinjosephroy wrote:
| Pretty cool idea -- AI inside spreadsheets makes sense since most
| of our work already lives there. But I'm a bit cautious too --
| spreadsheets are messy enough, and adding probabilistic AI could
| make mistakes harder to spot. Useful if done right, risky if not.
| user3939382 wrote:
| Anthropic knows almost nothing about their own products and you
| guys know even less.
| anshulbhide wrote:
| Just spent an hour trying to figure out how to create a
| waterfall chart. ChatGPT's Python interpreter failed.
|
| If this works right, this could be a game changer.
| fragmede wrote:
| https://chatgpt.com/share/69005eec-6ee0-8009-a8d3-ebb1c30e72...
|
| took me four prompts to generate _a_ waterfall chart using
| d3.js because it didn't want to run it. Obviously, with real
| numbers rather than generated data, you'd need to check the
| results thoroughly.
| scrappyjoe wrote:
| Maybe this is how we get code versioning for Excel.
|
| Git LFS for the workbook + the following prompt:
|
| "Create a commit that explains what has changed in the workbook
| since the last commit. Be brief, but explain the change in
| business terms as well as code-change terms."
| bugsense wrote:
| That's it. 1T EV added.
| d4rkp4ttern wrote:
| Weird to see so much discussion when it's still behind a
| waitlist. And it seems aimed at "enterprise" only.
| theshrike79 wrote:
| The best thing that can come from this is unit tests for Excel.
|
| LLMs work best when they can call tools (edit the sheet) and test
| their results in a loop.
|
| It's like the "Goal Seek" feature Excel has had since forever:
| "adjust this value until this cell is X".
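As a sketch of what Goal Seek does under the hood: treat the formula cell as a function of one input and search for the input value that hits the target. This bisection version is a hypothetical illustration (Excel's actual solver is different), assuming the formula is monotonic on the search interval:

```python
# Bisection sketch of Excel's Goal Seek ("adjust this input until
# that cell equals X"), assuming f is monotonic on [lo, hi].

def goal_seek(f, target, lo, hi, tol=1e-9):
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if abs(f(mid) - target) < tol:
            return mid
        # keep the half of the interval that still brackets the target
        if (f(lo) - target) * (f(mid) - target) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0

# Which input makes "x^2 + 1" equal 10?
print(round(goal_seek(lambda x: x * x + 1, 10, 0, 100), 6))  # prints 3.0
```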
|
| Excel doesn't have any way to verify that every formula in that
| 60k-line sheet is correct and that someone hasn't, for example,
| accidentally replaced one with a static number.
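A verification pass like that is straightforward to sketch. The helper below is hypothetical (no real Excel API involved); it models a column of cells as plain strings, where a leading "=" marks a formula, and flags cells that break a run of formulas:

```python
# Hypothetical sketch: flag cells holding a static value in a column
# that is otherwise made of formulas (a likely hardcoded override).

def find_static_overrides(column):
    """Return indices of non-formula cells in a mostly-formula column.

    `column` is a list of cell contents as strings; cells starting
    with "=" are formulas. If formulas aren't the majority, nothing
    is flagged (the column is presumably meant to hold values).
    """
    is_formula = [str(c).startswith("=") for c in column]
    if sum(is_formula) <= len(column) // 2:
        return []
    return [i for i, f in enumerate(is_formula) if not f]

cells = ["=A1*2", "=A2*2", "42", "=A4*2"]  # row 3 was hardcoded
print(find_static_overrides(cells))  # prints [2]
```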
| filearts wrote:
| In a previous professional life, I did financial modelling for
| a big 4 accounting firm. We had tooling that allowed us to
| visualize contiguous ranges of identical formulas (if you
| convert formulas to R1C1 addressing, similar formulas have the
| same representation). This allowed for overrides to stick out
| like a sore thumb.
|
| I suspect similar tools could be made for Claude and other LLMs
| except that it wouldn't be plagued by the mind-numbing tedium
| of doing this sort of audit.
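The normalization trick described above can be sketched in a few lines. This is an illustrative converter, not the firm's actual tooling: it handles only simple relative A1 references (no `$` anchors, no sheet names, and it would misread function names that end in digits, such as LOG10):

```python
import re

def a1_to_r1c1(formula, row, col):
    """Rewrite relative A1 references in `formula` as R1C1 offsets,
    given the formula's own (row, col), both 1-based. Formulas that
    are "the same fill" then produce identical text, so overrides
    stand out as the odd string in a range."""
    def col_to_num(letters):
        n = 0
        for ch in letters:
            n = n * 26 + (ord(ch) - ord("A") + 1)
        return n

    def repl(m):
        dr = int(m.group(2)) - row          # row offset from this cell
        dc = col_to_num(m.group(1)) - col   # column offset from this cell
        r = f"R[{dr}]" if dr else "R"
        c = f"C[{dc}]" if dc else "C"
        return r + c

    return re.sub(r"\b([A-Z]{1,3})(\d+)\b", repl, formula)

# =A1*2 in B1 and =A2*2 in B2 normalize to the same R1C1 text:
print(a1_to_r1c1("=A1*2", 1, 2))  # prints =RC[-1]*2
print(a1_to_r1c1("=A2*2", 2, 2))  # prints =RC[-1]*2
```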
| hufdr wrote:
| AI can definitely save time, but sometimes it hides the real
| problems. Most spreadsheet issues aren't math errors; they're
| logic messes. Claude can fix your sheet, but it can't fix your
| company culture.
___________________________________________________________________
(page generated 2025-10-28 23:02 UTC)