[HN Gopher] Three kinds of AI products work
___________________________________________________________________
Three kinds of AI products work
Author : emschwartz
Score : 106 points
Date : 2025-11-16 16:56 UTC (6 hours ago)
(HTM) web link (www.seangoedecke.com)
(TXT) w3m dump (www.seangoedecke.com)
| 8organicbits wrote:
| > It's easy to verify changes by running tests or checking if the
| code compiles
|
| This is actually a low bar when the agent wrote those tests.
| baxtr wrote:
| From the article:
|
| _> Summary
|
| By my count, there are three successful types of language model
| product:
|
| - Chatbots like ChatGPT, which are used by hundreds of millions
| of people for a huge variety of tasks
|
| - Completions coding products like Copilot or Cursor Tab, which
| are very niche but easy to get immediate value from
|
| - Agentic products like Claude Code, Codex, Cursor, and Copilot
| Agent mode, which have only really started working in the last
| six months
|
| On top of that, there are two kinds of LLM-based product that
| don't work yet but may soon:
|
| - LLM-generated feeds
|
| - Video games that are based on AI-generated content_
| Shebanator wrote:
| Author forgot about image, video, and music creation. These have
| all been quite successful commercially, though maybe not as
| much artistically.
| carsoon wrote:
| Recent articles seem to mean only LLMs when they reference AI.
| There are tons of commercial use cases for other models: image
| classification models, image generation models (traditionally
| diffusion models, although some now use LLMs for images), TTS
| models, speech transcription, translation models, AI driving
| models (autopilot), AI risk assessment for fraud, and 3D
| structural engineering enhancement models.
|
| With many of the good use cases of AI, the end user doesn't know
| the AI exists, so it doesn't feel like AI is present.
| qayxc wrote:
| > With many of the good use cases of AI, the end user doesn't
| know the AI exists, so it doesn't feel like AI is present.
|
| This! The best technology is the one you don't notice and
| that doesn't get in the way. A prominent example is the
| failure of the first generation of smartphones: they only
| took off once someone (Apple) managed to hide the OS and its
| details properly from the user. We need the same for AI -
| chat is simply not a good interface for every use case.
| wongarsu wrote:
| This seems to be biased heavily towards products that look like
| an LLM. And yes, only a small number of those work. But that's
| because if your product is a thing I chat with, it immediately is
| in competition with ChatGPT/Claude/Grok/etc, leading to
| everything the article expressed. But those are hardly the only
| use cases for LLMs, let alone AI (whatever people nowadays mean
| by AI)
|
| To name some obvious counter-examples, Grammarly and DeepL
| are both AI (and now partially LLM-based) products that don't fit
| any of the categories in the post, but seem pretty successful to
| me. There are lots of successful applications of vision LLMs in
| document scanning too, whether you are deciphering handwritten
| text or just trying to get structured data out of PDFs.
| themanmaran wrote:
| Perhaps I'm biased since we're in a document heavy industry,
| but I think the original post misses a lot of the non-tech
| company use cases. An insane percentage of human time is spent
| copy pasting things from documents.
| dbreunig wrote:
| Agree. I bucket things into three piles:
|
| 1. Batch/Pipeline: Processing a ton of things, with no
| oversight. Document parsing, content moderation, etc.
|
| 2. AI Features: An app calls out to an AI-powered function.
| Grammarly might send a document out for a summary, a CMS
| might want to generate tags for a post, etc.
|
| 3. Agents: AI manages the control flow.
|
| So much of the discussion online is focused on agents that it
| skews the macro view, but these patterns are pretty distinct.
| echelon wrote:
| > One other thing I haven't mentioned is image generation. Is
| this part of a chatbot product, or a tool in itself? Frankly, I
| think AI image generation is still more of a toy than a
| product, but it's certainly seeing a ton of use. There's
| probably some fertile ground for products here, if they can
| successfully differentiate themselves from the built-in image
| generation in ChatGPT.
|
| This guy is so LLM-biased that he's missing the entire media
| gen ecosystem.
|
| I feel like image, video, music, voice, and 3D generation are a
| much bigger deal than text. Text and code are mundane compared
| to rich signals.
|
| These tools are production ready today and can accomplish
| design, marketing, previz, concept art, game assets, web
| design, film VFX. It's incredibly useful. As a tool. Today.
|
| Don't sleep on generative media.
| taherchhabra wrote:
| I am building one such tool: Flickpseed.ai. Give it a shot.
| echelon wrote:
| Hope you don't mind the unsolicited feedback -
|
| ComfyUI-inspired node graphs are the wrong approach for
| visual media. Nodes are great for the 1% of artists that
| get into it, but you really need to build the Adobe / Figma
| of image and video tools. Not Unreal Engine Blueprint /
| ComfyUI spaghetti.
|
| ShaderToy and TouchDesigner and Comfy are neat toys, but
| they're not what the majority of people will use.
|
| We want to mold ideas like clay.
|
| Watch the demos Adobe just gave from their conference two
| weeks ago. That's what you should build. Something artists
| and designers and creatives intuit as an extension of
| themselves. Not a mathematical abacus.
| msabalau wrote:
| Yeah, the "normal" people I know use AI in Grammarly or Adobe
| Express, or are astonished and delighted by NotebookLM, mostly
| because of the audio overviews - but also because grounding chat
| with sources gets you better, more focused chat.
|
| And, outside of chat, it's less clear that the big labs win
| all the time. People who care about making films, rather than
| video memes, often look to Kling or Runway, not just Sora.
| People who want to make images often have a passion for
| Midjourney that I've never seen for ImageFX. (Nano Banana for
| editing often sparks joy, so a big lab can play successfully in
| such a space, but that is different from saying it is destined
| to win.)
| bix6 wrote:
| On agents: it's interesting, but not surprising, that coding has
| seen so much initial success.
|
| Personally I'm waiting for better O365 and SharePoint agents. I
| think there's a lot of automation and helper potential there.
| airstrike wrote:
| I'm building an opinionated take on this. It's shaping up
| nicely.
|
| If you're a Rust developer reading this, interested in AI + GUI
| + Enterprise SaaS, and want to talk, I'm building a team as we
| speak. E-mail in profile.
| bix6 wrote:
| So like an o365 ServiceNow?
| esseph wrote:
| At this point MS should probably sunset SharePoint and try
| again.
| bix6 wrote:
| How come?
| torlok wrote:
| So the only AI products that work are a chatbot you can talk to,
| or a chatbot that can perform tasks for you. Next thing you'll
| tell me is that the only businesses that work are ones where you
| can ask somebody to do something for you in exchange for money.
| owenpalmer wrote:
| > Next thing you'll tell me is that the only businesses that
| work are ones where you can ask somebody to do something for
| you in exchange for money.
|
| What other type of business is there?
| hobs wrote:
| That is the joke.
| gordonhart wrote:
| The best kind of businesses are the ones I don't have to
| ask; they've already built a better product than what I
| would have asked for. That's kinda the point the OP is
| making about chat vs a [good] dedicated interface.
| ohyoutravel wrote:
| Realistically there are only four types of businesses writ
| large: tourism, food service, railroads, and sales. People
| building AI-based products should focus on those verticals.
| lelandbatey wrote:
| Really only two kinds:
|
| - Energy generation and
|
| - Expending energy to convince the folks generating energy to
| give you money for activating their neurons (food service,
| entertainment, tourism, transportation, sales).
|
| Any other fun ways to compartmentalize an economy?
| tehjoker wrote:
| Not shown: any activity involved in production, science, or
| healthcare just off the top of my head
| gervwyk wrote:
| lol. Would love an episode on how Michael and Dwight respond
| to Jim's AI slop.
| cpill wrote:
| Games? One of the biggest industries, I mean verticals, in
| the world?
| alickz wrote:
| The only GUI products that work are GUIs that you can interface
| with, or that perform tasks for you
|
| Maybe the real value of AI, particularly LLMs, is in the
| interface it provides for other things, and not in the AI
| itself
|
| What if AI isn't the _thing_? What if it's the thing that gets
| us _to_ the thing?
| theptip wrote:
| I think this is kind of like saying "only three kinds of
| internet products work: SaaS, webpages, and mobile apps".
|
| At the level of granularity selected, maybe true. But too coarse
| to make any interesting distinctions or predictions.
| Aldipower wrote:
| In my current project the agent (GPT-5) isn't helpful at all.
| Damn thing, lying all the time to me.
| kevin_thibedeau wrote:
| They're idiot savants. Use them for their strengths. Know their
| weaknesses.
| Aldipower wrote:
| So, what are their strengths then? I've fed it with a
| detailed, very well documented and typed API description.
| Asking to construct me some not too hard code snippets based
| on that. GPT-5 then pretends to do the right thing, but
| actually creates meaningless nonsense out of it, even
| after I tried to reiterate and refine my tasks. Every junior
| dev is waaay better.
| ohyoutravel wrote:
| Parsing a thousand line stack trace and telling me what the
| problem was. Writing regexes. Spitting out ffmpeg commands.
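The stack-trace digging described above can also be approximated mechanically for the common case; a small Python sketch (the example trace text is invented):

```python
import re

def root_cause(trace: str) -> str:
    """Return the innermost 'Caused by:' line of a Java-style stack
    trace, falling back to the first line if there is none."""
    causes = re.findall(r"^Caused by: (.+)$", trace, flags=re.MULTILINE)
    return causes[-1] if causes else trace.splitlines()[0]

trace = """\
java.lang.RuntimeException: request failed
    at app.Handler.handle(Handler.java:42)
Caused by: java.sql.SQLException: connection refused
    at app.Db.connect(Db.java:17)
"""

print(root_cause(trace))  # java.sql.SQLException: connection refused
```

The LLM earns its keep on the messier traces this heuristic misses: interleaved logs, unfamiliar frameworks, or errors whose cause is several frames away from the exception line.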
| gherkinnn wrote:
| I recently had something no longer compile. I got bored
| sniffing around after maybe an hour, set Claude in Zed on
| to it, got a snack, and by the time I was back it had found
| the problem.
|
| When I am unsure how to implement something, I give an LLM
| a rough description and then tell it to ask me five
| questions it needs to get a good solution. More often than
| not, that uncovers a blind spot.
|
| LLMs remain unhelpful at writing code beyond trivial tasks
| though.
| skerit wrote:
| The article claims Claude Sonnet 3.5 was released less than 9
| months ago, but this is wrong.
|
| Claude 3.5 was released in June 2024.
|
| Maybe he has been writing this article for a while, or maybe he
| meant Claude Code or Claude 4.0.
| simonw wrote:
| He meant Sonnet 3.7 which was released on the same day as
| Claude Code, Feb 24th 2025:
| https://www.anthropic.com/news/claude-3-7-sonnet
|
| With hindsight, given that Claude Code turned into a
| billion-dollar product category, it was a bit of a miss bundling
| those two announcements together like that!
| SirensOfTitan wrote:
| I've been working on a learning / incremental reading tool for a
| while, and I've found LLMs and LLM-adjacent tech useful, but as
| a way of resolving ambiguity within a product that doesn't
| otherwise show any use of LLMs. It's like LLM-as-parser.
| owenpalmer wrote:
| Is there somewhere I can try the tool out? I'm interested in
| that kind of thing.
| zkmon wrote:
| >> in five years time most internet users will spend a big part
| of their day scrolling an AI-generated feed.
|
| Yep. Looking forward to the future where you can eat plastic
| popcorn while watching the AI-generated video feeds.
| pixl97 wrote:
| Why 5 years, I'm pretty sure we're there today.
| jongjong wrote:
| Yeah the popcorn probably has microplastics in it.
| vorticalbox wrote:
| By AI-generated feeds, do you mean a feed that is just full of
| AI posts, or an AI generating a feed that one can scroll?
| koliber wrote:
| A few more seem to work as well, because I've used them and
| found them valuable:
|
| - human language translation
|
| - summarization
|
| - basic content generation
|
| - spoken language transcription
| loloquwowndueo wrote:
| > basic content generation
|
| Dunno, man, I can spot AI-generated content a mile away. It
| tends to be incredibly useless, so once I spot it, I'll run in
| the opposite direction.
| HelloUsername wrote:
| > _once_ I spot it
|
| Exactly; pretty sure you've seen media or read text that you
| thought was human created..
| carsoon wrote:
| You spot bad AI content. Since there is no button that will
| tell you whether something was AI-generated, you never know if
| what you read was or wasn't.
| loloquwowndueo wrote:
| Wow didn't take long for the machine fanbois to show up.
| koliber wrote:
| I hate what LLMs spit out and would never accept the whole
| output verbatim.
|
| I love how they occasionally come up with a turn of phrase, a
| thought path, or a surprising perspective. I work with them
| iteratively to brainstorm, transform, and compose content
| that I incorporate into my own work.
|
| Regarding spotting AI-generated content, I was once accused of
| posting AI-generated content where I bona fide typed every
| single letter myself without so much as glancing at an LLM.
| People's confidence in spotting AI content will vary and err
| toward both false positives and false negatives. My kids now
| think all CG movies are AI generated, even the ones that
| pre-date image and video gen. They're pretty sure it's AI though.
| thewebguyd wrote:
| I've also found LLMs helpful for breaking down user requests
| into a technical spec or even just clarifying requests.
|
| I make a lot of business reporting where I work, and dashboards
| for various things. When I get user requests for data, it's
| rarely clear or well thought out. Users struggle with
| articulating their actual requirements, which usually leads to a
| lot of back-and-forth emails or meetings and just delays things
| further.
|
| I now paste their initial request emails into an LLM and tell
| it "This is what I think they are trying to accomplish,
| interpret their request into defined business metrics" or
| something similar and it does a pretty good job and saves a ton
| of the back and forth. I can usually then feed it a sample json
| response or our database schema and have it also make something
| quick with streamlit.
|
| It's saved me (and the users) a ton of time and headaches of me
| trying to coerce more and more information from them, the LLMs
| have been decent enough at interpreting what they're actually
| asking for.
|
| I'd love to see a day where I can hook them up with RO access
| to a data warehouse or something and make a self-service tool
| that users can prompt and it spits out a streamlit site or
| something similar for them.
| notatoad wrote:
| > summarization
|
| can you point me to a useful example of this? i see websites
| including ai-generated summaries all the time, but i've yet to
| see one that is actually useful and it seems like the product
| being sold here is simply "ai", not the summary itself - that
| is, companies and product managers are under pressure to
| implement some sort of AI, and sticking summaries in places is
| a way for them to fill that requirement and be able to say
| "yes, we have AI in our product"
| koliber wrote:
| I sometimes get contracts, NDAs, or terms and conditions
| which normally I would automatically accept because they are
| low stakes and I don't have time to read them. At best I
| would skim them.
|
| Now I pass them through an LLM and ask it to point out
| interesting, unconventional, or surprising things, and to
| summarize the document in a few bullet points. They're quite
| good at this, and I can use what I discover later in my
| relationship with the counterparty in various ways.
|
| I also use it to "summarize" a large log output and point out
| the interesting bits that are relevant to my inquiry.
|
| Another use case is meeting notes. I use fireflies.ai for
| some of my meetings and the summaries are decent.
|
| I guess summarization might not be the right word for all
| these cases, but it amounts to going through the haystack to
| find the needle.
| gregates wrote:
| Do you go through the haystack yourself first, find the
| needle, and then use that to validate your hypothesis that
| the AI is good at accomplishing that task (because it
| usually finds the same needle)? If not, how do you know
| they're good at the task?
|
| My own experience using LLMs is that we frequently disagree
| about which points are crucial and which can be omitted
| from a summary.
| koliber wrote:
| It depends on how much time I have, and how important the
| task is. I've been surprised and I've been disappointed.
|
| One particular time I was wrestling with a CI/CD issue. I
| could not for the life of me figure it out. The logs were
| cryptic and there was a lot of them. In desperation I
| pasted the 10 or so pages of raw logs into ChatGPT and
| asked it to see if it could spot the problem. It gave me
| three potential things to look at, and the first one was
| it.
|
| By directing my attention it saved me a lot of time.
|
| At the same time, I've seen it fail. I recently pasted
| about 10 meetings worth of conversation notes and asked
| it to summarize what one person said. It came back with
| garbage, mixed a bunch of things up, and in general did
| not come up with anything useful.
|
| In some middle-of-the-road cases, what you said mirrors
| my experience: we disagree on what is notable and what is
| not. Still, this is a net positive. I take the stuff it
| gives me, discard the things I disagree with, and at least
| I have a partial summary. I generally check everything it
| spits out against the original and ask it to cite the
| original sources, so I don't end up with hallucinated
| facts. It's less time than writing up a summary myself,
| and it's the kind of work I find more enjoyable than
| typing summaries.
|
| Still, the hit-to-miss ratio is good enough and the time
| savings on the hits are impressive, so I continue to use
| it in various situations where I need a summary or I need
| it to direct my attention to something.
| gregates wrote:
| I really don't see how it can save you time if you have
| to summarize the same source for yourself every time in
| order to learn whether the AI did a good job in this
| particular case.
| notatoad wrote:
| for your first one, if you're just feeding docs into a
| chatbot prompt and asking for a summary, i think that
| matches what the article would call a "chatbot product"
| rather than a summarization product.
|
| fireflies.ai is interesting though, that's more what i was
| looking for. i've used the meeting summary tool in google
| meet before and it was hilariously bad, it's good to hear
| that there are some companies out there having success with
| this product type.
| koliber wrote:
| I guess you're right re chatbot for summaries. I was
| thinking about the use case and not the whole integrated
| product experience.
|
| For example, for code gen I use agents like Claude Code,
| one-shot interfaces like Codex tasks, and chatbots like
| the generic ChatGPT. It depends on the task at hand, how
| much time I have, whether I am on the phone or on a
| laptop, and my mood. It's all code gen though.
| aunty_helen wrote:
| We built a system that uses summaries of video clips to build
| a shorts video against a screenplay. Customer was an events
| company. So think 15 minute wedding highlights video that has
| all of the important parts to it, bride arrival, ring
| exchange, kiss the bride, first dance, drunken uncle etc
| renewiltord wrote:
| The classic problem that online commenters face is that they
| only know products that are on Hacker News and Reddit. And I get
| why: if you're not plugged into anything, the only way to get
| information is social media, so you only know social media.
|
| E.g. https://www.thomsonreuters.com/en/press-
| releases/2025/septem...
|
| B2B AI company, 2 years in sold for hundreds of millions, not an
| agent, chatbot, or completion. Do you know it exists? No. You
| only read Hacker News. How could you know?
| Dilettante_ wrote:
| > Additive's GenAI-native platform streamlines the repetitive,
| time-consuming task of ingesting and parsing pass-through
| entity documents
|
| From TFA:
|
| > There's another kind of agent that isn't about coding: the
| research agent. LLMs are particularly good at tasks like "skim
| through ten pages of search results" or "keyword search this
| giant dataset for any information on a particular topic".
| PopAlongKid wrote:
| >The company's [Additive] technology automates complex tasks
| such as extracting footnotes from K-1s, K-3s, and related
| forms, so every staff member can become a reviewer and complete
| work that used to take weeks in a matter of hours.
|
| Any tax professional who takes weeks to enter footnote info
| from a K-1 form into their professional tax prep software is
| probably just as bad at other job-related tasks and either
| needs more training or to find another job.
| larodi wrote:
| More than three kinds are then actually listed in the article
| shermantanktop wrote:
| Formatting did not help. Three kinds, but then subheadings in
| the same size font, and then here come two more kinds, plus a
| side journey into various topics.
| adammarples wrote:
| >I think there are serious ethical problems with this kind of
| product.
|
| Unless there are serious ethical problems with people generating
| arbitrary text, i.e. writing, then no, there aren't.
| bob1029 wrote:
| Chatbot is the only one I agree with (human in the loop).
|
| Agents are essentially the chatbot, but without the human in the
| loop. Chatbot without human in the loop is a slop factory. Things
| like "multi-agent systems" are a clever ploy to get you to burn
| tokens and ideally justify all this madness.
|
| Copilot/completion does not work in business terms for me. It
| _looks like_ it works and it might _feel like_ it's working in
| some localized technical sense, but it does not actually work on
| strategic timescales with complex domains in such a way that a
| customer would eventually be willing to pay you money for the
| results. The hypothesis that work/jobs will be created due to
| sloppy AI is proving itself out very quickly. I think
| "completion" tools like classic IntelliSense are still at the
| peak of efficiency.
| mrweasel wrote:
| Chatbots in many environments simply don't work, because we
| won't let them do anything, and if we did, they'd be agents.
| Here I'm mostly thinking in terms of things like customer
| service chats. A chatbot that can't reach into other systems is
| essentially only useful for role playing.
|
| The copilot/completion thing also doesn't work for me. I have
| no doubt that a lot of developers are getting a lot of benefit
| from the coding LLMs, but I can't make them work.
|
| I think one glaring obvious missing kind of AI is medical image
| recognition, which is already deployed and working in many
| scenarios.
| happyopossum wrote:
| Very myopic view here - agents are turning out useful output in
| many fields outside of coding.
| Xiol wrote:
| Such as?
| carsoon wrote:
| Legal seems to be a big use case for AI. I think more for
| simplification and classification than for generation, though.
| ZeroConcerns wrote:
| Well, the elephant in the room here is that the generic AI
| product that is being promised, i.e. "you get into your car in
| the morning, and on your drive to the office dictate your
| requirements for _one_ of the apps that is going to guarantee
| your retirement, only to find it _completely done_, rolled
| out to all the app stores and making money already once you
| arrive", isn't happening anytime soon, if ever. Yet everyone
| pretty much is acting like it's already there.
|
| Can "AI" in its current form deliver value? Sure, and it
| _absolutely does_, but it's more in the form of "several hours
| saved per FTE per week" than "several FTEs saved per week".
|
| The way I currently frame it: I have a Claude 1/2-way-to-the-Max
| subscription that costs me 90 Euros a month. And it's absolutely
| worth it! Just today, it helped me debug and enhance my iSCSI
| target in new and novel ways. But is it worth double the price?
| Not sure yet...
| madeofpalk wrote:
| The other part to this is that LLMs as a technology definitely
| have _some_ value as a foundation to build features/products on
| other than chat bots. But it's unclear to me whether that value
| can sustain current valuations.
|
| Is a better de-noisier algorithm in Adobe Lightroom worth $500
| billion?
| ansgri wrote:
| A bit off-topic, but denoise in LR is like 3 years behind the
| purpose-built products like Topaz, so a bad example. They've
| added any ML-based denoise to it when, like a year ago?
| ZeroConcerns wrote:
| > Is a better de-noisier algorithm in Adobe Lightroom worth
| $500 billion?
|
| No.
|
| But: a tool that allows me to de-noise some images, just by
| uploading a few samples and describing what I want to change,
| just might be? Even more so, possibly, if I can also upload a
| desired result and let the "AI" work on things until it
| matches that?
|
| But also: cool, that saves me several hours per week! Not:
| oh, wow, that means I can get rid of this entire
| department...
| pixl97 wrote:
| Skeptics always like to toss in 'if ever' as some form of
| enlightenment, as though they are aware of some fundamental
| limitation of the universe only they are privy to.
| mzajc wrote:
| Of the universe, perhaps, but humans certainly are a limiting
| factor here. Assuming we get this technology someday, why
| would one buy your software when the mere description of its
| functionality allows one to recreate it effortlessly?
| pixl97 wrote:
| >humans certainly are a limiting factor here.
|
| Completely disagree. Intelligence is self reinforcing. The
| smarter we get as humans the more likely we'll create
| sources of intelligence.
| falseprofit wrote:
| Let's say there are three options: {soon, later, not at all}.
| Ruling out only one to arrive at {later, not at all} implies
| less knowledge than ruling out two and asserting {later}.
|
| Awareness of a fundamental limitation would eliminate
| possibilities to just {not at all}, and the phrasing would be
| "never", rather than "not soon, if ever".
| pixl97 wrote:
| But we know that no fundamental limitation on intelligence
| exists: nature has already created intelligence, animal and
| eventually human, via a random walk. So 'AI will never exist' is
| lazy magical thinking. That intelligence can be self-reinforcing
| is a good reason why AI will exist sooner rather than later.
| madeofpalk wrote:
| Theorising something will exist before the heat death of the
| universe isn't really interesting.
| adastra22 wrote:
| Agentic tools are already delivering an increase in productivity
| equivalent to many FTEs. I say this as someone in the position
| of having to hire coders and needing far fewer than we
| otherwise would have.
| ZeroConcerns wrote:
| Well, yeah, as they say on Wikipedia: {{Citation Needed}}
|
| _Can_ AI-as-it-currently-is save FTEs? Sure, but, again,
| there's a template for that: {{How Many}} -- 1% of your org
| chart? 10%? In my case it's around 0.5% right now.
|
| Or, to reframe it a bit: can AI pay Sam A's salary? Sure! His
| stock options? Doubtful. His future plans? Heck nah!
| adastra22 wrote:
| 400-800%. That is to say, I am hiring 4x-8x fewer
| developers for the same total output (measured in burn down
| progress, not AI-biased metrics like kLOC).
| vorticalbox wrote:
| I use Mongo at work, and an LLM helped me find index issues.
|
| Feeding it the explain output, the query, and the current
| indexes, it can quickly tell what the database was doing and why
| it was slow.
|
| It saved me a bunch of time, as I didn't have to read large
| amounts of JSON from explain to see what was going on.
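The most common signal in that JSON, a query falling back to a full collection scan, can also be surfaced without an LLM by walking the explain document; a sketch in Python (the field names follow MongoDB's explain format, but the sample document is made up):

```python
def find_coll_scans(plan, path="root"):
    """Recursively walk a MongoDB explain plan (a nested dict) and
    return the paths of any COLLSCAN stages, i.e. full collection
    scans that usually indicate a missing index."""
    hits = []
    if isinstance(plan, dict):
        if plan.get("stage") == "COLLSCAN":
            hits.append(path)
        for key, value in plan.items():
            hits.extend(find_coll_scans(value, f"{path}.{key}"))
    elif isinstance(plan, list):
        for i, item in enumerate(plan):
            hits.extend(find_coll_scans(item, f"{path}[{i}]"))
    return hits

# A made-up explain fragment: a FETCH whose input stage is a full scan.
explain = {
    "queryPlanner": {
        "winningPlan": {
            "stage": "FETCH",
            "inputStage": {"stage": "COLLSCAN", "direction": "forward"},
        }
    }
}

print(find_coll_scans(explain))
# ['root.queryPlanner.winningPlan.inputStage']
```

The LLM's advantage is in the judgment call that follows the detection: explaining why the planner rejected an existing index, or suggesting a compound index for the query shape.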
| ebiester wrote:
| > Can "AI" in its current form deliver value? Sure, and it
| absolutely does but it's more in the form of "several hours
| saved per FTE per week" than "several FTEs saved per week".
|
| Yes but...
|
| First, what we're seeing with coding is that it is just
| exposing the next bottleneck quickly. The bottlenecks are
| always things that don't lend themselves to LLMs yet.
|
| Second, that still can mean 4 hours a week for 20-50 bucks. At
| US white collar wages, that might mean 8 people are needed
| rather than 9. In profit centers that's more budget for
| advancing goals. At cost centers, though, that's a reduction in
| headcount.
| websap wrote:
| > Users simply do not want to type out "hey, can you increase the
| font size for me" when they could simply hit "ctrl-plus" or click
| a single button.
|
| I would definitely challenge this. "Turn off private relay",
| "send this photo to X", "Add a pit stop at a coffee shop along
| the way" are all voice commands I would love to use.
| chrisweekly wrote:
| Yes, this! esp the last one. Finding coffee shop / restaurant
| options ALONG THE WAY seems like it should've been solved years
| ago. Scenario: while driving, "want to eat in about an hour,
| must have vegetarian options, don't add more than 10m extra
| drive time" and get a shortlist to pick from.
| hencq wrote:
| Yeah that one is surprisingly difficult even with a Human
| Intelligence in the passenger seat.
| Mikhail_Edoshin wrote:
| The old Apple Newton had a feature (I don't remember what they
| called it) where on any screen you could write "please" and then
| describe what to do, e.g., to use one of their examples, "please
| fax this to Bob". And it worked. Internally it was a rather
| simple keyword match plus access to data, such as the system
| address book. New applications could register their own names
| for actions and relevant dictionaries.
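That registry-plus-keyword-match mechanism can be sketched in a few lines of Python (a toy reconstruction; all names are invented, not the Newton's actual API):

```python
# A toy version of the Newton-style "please ..." assist: applications
# register action keywords, and a command is matched by simple
# keyword lookup rather than by any language model.
registry = {}

def register(keyword, handler):
    registry[keyword] = handler

def please(command):
    for word in command.lower().split():
        handler = registry.get(word)
        if handler:
            return handler(command)
    return "Sorry, I don't know how to do that."

# One built-in action plus one registered by a hypothetical fax app.
register("call", lambda cmd: f"Dialing number found in: {cmd!r}")
register("fax", lambda cmd: f"Faxing current note per: {cmd!r}")

print(please("please fax this to Bob"))
```

The point of the design is that the "intelligence" lives in the handlers and the shared data (like the address book), not in the matcher, which is why it could run on 1990s hardware.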
| levocardia wrote:
| Very obviously missing the mundane agentic work. I think the
| following things are basically already solved, and are just
| waiting for the right harness:
|
| - Call this government service center, wait on hold for 45
| minutes, then when they finally answer, tell them to reactivate
| my insurance marketplace account that got wrongly deleted.
|
| - Find a good dentist within 2mi from my house, call them to make
| sure they take my insurance, and book an appointment sometime in
| the next two weeks no earlier than 11am
|
| - Figure out how I'm going to get from Baltimore to Boston next
| Thursday, here's $100 and if you need more, ask me.
|
| - I want to apply a posterizing filter in photoshop, take control
| of my mouse for the next 10sec and show me where it is in the
| menu
|
| - Call that gym I never go to and cancel my membership
| input_sh wrote:
| Basically already solved = you've never used it for any of
| those purposes and have no idea if or how well they would work?
| irq-1 wrote:
| > - Find a good dentist within 2mi from my house, call them to
| make sure they take my insurance, and book an appointment
| sometime in the next two weeks no earlier than 11am
|
| The web caused dentists to make websites, but they don't post
| their appointment calendar; they don't have to.
|
| Will AI looking for appointments cause businesses to post live,
| structured data (like calendars)? The complexity of scheduling
| and multiple calendars is perfect for an AI solution. What
| _other_ AI uses and interactive systems will come soon?
|
| - Accounting: generate balance sheets, audit in real-time, and
| have human accountants double check it (rather than doing)
|
| - Correspondence: create and send notifications of all sorts,
| and consume them
|
| - Purchase selection: shifting the information asymmetry about
| products in the customer's favor
|
| - Forms: doing taxes or applying for a visa
| qayxc wrote:
| The problem is that we're reverting to the stone age by
| throwing unnecessary resources at problems that have a simple
| and effective solution: open, standardised, and accessible
| APIs.
|
| We wouldn't need to use an expensive (compute-wise) AI agent
| to do things like making appointments. Especially if in the
| end you'd end up with bots talking to bots anyway. The
| digital equivalent of always up-to-date yellow pages would
| solve many of these issues. Super simple and "dumb" but
| reliable programs could perform such tasks.
|
| Scheduling multiple calendars doesn't require "AI" - it's a
| comparatively simple optimisation problem that can be solved
| using computationally cheap existing algorithms. It seems
| more and more to me that AI - and LLMs in particular - are
| the hammer and now literally everything looks like a nail...
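qayxc's point that multi-calendar scheduling is cheap to solve classically can be sketched in a few lines of plain interval arithmetic. This is only an illustration: the function names, the nine-to-five window, and the sample calendars are all invented.

```python
# Finding a common free slot across calendars with interval arithmetic,
# no LLM required. Times are minutes from midnight; each calendar is a
# list of busy (start, end) intervals.

def free_slots(busy, day_start=9 * 60, day_end=17 * 60):
    """Invert a list of busy (start, end) intervals into free ones."""
    slots, cursor = [], day_start
    for start, end in sorted(busy):
        if start > cursor:
            slots.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < day_end:
        slots.append((cursor, day_end))
    return slots

def common_slot(calendars, duration):
    """Return the first window of at least `duration` minutes free in every calendar."""
    def intersect(a, b):
        # Keep every overlap between one free interval per side that is long enough.
        return [(max(s1, s2), min(e1, e2))
                for s1, e1 in a for s2, e2 in b
                if min(e1, e2) - max(s1, s2) >= duration]
    acc = free_slots(calendars[0])
    for cal in calendars[1:]:
        acc = intersect(acc, free_slots(cal))
    return acc[0] if acc else None

alice = [(9 * 60, 10 * 60), (13 * 60, 15 * 60)]  # busy 9-10 and 13-15
bob = [(11 * 60, 12 * 60)]                       # busy 11-12
print(common_slot([alice, bob], 60))             # (600, 660), i.e. 10:00-11:00
```

The whole thing is a pairwise intersection of free intervals, which is the "computationally cheap existing algorithm" the comment alludes to.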
| thisisit wrote:
| > Figure out how I'm going to get from Baltimore to Boston next
| Thursday, here's $100 and if you need more, ask me.
|
| I tried something like this last month. I was going on holiday
| and asked an LLM to prepare a sightseeing guide on a fixed
| budget. The LLM's plan looked feasible until you looked closer.
|
| The first issue was the opening/closing times of certain
| attractions. It kept saying things like "At 6pm you can go and
| visit place X", when in reality X closed at 5pm.
|
| The second issue was underestimating walking time. The plans
| were often fully packed with lots of walking, and without
| Google Maps guidance it routinely underestimated the time:
| instead of, say, 10 minutes between A and B, it would claim 5-6
| minutes.
|
| I kept prompting it to go back and check the opening hours, and
| once it took those into account the walking routes became
| convoluted, often doubling back to the same location. It took
| lots of prompts and re-prompts to get it right.
|
| So I don't know if this is already solved - at least at scale
| and at reasonable cost, given token prices.
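The walking-time gap thisisit describes is easy to sanity-check with arithmetic: at a typical urban pace plus a couple of minutes for crossings and wayfinding (both figures are assumptions, not from the comment), the hops an LLM budgets at 5-6 minutes come out closer to 10.

```python
# Back-of-the-envelope walking-time estimate: distance at a typical
# pace plus a fixed overhead for traffic lights and wayfinding.

WALK_SPEED_KMH = 5.0  # assumed typical urban walking pace

def walk_minutes(distance_km, overhead_min=2.0):
    """Estimate walking time in minutes for a given distance."""
    return distance_km / WALK_SPEED_KMH * 60 + overhead_min

# A 0.7 km hop the model might budget at ~5 min actually needs:
print(round(walk_minutes(0.7), 1))  # 10.4
```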
| Mikhail_Edoshin wrote:
| AI would make a very good librarian. It doesn't understand, only
| comprehends, but in this case it is enough.
|
| Thing is, there is no library for it to work in.
| theonething wrote:
| Seems like data analysis would be a good one. Company ingests
| massive amounts of disparate business data. Ask AI to clean and
| normalize it, visualize it and give recommendations.
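A toy version of the "clean and normalize" step theonething describes might look like the sketch below. The field names, date formats, and sample rows are all invented for illustration; a real pipeline would face far messier inputs.

```python
# Coercing inconsistently formatted business records into one schema:
# mixed date formats and currency strings become a uniform shape.
from datetime import datetime

RAW = [
    {"date": "2025-01-03", "revenue": "$1,200.50"},
    {"date": "03/02/2025", "revenue": "980"},
]

def normalize(row):
    """Normalize one raw record: ISO date string and float revenue."""
    day = None
    for fmt in ("%Y-%m-%d", "%m/%d/%Y"):  # try each known date format
        try:
            day = datetime.strptime(row["date"], fmt).date()
            break
        except ValueError:
            continue
    amount = float(row["revenue"].replace("$", "").replace(",", ""))
    return {"date": day.isoformat(), "revenue": amount}

print([normalize(r) for r in RAW])
# [{'date': '2025-01-03', 'revenue': 1200.5},
#  {'date': '2025-03-02', 'revenue': 980.0}]
```

The interesting question in the comment is whether an LLM can do this mapping without anyone hand-writing the format list above.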
| jongjong wrote:
| I think this is not a sure bet because of the relatively high
| cost of inference; it is likely not suitable for large amounts
| of data. We don't actually know yet, because current prices are
| so heavily subsidized that we can't tell whether it would be
| viable in a normal financial environment without subsidies.
|
| We know it's not viable to hire humans to do this, but we don't
| know if it's viable for LLMs to do it.
| kken wrote:
| Well, considering that the long-term idea is to have AGI,
| general intelligence, it seems the goal is also to have only a
| single product in the end.
|
| There may be different ways to access it, but the product is
| always the same.
| EagnaIonat wrote:
| > This doesn't work well because savvy users can manipulate the
| chatbot into calling tools. So you can never give a support
| chatbot real support powers like "refund this customer", ...
|
| I would disagree with this.
|
| Part of how security is handled in current agentic systems is to
| not let the LLM have any access to how the underlying tools work.
| At best it's like hitting "inspect" in your browser and changing
| the web page.
|
| Of course, that assumes that the agentic chatbot has been built
| correctly.
| samuelknight wrote:
| I look at LLMs with an engineering mindset. It is an
| intelligence black box that goes in a toolbox alongside the old
| classical algorithms and frameworks. To use it in a solution I
| need to figure out:
|
| 1) Whether I can give it information in a compatible and cost
| effective way
|
| 2) Whether the model is likely to produce useful output
|
| I had used language models for years before LLMs, such as the
| part-of-speech classifiers in Python's NLTK framework.
| sgt101 wrote:
| The product is the LLM; the wrapper has marginal value atm.
|
| You can write an agent, it's cool. I can copy it.
|
| I cannot build my own LLM (although I can run open source ones).
| YesBox wrote:
| Regarding games:
|
| > A third reason could be that generated content is just not a
| good fit for gaming.
|
| This is my current opinion as a game developer. IMO this isn't
| going to be fun for most once the novelty wears off. Games are
| goal oriented at the end of the day and the great games are
| masterfully curated multi-disciplinary experiences. I'd argue
| throwing a game wrapper around an LLM is a new LLM experience,
| not a new game experience.
| leksak wrote:
| I would consider profitability a requirement for a product to
| qualify as "working", and I believe none of these fit the bill?
| throwawaymaths wrote:
| I think there's a space for something that wraps an llm
| (especially multimodal) to do something that's halfway to
| agentic. Yes you could do it yourself but it's not worth it to
| you to figure out prompts etc, especially when someone has
| already optimized it. Plus, it could go from 100 clicks and 10
| minutes in front of ChatGPT to zero clicks, automated ingest,
| and an email when the result is baked.
|
| A good example I saw recently was stripping ads from podcasts.
| Animats wrote:
| > So you can never give a support chatbot real support powers
| like "refund this customer", because the moment you do, thousands
| of people will immediately find the right way to jailbreak your
| chatbot into giving them money.
|
| And that's the elephant in the room. AI "agents" can't do much
| until someone solves that problem. Most AI "agents" work for and
| favor the business operating the agent, but impose the costs of
| their errors on the customer. Errors are an externality, like
| pollution. This is no good.
| PunchyHamster wrote:
| Probably want to add "scamming the clueless out of their
| savings" via a combination of LLMs and voice generation.
| aunty_helen wrote:
| And the other side, detecting in real time as phishing is
| happening and intervening.
| rckt wrote:
| > The third real AI product is the coding agent. People have been
| talking about this for years, but it was only really in 2025 that
| the technology behind coding agents became feasible
|
| No, they didn't.
|
| > LLMs are particularly good at tasks like "skim through ten
| pages of search results" or "keyword search this giant dataset
| for any information on a particular topic".
|
| No. They are not.
|
| Amazing article without any support of the statements made. Just,
| "because I think so". Cool.
| Kiro wrote:
| I don't think any of those statements are controversial. Do you
| mind elaborating?
| joshribakoff wrote:
| I think this is a pretty narrow take. Without going into too much
| detail I can imagine many use cases. For example, there is a
| whole class of predictive algorithms and one limitation is that
| data has to be cleaned, ingested, and feature engineered. For
| example, clustering is only as good as your vectorization. With
| an LLM, it is easy to imagine predictive use cases that skip
| entire etl pipelines and just directly operate on less structured
| inputs, not just summarizing those inputs, but actually making
| decisions or predictions. You're already seeing frameworks like
| BERTopic integrating LLMs (for labeling topics), which is
| already far removed from the "3 use cases" listed here.
|
| By fine-tuning LLM-based predictive systems we might unlock
| entirely new use cases - and prediction is just one thing I can
| imagine; there are many others.
|
| And it's not just that frameworks like BERTopic are integrating
| LLMs; it's also that if you zoom out, the architecture looks a
| lot like the architecture of an LLM: text -> embedding -> text.
|
| LLMs can be, and already are, used to build recommendation
| systems like the ones at YouTube and Netflix; they capture more
| semantics than older techniques.
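The text -> embedding -> recommendation pipeline described above can be sketched with cosine similarity over toy vectors. The hand-made 3-d "embeddings" and catalog names here stand in for real model output; in practice the vectors would come from an embedding model.

```python
# Embedding-based recommendation in miniature: rank catalog items by
# cosine similarity between their vectors and a user's taste vector.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

catalog = {
    "space documentary": (0.9, 0.1, 0.0),
    "sci-fi thriller":   (0.8, 0.3, 0.1),
    "baking show":       (0.0, 0.2, 0.9),
}

def recommend(user_vec, items, k=2):
    """Return the k items most similar to the user's taste vector."""
    ranked = sorted(items, key=lambda name: cosine(user_vec, items[name]),
                    reverse=True)
    return ranked[:k]

print(recommend((1.0, 0.2, 0.0), catalog))
# ['space documentary', 'sci-fi thriller']
```

The comment's point is that an LLM-derived embedding captures more semantics than, say, bag-of-words vectors, while the ranking machinery on top stays this simple.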
| AndrewKemendo wrote:
| This is ridiculous
|
| Shazam is almost perfect song recognition and is built into iOS
|
| Every Google/Apple Maps route is based on an AI system
|
| I have at least a dozen apps that use almost perfect visual
| recognition based on images for search (plant identification,
| stochastic object identification etc...)
| cpill wrote:
| This guy is full of horse manure.
___________________________________________________________________
(page generated 2025-11-16 23:01 UTC)