hngopher.com

       [HN Gopher] Claude Integrations
       ___________________________________________________________________
        
       Claude Integrations
        
       Author : bryanh
       Score  : 363 points
       Date   : 2025-05-01 16:02 UTC (6 hours ago)
        
 (HTM) web link (www.anthropic.com)
 (TXT) w3m dump (www.anthropic.com)
        
       | behnamoh wrote:
       | That "Allow for this chat" pop up should be optional. It ruins
       | the entire MCP experience. Maybe make it automatic for non-
       | mutating MCP tools.
        
         | pcwelder wrote:
         | In the latest update they've replaced "Allow for this chat"
         | with "Always Allow".
        
           | avandekleut wrote:
           | MCP also has support for "hints" which note whether an action
           | is destructive.
        
       | arjie wrote:
       | The cookie banner type constant Allow Allow Allow makes their
       | client unusable. Are there any alternative desktop MCP clients?
        
         | rahimnathwani wrote:
         | https://github.com/patruff/ollama-mcp-bridge
        
       | jarbus wrote:
       | Anyone have any data on how effective models are at leveraging
       | MCP? Hard to tell if these things are a buggy mess or a game
       | changer
        
         | striking wrote:
         | Claude Code is doing pretty well in my experience :) I've built
         | a tool in our CI environment that reads Jira tickets, files
         | GitHub PRs, etc. automatically. Great for one-shotting bugs,
         | and it's only getting better.
        
       | xnx wrote:
       | Integrations are nice, but the superpower is having an AI smart
       | enough to operate a computer/keyboard/mouse so it can do anything
       | without the cooperation/consent of the service being used.
       | 
       | Lots of people are making moves in this space (including
       | Anthropic), but nothing has broken through to the mainstream.
        
         | WillAdams wrote:
         | Or even access multiple files?
         | 
         | Why can't one set up a prompt, test it against a file, then
         | once it is working, apply it to each file in a folder in a
         | batch process which then provides the output as a single
         | collective file?
        
           | xnx wrote:
           | You can probably achieve what you want with
           | https://github.com/simonw/llm and a little bit of command
           | line.
           | 
           | Not sure what OS you're on, but in Windows it might look like
           | this:
           | 
           | FOR %%F IN (*.txt) DO (TYPE "%%F" | llm -s "execute this
           | prompt" >> "output.txt)
        
             | WillAdams wrote:
             | I want to work with PDFs (or JPEGs), but that should be a
             | start, I hope.
        
               | xnx wrote:
               | llm supports attachments too
               | 
               | FOR %%F IN (*.pdf) DO (llm -a %%F -s "execute this
               | prompt" >> output.txt)
        
           | TheOtherHobbes wrote:
           | I've just done something similar with Claude Desktop and its
           | built-in MCP servers.
           | 
           | The limits are still buggy responses - Claude often gets
           | stuck in a useless loop if you overfeed it with files - and
           | lack of consistency. Sometimes hand-holding is needed to get
           | the result you want. And it's slow.
           | 
           | But when it works it's amazing. If the issues and limitations
           | were solved, this would be a complete game changer.
           | 
           | We're starting to get somewhat self-generating automation and
           | complex agenting, with access to all of the world's public
           | APIs and search resources, controlled by natural language.
           | 
           | I can't see the edges of what could be possible with this.
           | It's limited and clunky for now, but the potential is
           | astonishing - at least as radical an invention as the web
           | was.
        
             | WillAdams wrote:
             | I would be fine with storing the output from one run,
             | spooling up a new one, then concatenating after multiple
             | successive runs.
        
         | arnaudsm wrote:
         | I get often ratelimited or blocked from websites because I
         | browse them too fast with my keyboard and mouse. The AI would
         | be slowed down significantly.
         | 
         | LLM-desktop interfaces make great demos, but they are too slow
         | to be usable in practice.
        
           | xnx wrote:
           | Good point. Probably makes sense to think of it as an
           | assistant you assign a job to and get results back later.
        
       | boh wrote:
       | I think all the retail LLM's are working to broaden the available
       | context, but in most practical use-cases it's having the ability
       | to minimize and filter the context that would produce the most
       | value. Even a single PDF with too many similar datapoints leads
       | to confusion in output. They need to switch gears from the high
       | growth, "every thing is possible and available" narrative, to one
       | that narrows the scope. The "hallucination" gap is widening with
       | more context, not shrinking.
        
         | mikepurvis wrote:
         | That's a tough pill to swallow when your company valuation is a
         | $62B based on the premise that you're building a bot capable of
         | transcendent thought, ready to disrupt every vertical in
         | existence.
         | 
         | Tackling individual use-cases is supposed to be something for
         | third party "ecosystem" companies to go after, not the
         | mothership itself.
        
         | Etheryte wrote:
         | This has been my experience as well. The moment you turn
         | internet access on, Kagi Assistant starts outputting garbage.
         | Turn it off and you're all good.
        
         | fhd2 wrote:
         | Definitely my experience. I manage context like a hawk, be it
         | with Claude-as-Google-replacement or LLM integrations into
         | systems. Too little and the results are off. Too much and the
         | results are off.
         | 
         | Not sure what Anthropic and co can do about that, but
         | integrations feel like a step in the wrong direction. Whenever
         | I've tried tool use, it was orders of magnitude more expensive
         | and generally inferior to a simple model call with curated
         | context from SerpApi and such.
        
           | loufe wrote:
           | Couldn't agree more. I wish all major model makers would
           | build tools into their proprietary UIs to "summarize contents
           | and start a new conversation with that base". My biggest
           | slowdown with working with LLMs while coding is moving my
           | conversation to a new thread because context limit is hit
           | (Claude) or the coherent-thought threshold is exceeded
           | (Gemini).
        
             | fhd2 wrote:
             | I never use any web interfaces, just hooked up gptel (an
             | Emacs package) to Claude's API and a few others I regularly
             | use, and I just have a buffer with the entire conversation.
             | I can modify it as needed, spawn a fresh one quickly etc.
             | There's also features to add files and individual snippets,
             | but I usually manage it all in a single buffer. It's a
             | powerful text editor, so efficient text editing is a given.
             | 
             | I bet there are better / less arcane tools, but I think
             | powerful and fast mechanisms for managing context are key
             | and for me, that's really just powerful text editing
             | features.
        
         | medhir wrote:
         | you hit the nail on the head. my experience with prompting LLMs
         | is that providing extra context that isn't explicitly needed
         | leads to "distracted" outputs
        
         | ketzo wrote:
         | I mean, to be honest, they gotta do both to achieve what
         | they're aiming for.
         | 
         | A truly useful AI assistant has context on my last 100,000
         | emails - and also recalls the details of each individual one
         | perfectly, without confusion or hallucination.
         | 
         | Obviously I'm setting a high bar here; I guess what I'm saying
         | is "yes, and"
        
         | energy123 wrote:
         | There's a niche for the kitchen sink approach. It's a type of
         | search engine.
         | 
         | Throw in all context --> ask it what is important for problem
         | XYZ --> curate what it tells you, and feed that to another
         | model to actually solve XYZ
        
         | roordan wrote:
         | This is my concern as well. How successful is it in selecting
         | the correct tool out of hundreds or thousands?
         | 
         | Different to what this integration is pushing, the LLMs usage
         | in production based products where high accuracy is a
         | requirement (99%), you have to give a very limited tool set to
         | get any degree of success.
        
       | bredren wrote:
       | Had been planning a custom mcp for our orgs' jira.
       | 
       | I'm a bit skeptical that it's gonna work out of the box because
       | of the amount of custom fields that seem to be involved to make
       | successful API requests in our case.
       | 
       | But I would welcome, not having to solve this problem. Jira's
       | interface is among the worst of all the ticket tracking
       | applications I have encountered.
       | 
       | But, I have found using a LM conversation paired within enough
       | context about what is involved for successful POSTs against the
       | API allow me to create update and relate issues via curl.
       | 
       | It's begging for a chat based LLM solution like this. I'd just
       | prefer the underlying model not be locked to a vendor.
       | 
       | Atlassian should be solving this for its customers.
        
       | rubenfiszel wrote:
       | I feel dumb but how do you actually add Zapier or Confluence or
       | custom MCP on the web version of claude? I only see it for
       | Drive/Gmail/Github. Is it zoned/slow release?
        
         | throwaway314155 wrote:
         | edit: <Incorrect>im fairly certain these additions only work on
         | Claude Desktop?</Incorrect>
         | 
         | That or they're pulling an OpenAI and launching a feature that
         | isn't actually fully live.
        
           | rubenfiszel wrote:
           | But the videos show claude web
        
         | 85392_school wrote:
         | This part seems relevant:
         | 
         | > in beta on the Max, Team, and Enterprise plans, and will soon
         | be available on Pro
        
       | joshwarwick15 wrote:
       | Created a list of remote MCP servers here so people can keep
       | track of new releases - https://github.com/jaw9c/awesome-remote-
       | mcp-servers
        
       | zhyder wrote:
       | Is there any way to access this via the API, after perhaps some
       | oauth from the Anthropic user account?
        
       | throwup238 wrote:
       | The leap frogging at this point is getting insane (in a good way,
       | I guess?). The amount of time each state of the art feature gets
       | before it's supplanted is a few weeks at this point.
       | 
       | LLMs were always a fun novelty for me until OpenAI DeepResearch
       | which started to actually come up with useful results on more
       | complex programming questions (where I needed to write all the
       | code by hand but had to pull together lots of different libraries
       | and APIs), but it was limited to 10/month for the cheaper plan.
       | Then Google Deep Research upgraded to 2.5 Pro and with paid usage
       | limits of 20/day, which allowed me to just throw everything at it
       | to the point where I'm still working through reports that are a
       | week or more old. Oh and it searched up to 400 sources at a time,
       | significantly more than OpenAI which made it quite useful in
       | historical research like identifying first edition copies of
       | books.
       | 
       | Now Claude is releasing the same research feature with
       | integrations (excited to check out the Cloudflare MCP auth
       | solution and hoping Val.town gets something similar), and a run
       | time of up to 45 minutes. The pace of change was overwhelming
       | half a year ago, now it's just getting ridiculous.
        
         | user_7832 wrote:
         | I agree with your overall message - rapid growth appears to
         | encourage competition and forces companies to put their best
         | foot forward.
         | 
         | However, unfortunately, I cannot shower much praise on Claude
         | 3.7. And if you (or anyone) asks why - 3.7 seems much better
         | than 3.5, surely? - Then I'm moderately sure that you use
         | Claude much more for coding than for any kind of conversation.
         | In my opinion, even 3.5 _Haiku_ (which is available for free
         | during high loads) is better than 3.7 Sonnet.
         | 
         | Here's a simple test. Try asking 3.7 to intuitively explain
         | anything technical - say, mass dominated vs spring dominated
         | oscillations. I'm a mechanical engineer who studied this stuff
         | and _I_ could not understand 3.7's analogies.
         | 
         | I understand that coders are the largest single group of
         | Claude's users, but Claude went from being my most used app to
         | being used only after both chatgpt and Gemini, something that I
         | absolutely regret.
        
           | airstrike wrote:
           | I too like 3.5 better than 3.7 and I use it pretty often.
           | It's like 3.7 is better in 2 metrics but worse in 10
           | different ones
        
           | joshstrange wrote:
           | I use Claude mostly for coding/technical things and something
           | about 3.7 does not feel like an upgrade. I haven't gone back
           | to 3.5 (mostly started using Gemini Pro 2.5 instead).
           | 
           | I haven't been able to use Claude research yet (it's not
           | rolled out to the Pro tier) but o1 -> o3 deep research was a
           | massive jump IMHO. It still isn't perfect but o1 would often
           | give me trash results but o3 deep research actually starts to
           | be useful.
           | 
           | 3.5->3.7 (even with extended thinking) felt like a
           | nothingburger.
        
           | mattlutze wrote:
           | The expectation that one model be top marks for all things
           | is, imo, asking too much.
        
           | tiberriver256 wrote:
           | 3.7 did score higher in coding benchmarks but in practice 3.5
           | is much better at coding. 3.7 ignores instructions and does
           | things you didn't ask it to do.
        
         | ilrwbwrkhv wrote:
         | None of those reports are any good though. Maybe for shallow
         | research, but I haven't found them deep. Can you share what
         | kind of research you have been trying there where it has done a
         | great job of actual deep research.
        
           | Balgair wrote:
           | I'm echoing this sentiment.
           | 
           | Deep Research hasn't really been that good for me. Maybe I'm
           | just using it wrong?
           | 
           | Example: I want the precipitation in mm and monthly high and
           | low temperature in C for the top 250 most populous cities in
           | North America.
           | 
           | To me, this prompt seems like a pretty anodyne and obvious
           | task for Deep Research. It's long, tedious, but mostly coming
           | from well structured data sources (wikipedia) across two
           | languages at most.
           | 
           | But when I put this in to any of the various models, I mostly
           | get back ways to go and find that data myself. Like, I know
           | how to look at Wikipedia, it's that I don't want to comb
           | through 250 pages manually or try to write a script to handle
           | all the HTML boxes. I want the LLM/model to do this days long
           | tedious task for me.
        
             | 85392_school wrote:
             | The funny thing is that if your request only needed the top
             | 100's temperature or the top 33's precipitation, it could
             | just read "List of cities by average temperature" or "List
             | of cities by average precipitation" and that would be it,
             | but the top 250 requires reading 184x more pages.
             | 
             | My perspective on this is that if Deep Research can't do
             | something, you should do it yourself and put the results on
             | the internet. It'll help other humans and AIs trying to do
             | the same task.
        
           | xrdegen wrote:
           | It is because you are just such a genius that already knows
           | everything unlike us stupid people that find these tools
           | amazingly useful and informative.
        
         | greymalik wrote:
         | Out of curiosity - can you give any examples of the programming
         | questions you are using deep research on? I'm having a hard
         | time thinking of how it would be helpful and could use the
         | inspiration.
        
       | WhitneyLand wrote:
       | The integrations feel so rag-ish. It talks, tells you it's going
       | to use a tool, searches, talks about what it found...
       | 
       | Hope one day it will be practical to do nightly finetunes of a
       | model per company with all core corporate data stores.
       | 
       | This could create a seamless native model experience that knows
       | about (almost) everything you're doing.
        
         | pyryt wrote:
         | I would love to do this on my codebase after every commit
        
         | notgiorgi wrote:
         | why is finetuning talked about so much less than RAG? is it not
         | viable at all?
        
           | mring33621 wrote:
           | i'm not an expert in either, but RAG is like dropping some
           | 'useful' info into the prompt context, while fine tuning is
           | more like a performing mix of retraining, appending re-
           | interpretive model layers and/or brain surgery.
           | 
           | I'll leave it to you to guess which one is harder to do.
        
           | disgruntledphd2 wrote:
           | RAG is much cheaper to run.
        
           | computerex wrote:
           | It's significantly harder to get right, it's a very big
           | stepwise increase in technical complexity over in context
           | learning/rag.
           | 
           | There are now some light versions of fine tuning that don't
           | update all the model weights but train a small adapter layer
           | called Lora which is way more viable commercially atm in my
           | opinion.
        
           | ijk wrote:
           | There were initial difficulties in finetuning that made it
           | less appealing early on, and that's snowballed a bit into
           | having more of a focus on RAG.
           | 
           | Some of the issues still exist, of course:
           | 
           | * Finetuning takes time and compute; for one-off queries
           | using in-context learning is vastly more efficient (i.e.,
           | look it up with RAG).
           | 
           | * Early results with finetuning had trouble reliably
           | memorizing information. We've got a much better idea of how
           | to add information to a model now, though it takes more
           | training data.
           | 
           | * Full finetuning is very VRAM intensive; optimizations like
           | LoRA were initially good at transferring style and not
           | content. Today, LoRA content training is viable but requires
           | training code that supports it [1].
           | 
           | * If you need a very specific memorized result and it's
           | costly to get it wrong, good RAG is pretty much always going
           | to be more efficient, since it injects the exact text in
           | context. (Bad RAG makes the problem worse, of course).
           | 
           | * Finetuning requires more technical knowledge: you've got to
           | understand the hyperparameters, avoid underfitting and
           | overfitting, evaluate the results, etc.
           | 
           | * Finetuning requires more data. RAG works with a handful
           | datapoints; finetuning requires at least three orders of
           | magnitude more data.
           | 
           | * Finetuning requires extra effort to avoid forgetting what
           | the model already knows.
           | 
           | * RAG works pretty well when the task that you are trying to
           | perform is well-represented in the training data.
           | 
           | * RAG works when you don't have direct control over the model
           | (i.e., API use).
           | 
           | * You can't finetune most of the closed models.
           | 
           | * Big, general models have outperformed specialized models
           | over the past couple of years; if it doesn't work now, just
           | wait for OpenAI to make their next model better on your
           | particular task.
           | 
           | On the other hand:
           | 
           | * Finetuning generalizes better.
           | 
           | * Finetuning has more influence on token distribution.
           | 
           | * Finetuning is better at learning new tasks that aren't as
           | present in the pretraining data.
           | 
           | * Finetuning can change the style of output (e.g.,
           | instruction training).
           | 
           | * When finetuning pays off, it gives you a bigger moat (no
           | one else has that particular model).
           | 
           | * You control which tasks you are optimizing for, without
           | having to wait for other companies to maybe fix your problems
           | for you.
           | 
           | * You can run a much smaller, faster specialized model
           | because it's been optimized for your tasks.
           | 
           | * Finetuning + RAG outperforms just RAG. Not by a lot,
           | admittedly, but there's some advantages.
           | 
           | Plus the RL Training for reasoning has been demonstrating
           | unexpectedly effective improvements on relatively small
           | amounts of data & compute.
           | 
           | So there's reasons to do both, but the larger investment that
           | finetuning requires means that RAG has generally been more
           | popular. In general, the past couple of years have been won
           | by the bigger models scaling fast, but with finetuning
           | difficulty dropping there is a bit more reason to do your own
           | finetuning.
           | 
           | That said, for the moment the expertise + expense + time of
           | finetuning makes it a tough business proposition if you don't
           | have a very well-defined task to perform, a large dataset to
           | leverage, or other way to get an advantage over the multi-
           | billion dollar investment in the big models.
           | 
           | [1] https://unsloth.ai/blog/contpretraining
        
           | omneity wrote:
           | RAG is infinitely more accessible and cheaper than
           | finetuning. But it is true that finetuning is getting
           | severely overlooked in situations where it would outperform
           | alternatives like RAG.
        
             | riku_iki wrote:
             | > RAG is infinitely more accessible and cheaper than
             | finetuning.
             | 
             | it depends on your data access pattern. If some text goes
             | through LLM input many times, it is more efficient for LLM
             | to be finetuned on it once.
        
               | omneity wrote:
               | This assumes the team deploying the RAG-based solution
               | has equal ability to either engineer a RAG-based system
               | or to finetune an LLM. Those are different skillsets and
               | even selecting which LLM should be finetuned is a complex
               | question, let alone aligning it, deploying it, optimizing
               | inference etc.
               | 
               | The budget question comes into play as well. Even if text
               | is repetitively fed to the LLM, that might happen over a
               | long enough time compared to finetuning which is a sort
               | of capex that it is financially more accessible.
               | 
               | Now bear in mind, I'm a big proponent of finetuning where
               | applicable and I try to raise awareness to the
               | possibilities it opens. But one cannot deny RAG is a lot
               | more accessible to teams which are likely developers / AI
               | engineers compared to ML engineers/researchers.
        
               | riku_iki wrote:
               | > But one cannot deny RAG is a lot more accessible to
               | teams which are likely developers / AI engineers compared
               | to ML engineers/researchers.
               | 
               | It looks like major vendors provide simple API for fine-
               | tuning, so you don't need ML engineers/researchers:
               | https://platform.openai.com/docs/guides/fine-tuning
               | 
               | Setting RAG infra is likely more complicated than that.
        
               | omneity wrote:
               | You are certainly right, managed platforms make
               | finetuning much easier. But managed/closed model
               | finetuning is pretty limited and in fact should be named
               | "distribution modeling" or something.
               | 
               | Results with this method are significantly more limited
               | compared to all the power open-weight finetuning gives
               | you (and the skillset needed in return).
               | 
               | And in either case don't forget alignment and evals.
        
             | retinaros wrote:
             | fine tuning can cost 80$ and a few hours. a good rag doesnt
             | exist
        
       | VSerge wrote:
       | Ongoing demo of integrations with Claude by a bunch of A-list
       | companies: Linear, Stripe, Paypal, Intercom, etc.. It's live now
       | on: https://www.youtube.com/watch?v=njBGqr-BU54
       | 
       | In case the above link doesn't work later on, the page for this
       | demo day is here: https://demo-day.mcp.cloudflare.com/
        
       | mkagenius wrote:
       | are people really doing this mcp thing, yikes. Tomorrow, let me
       | reinvent css as model context design (mcd)
        
         | warkdarrior wrote:
         | Do you have a better solution to give models on-demand access
         | to data sources?
        
           | mkagenius wrote:
           | you mean other than writing an api? no
        
             | cruffle_duffle wrote:
             | And what is the protocol for the interface between the GPU-
             | based LLM and the API? How does the LLM signal to make a
             | tool call? What mechanism does it use?
             | 
             | Because MCP isn't an API it's the protocol that defines how
             | the LLM even calls the API in the first place. Without it,
             | all you've got is a chat interface.
             | 
             | A lot of people misunderstand what is the role of MCP. It's
             | the signaling the LLM uses to reach out of its context
             | window and do things.
        
         | turblety wrote:
         | Is there a reason they went and built some new standard, rather
         | than just using a http api?
        
       | imbnwa wrote:
       | Feel like middle management is gonna go well before engineers do
       | with LLM rate of advancement
        
         | DebtDeflation wrote:
         | That started awhile ago. Google "the great flattening".
        
       | 6stringmerc wrote:
       | Feed Claude the data willingly to learn more about human behavior
       | they can't scrape or obtain otherwise without consent? Hard pass.
       | I'm not telling any AI any more about what it means to be a
       | creative person because training it how to suffer will only
       | further hurt my job prospects. Nice try, no dice.
        
       | n_ary wrote:
       | Is this the beginning of the apps for everything era and finally
       | the SaaS for your LLM begins? Initially we had internet but value
       | came when instead of installed apps, webapps arrived to become
       | SaaS. Now if LLMs can use specific remote MCP which is another
       | SaaS for your LLM, the remote MCP powered service can charge a
       | subscription to do wonderful things and voila! Let the new golden
       | age of SaaS for LLMs begin and the old fad(replace job XYZ with
       | AI) die already.
        
         | throwaway7783 wrote:
         | MCP is yet another interface for an existing SaaS (like UI and
         | APIs), but now magically "agent enabled". And $$$ of course
        
         | clvx wrote:
         | I'm more excited I can run now a custom site, hook an MCP for
         | it, and have all the cool intelligence I had to pay for _SaaS_
         | without having to integrate to them plus govern my data, it 's
         | a massive win. I just see AI assistant coding replicating
         | current SaaS services that I can run internally. If my shop was
         | a specific stack, I could aim to have all my supporting apps in
         | that specific stack using AI assistant coding, simplifying
         | operations, and being able to hook up MCP's to get intelligence
         | from all of them.
         | 
         | Truly, OSS should be more interesting in the next decade for
         | this alone.
        
           | heyheyhouhou wrote:
           | We should all thank the chinese companies for releasing so
           | many incredible open weight models. I hope they keep doing
           | it, I dont want to rely on OpenAI, Anthropic or Google for
           | all my future computer interactions.
        
             | achierius wrote:
             | Don't forget Meta, without them we probably wouldn't have
             | half the publicly available models we do today.
        
         | naravara wrote:
         | On one hand, yes this is very cool for a whole host of personal
         | uses. On the other hand giving any company this level of access
         | to as many different personal data sources as are out there
         | scares the shit out of me.
         | 
         | I'd feel a lot better if we had something resembling a
         | comprehensive data privacy law in the United States because I
         | don't want it to basically be the Wild West for anyone handling
         | whatever personal info doesn't get covered under HIPAA.
        
           | falcor84 wrote:
           | Absolutely agreed, but just wanted to mention that it's
           | essentially the same level of access you would give to
           | Zapier, which is one of their top examples of MCP
           | integrations.
        
         | OtherShrezzing wrote:
         | I'd love a _tip jar_ MCP, where the LLM vendor can
         | automatically tip my website for using its
         | content/feature/service in a query's response. Even if the
         | amount is absolutely minuscule, in aggregate, this might make
         | up for ad revenue losses.
        
           | fredoliveira wrote:
           | Not that exactly, but I just saw this on twitter a few
           | minutes ago from Stripe:
           | https://x.com/jeff_weinstein/status/1918029261430255626
        
         | insin wrote:
         | It's perfect, nobody will have time to care about how many 9s
         | your service has because the nondeterministic failure mode now
         | sitting slap-bang in the middle is their problem!
        
         | donmcronald wrote:
         | > Now if LLMs can use specific remote MCP which is another SaaS
         | for your LLM, the remote MCP powered service can charge a
         | subscription to do wonderful things and voila!
         | 
         | I've always worked under the assumption the best employees make
         | themselves replaceable via well defined processes and high
         | quality documentation. I have such a hard time understanding
         | why there's so much willingness to integrate irreplaceable SaaS
         | solutions into business processes.
         | 
         | I haven't used AI a ton, but everything I've done has focused
         | on owning my own context, config, etc.. How much are people
         | going to be willing to pay if someone else owns 10+ years of
         | their AI context?
         | 
         | Am I crazy or is owning the context massively valuable?
        
           | brumar wrote:
           | Hello fellow context owner. I like my modules with their
           | context.sh at their root level. If crafted with care, magic
           | happens. Reciprocally, when AI derails, it's most often due
           | to bad context management and fixed by improving it.
        
       | drivingmenuts wrote:
       | Is each Claude instance a separate individual or is a shared AI?
       | Because I'm not sure I would want an AI that learned about my
       | confidential business information sharing that with anyone else,
       | without my express permission.
       | 
       | This does not sound like it would be learning general information
       | helpful across an industry, but specific, actionable information.
       | 
       | If not available now, is that something that AI vendors are
       | working toward? If so, what is to keep them from using that
       | knowledge to benefit themselves or others of their choosing,
       | rather than the people they are learning from?
       | 
       | While people understand ethics, morals and legality (and ignore
       | them), that does not seem like something that an AI understands
       | in a way that might give them pause before doing an action.
        
       | zoogeny wrote:
       | I'm curious what kind of research people are doing that takes 45
       | minutes of LLM time. Is this a poke at the McKinsey consultant
       | domain?
       | 
       | Perhaps I am just frivolous with my own time, but I tend to use
       | LLMs in a more iterative way for research. I get partial answers,
       | probe for more information, direct the attention of the LLM away
       | from areas I am familiar and towards areas I am less familiar. I
       | feel if I just let it loose for 45 minutes it would spend too
       | much time on areas I do not find valuable.
       | 
       | This seems more like a play for "replacement" rather than
       | "augmentation". Although, I suppose if I had infinite wealth, I
       | could kick of 10+ research agents each taking 45 minutes and then
       | review their output as it became available, then kick off round
       | 2, etc. That is, I could do my process but instead of
       | interactively I could do it asynchronously.
        
         | throwup238 wrote:
         | That iterative research process is exactly how I use Google
         | Deep Research since it has a 20/day rate limit. Research a
         | problem, notice some off hand assumption or remark the report
         | made, and fire off another research run asking about it. It
         | depends on what you work on; in my use case I often have to do
         | hours of research for 30 minutes of work like when integrating
         | a bunch of different vendors' APIs or pouring over datasheets
         | for EE, so it's worth firing off research and then working on
         | something else for 10-20 minutes (it helps that the Gemini app
         | fires off a push notification when the report is done -
         | Anthropic please do this! Even for requests made from the web
         | app).
         | 
         | As for long research times, one thing I've been using it for is
         | historical research on old books. Gemini DeepResearch was the
         | first one able to properly explain the nuances of identifying a
         | chimeral first edition Origin of Species after taking half an
         | hour and reading 400 sources. It went into all the important
         | details like spelling errors and the properties of chimeral
         | FY2** copies found in various libraries around the world.
        
       | abhisek wrote:
       | Where is Skynet and when is judgement day?
        
       | pton_xd wrote:
       | "To start, you can choose from Integrations for 10 popular
       | services, including Atlassian's Jira and Confluence, Zapier,
       | Cloudflare, Intercom, Asana, Square, Sentry, PayPal, Linear, and
       | Plaid. ... Each integration drastically expands what Claude can
       | do."
       | 
       | Give us an LLM with better reasoning capabilities, please! All
       | this other stuff just feels like a distraction.
        
         | Centigonal wrote:
         | Building integrations is a more predictable way of developing a
         | smaller competitive advantage versus research. I think most of
         | the leading AI companies are adopting a multi-arm strategy of
         | research + product/ecosystem development to balance their
         | risks.
        
         | atonse wrote:
         | I disagree. They can walk and chew gum, do both things at once.
         | And this practical stuff is very important.
         | 
         | I've been using the Atlassian MCP for nearly a month now, and
         | it's completely changed (and eliminated) the feeling of having
         | an overwhelming backlog.
         | 
         | I can have it do things like "find all the tickets related to
         | profile editing and combine them into one epic" where it works
         | perfectly. Or "help me prioritize the 15 tickets assigned to me
         | this sprint" and it'll actually go through and suggest "maybe
         | you can do these two tickets first since they seem smaller,
         | then do this big one" - i haven't hooked it up to my calendar
         | yet.
         | 
         | But I'd love for it to suggest things like "do this one ticket
         | that requires a lot of heads down time on wednesday since you
         | don't have any meetings. I can create a block on your calendar
         | so that nobody will schedule a meeting then"
         | 
         | Those are all superhuman things that can be done with MCP and a
         | smart model.
         | 
         | I've defined rules in cursor that say "when I ask you to mark
         | something ready for test, change the status and assign it to <x
         | person>, and leave a comment summarizing the changes"
         | 
         | If you look at my JIRA comments now, you'd wonder how I had so
         | much time to write such thorough comments. I don't, Cursor and
         | whatever model is doing it for me.
         | 
         | It's been an absolute game changer. MCP is going to be what the
         | App store was to mobile. Yes you can get by without it, but
         | actually hooking into all your daily tool is when this stuff
         | gets insanely valuable in a practical sense.
        
           | OJFord wrote:
           | > If you look at my JIRA comments now, you'd wonder how I had
           | so much time to write such thorough comments. I don't, Cursor
           | and whatever model is doing it for me.
           | 
           | How do your colleagues feel about it?
        
             | warkdarrior wrote:
             | My colleagues' LLM assistants think that my LLM assistant
             | leaves great JIRA comments.
        
               | atonse wrote:
               | haha! Funny enough I do have to tell the LLMs to leave
               | concise comments.
               | 
               | I also don't want to read too many unnecessary words.
        
               | sdesol wrote:
               | Joking aside, I do believe we are moving into a era where
               | we have LLMs write for each other and humans have a
               | dedicated TL;DR. This includes code with a lot of
               | comments or design styles that might seem obvious or
               | stupid but can help another LLM.
        
               | eknkc wrote:
               | Why use JIRA at this point then?
               | 
               | Can't we point an LLM to a sqlite db and tell it to treat
               | it as an issue tracking db and have everyone do the same.
               | 
               | The service (jira) would materialize inside the LLMs
               | then.
               | 
               | Why even use abstractions like tickets etc. Ask LLM what
               | to do.
        
               | zoogeny wrote:
               | JIRA is more than just ticket management for most big
               | orgs. It provides a reporting interface for business with
               | long-term planning capabilities. A lot of the annoying
               | things that devs have to do in JIRA is often there to
               | make those functions more valuable. In other cases it is
               | a compliance thing as well. Some certifications necessary
               | for enterprise sales require audit trails for all code
               | changes, from the bug report to the code commit. JIRA
               | provides the integration and reporting necessary for
               | that.
               | 
               | Unless you can provide the same visibility, long-term
               | planning features and compliance aspects of JIRA on top
               | of you sqlite db, you won't compete with JIRA. But if you
               | do add those things on top of SQLite and LLMs, you
               | probably have a solid business idea. But you'd first need
               | to understand JIRA well enough to know why they are there
               | in the first place.
        
               | falcor84 wrote:
               | Exactly, applying the principle of Chesterton's Fence
               | [0].
               | 
               | [0] https://en.wikipedia.org/w/index.php?title=Wikipedia:
               | FENCE
        
             | atonse wrote:
             | Well I had half a mind to not tell them to see what they'd
             | say, but I also was excited to show everyone so they can
             | also be empowered with it.
             | 
             | One of them said "yeah I was wondering cuz you never write
             | that much" - as a leader, I actually don't set a good
             | example of how to leave quality JIRA comments. And my view
             | with all these things is that I have to lead by example,
             | not by orders.
             | 
             | With the help of these kinds of tools, we can improve the
             | quality of these comments. And I wouldn't expect others to
             | write them manually, more that I wanted to show that
             | everyone's use of JIRA on the team can improve.
        
               | OJFord wrote:
               | Notice they commented on the quantity, not the quality?
               | 
               | I don't think it's good leadership to unleash drivel on
               | an organisation, have people waste time reading and
               | perhaps replying to it, thinking it's something important
               | and thoughtful coming from atonse.
               | 
               | Good thing you told them though, now they can ignore it.
        
               | stefan_ wrote:
               | It sure seems like the next evolution of Jira though.
               | Designed to waste everyones time, picked by "leaders"
               | that don't use it. Why not spam tickets with LLM drivel?
               | They are perfect to pick up on all the inconsistency in
               | the PM insanity driven custom designed workflow - and
               | comment on it tagging a bunch of stray people seen in the
               | ticket history, the universal exit hatch.
        
               | sensanaty wrote:
               | Someone please shoot me if my PM ever gets this idea in
               | his head of using LLM slop to spam tickets with en masse.
               | 
               | There's nothing I hate more than people sending me their
               | AI messages, be it in a ticket or a PR or even on Slack.
               | I'm forced to engage and spend effort on something it
               | took them all of 3 seconds to generate without even
               | proofreading what they're sending me says. The amount of
               | times I've had to ask 11 clarifying questions because
               | their message has 11 contradictions within itself is
               | maddening to the highest degree.
               | 
               | The worst is when I call out one of these numerous
               | contradictions, and the reply is "oh haha, stupid Claude
               | :)", makes my blood boil and at the same time amazes me
               | that someone has so little pride and respect for their
               | fellow humans to do crap like that.
        
           | zoogeny wrote:
           | Honestly, that backlog management idea is probably the first
           | time an MCP actually sounded appealing to me.
           | 
           | I'm not in that world at the moment, but I've been the lead
           | on several projects where the backlog has became a dumping
           | ground of years of neglect. You end up with this tiered
           | backlog thing where one level of backlog gets too big so you
           | create a second tier of backlog for the stuff you are
           | actually going to work on. Pretty soon you end up with
           | duplicates in the second tier backlog for items already in
           | the base level backlog since no one even looks at that old
           | backlog anymore.
           | 
           | I've done a lot of tidy up myself when I inherit this kind of
           | mess, just closing tickets we definitely will never get to,
           | de-duping, adding context when available, grouping into
           | epics, tagging with relevant "tech-debt", "security", "bug",
           | "automation", etc. But when there are 100s of tickets it is a
           | slog. Having an LLM do this makes so much sense.
        
           | organsnyder wrote:
           | I have Claude hooked up to our project management system,
           | GitHub, and my calendar (among other things). It's already
           | proving extremely useful for various project management
           | tasks.
        
       | edaemon wrote:
       | Lots of reported security issues with MCP servers seemed to be
       | mitigated by their local-only setup. These MCP implementations
       | are remotely accessible, do they address security differently?
        
         | paulgb wrote:
         | Largely, yes -- one of the big issues with using other people's
         | random MCP servers is that they are run by default as a system
         | process, even if they only need to speak over an API. Remote
         | MCP mitigates this by not running any untrusted code locally.
         | 
         | What it _doesn't_ seem to yet mitigate is prompt injection
         | attacks, where a tool call description of one tool convinces
         | the model to do something it shouldn't (like send sensitive
         | data to a server owned by the attacker.) I think these concerns
         | are a little bit overblown though; things like pypi and the
         | Chrome Extension store scare me more and it doesn't stop them
         | from mostly working.
        
         | zoogeny wrote:
         | They offhand mention OAuth integration in their discussion of
         | Cloudflare integrated solutions. I can't see how that would be
         | any less secure than any other OAuth protected API offering.
        
       | Nijikokun wrote:
       | context windows are too small and conversely larger windows are
       | not accurate enough its annoying
        
       | indigodaddy wrote:
       | So any chat to Claude will now just auto-activate web search to
       | be included? What if I try to use it just as a search engine
       | exclusively? Also will proxies like Openrouter have access to the
       | web search capabilities?
        
       | gianpaj wrote:
       | > Web search is now globally available to all Claude.ai paid
       | plans.
        
         | surfingdino wrote:
         | I don't know why web search is such a big deal. You can
         | implement it with any LLM that offers an API and function
         | calling.
        
       | ChicagoDave wrote:
       | There is targeted value in integrations, but everything still
       | leads back to larger context windows.
       | 
       | I love MCP (it's way better than plain Claude) but even that runs
       | into context walls.
        
       | davee5 wrote:
       | I'm quite struck by the title of this announcement. The box being
       | drawn around "your world" shows how narrow the AI builder's
       | window into reality tends to be.
       | 
       | > a new way to connect your apps and tools to Claude. We're also
       | expanding... with an advanced mode that searches the web.
       | 
       | The notion of software eating the world, and AI accelerating that
       | trend, always seems to forget that The World is a vast thing, a
       | physical thing, a thing that by its very nature can never be
       | fully consumed by the relentless expansion of our digital
       | experiences. Your worldview /= the world.
       | 
       | The cynic would suggest that the teams that build these tools
       | should go touch grass, but I think that misses the mark. The real
       | indictment is of the sort of thinking that improvements to
       | digital tools [intelligences?] in and of themselves can
       | constitute truly substantial and far reaching changes.
       | 
       | The reach of any digital substrate inherently limited, and this
       | post unintentionally lays that bare. And while I hear
       | accelerationists invoking "robots" as the means for digital
       | agents to expand their potent impact deeper into the real world I
       | suggest this is the retort of those who spend all day in apps,
       | tools, and the web. The impacts and potential of AI is indeed
       | enormous, but some perspective remains warranted and occasional
       | injections of humility and context would probably do these teams
       | some good.
        
         | dang wrote:
         | (Just for context: we've since changed the title above.
         | Corporate press release titles are rarely a good fit for HN and
         | we usually change them.
         | 
         | https://hn.algolia.com/?dateRange=all&page=0&prefix=true&sor...
         | )
        
       | atonse wrote:
       | I think with MCPs and related tech, if Apple just internally went
       | back to the drawing board and integrated the concept of MCPs
       | directly into iOS (via the "Apple Intelligence" umbrella) and
       | seamlessly integrated it into the App Store and apps, they will
       | win the mobile race for this.
       | 
       | Being Apple, they would have to come up with something novel like
       | they did with push (where you have _one_ OS process running that
       | delegates to apps rather than every app trying to handle push
       | themselves) rather than having 20 MCP servers running. But I
       | think if they did this properly, it would be so amazing.
       | 
       | I hope Apple is really re-thinking their absolutely comical start
       | with AI. I hope they regroup and hit it out of the park (like how
       | Google initially stumbled with Bard, but are now hitting it out
       | of the park with Gemini)
        
         | mattlondon wrote:
         | Do you really think Apple can catch up with and then surpass
         | all these SOTA AI labs?
         | 
         | They bet big and got distracted on VR. It was obviously the
         | wrong choice at the time, and even more so now. They're going
         | to have to abandon all that VR crap and pivot hard to AI to try
         | and catch up. I think the more likely case is they _can 't_
         | catch up now and will just have to end up licensing Gemini from
         | Google/Google paying them to use Gemini as the default AI.
        
         | _pdp_ wrote:
         | Apple already has the equivalent of MCP.
         | https://developer.apple.com/documentation/appintents.
        
         | bloomca wrote:
         | That's just App Intents. I don't think they lack data at this
         | point, they just struggle how to use that data on the OS level
        
       | cruffle_duffle wrote:
       | The video demos never really showed the auth "story" but I assume
       | that there is some oauth step to connect Claude with your MCP
       | service, right?
        
       | belter wrote:
       | All these integrations are likely to cause a massive security
       | leak sooner or later.
        
       | OJFord wrote:
       | Where's the permissioning, the data protection?
       | 
       | People will say 'aaah ad company' (me too sometimes) but I'd
       | honestly trust a Google AI tool with this way more. Not just
       | because it already has access to my Google Workspace obviously,
       | but just because it's a huge established tech firm with decades
       | of experience in trying not to lose (or have taken) user data.
       | 
       | Even if they get the permissions right and it can only read my
       | stuff if I'm just asking it to 'research', now Anthropic has all
       | that and a target on their backs. And I don't even know what 'all
       | that' is, whatever it explored deeming it maybe useful.
       | 
       | Maybe I'm just transitioning into old guy not savvy with latest
       | tech, but I just can't trust any of this 'go off and do whatever
       | seems correct or helpful with access to my filesystem/Google
       | account/codebase/terminal' stuff.
       | 
       | I like chat-only (well, +web) interactions where I control the
       | input and taking the output, but even that is not an experience
       | that gives me any confidence in giving uncontrolled access to
       | stuff and it always doing something correct and reasonable. It's
       | often confidently incorrect too! I wouldn't give an intern free
       | reign in my shell either!
        
         | joshwarwick15 wrote:
         | Permissoning: OAuth Data protection: Local LLMs
        
       | weinzierl wrote:
       | If you do not enable "Web Search" are you guaranteed it does not
       | access the web anyway?
       | 
       | Sometimes I want a pure model answer and I used to use Claude for
       | that. For research tasks I preferred ChatGPT, but I found that
       | you cannot reliably deny it web access. If you are asking it a
       | research question, I am pretty sure it uses web search, even when
       | _" Search"_ and _" Deep Research"_ are off.
        
       | rafram wrote:
       | Oh no, remote MCP servers. Security was nice while it lasted!
        
         | rvz wrote:
         | This is a fantastic time to get into the security space and
         | trick all these LLMs into leaking sensitive data and make a lot
         | of money out of that.
         | 
         | MCP is a flawed spec and quite frankly a scam.
        
       | rvz wrote:
       | Can't wait for the first security incident relating to the
       | fundamentally flawed MCP specification which an LLM will
       | inadvertently be tricked to leak sensitive data.
       | 
       | Increasing the amount of "connections" to the LLM increases the
       | risk in a leak and it gives your more rope to hang yourself with
       | when at least one connection becomes problematic.
       | 
       | Now is a _great_ time to be a LLM security consultant.
        
       | dimgl wrote:
       | This is great, but can you fix Claude 3.7 and make it more like
       | 3.5? I'm seriously disappointed with 3.7. It seems to be
       | performing significantly worse for me on all tasks.
       | 
       | Even my wife, who normally used Claude to create interesting
       | recipes to bake cookies, has noticed a huge downgrade in 3.7.
        
       | bjornsing wrote:
       | The strategic business dynamic here is very interesting. We used
       | to have "GPT-wrapper SaaS". I guess what we're about to see now
       | is the opposite: "SaaS/MCP-wrapper GPTs".
        
       | hdjjhhvvhga wrote:
       | The people who connect a LLM to their Paypal and CLoudflare
       | accounts perfectly deserve the consequences, both positive and
       | negative.
        
       | conroy wrote:
       | Remote MCP servers are still in a strange space. Anthropic
       | updated the MCP spec about a month ago with a new Streamable HTTP
       | transport, but it doesn't appear that Claude supports that
       | transport yet.
       | 
       | When I hooked up our remote MCP server, Claude sends a GET
       | request to the endpoint. According to the spec, clients that want
       | to support both transports should first attempt to POST an
       | InitializeRequest to the server URL. If that returns a 4xx, it
       | should then assume the SSE integration.
        
       | gonzan wrote:
       | So there are going to be companies built on just an MCP server I
       | guess, wonder what the first big one will be, just a matter of
       | time I think
        
       | worldsayshi wrote:
       | Is it just me that would like to see more of confirmations before
       | making opaque changes to remote systems?
       | 
       | I might not dare to add an integration if it can potentially add
       | a bunch of stuff to the backing systems without my approval.
       | Confirmations and review should be part of the protocol.
        
         | sepositus wrote:
         | Yeah, this was my first thought. I was watching the video of it
         | creating all of these Jira tickets just thinking in my head: "I
         | hope it just did all that correctly." I think the level of
         | patience with my team would be very low if I started running an
         | LLM that accidentally deleted a bunch of really important
         | tickets.
        
           | worldsayshi wrote:
           | Yeah. Feels like it's breaking some fundamental UX principle.
           | If an action is going to make any significant change make
           | sure that it fulfills _at least_ one of these:
           | 
           | 1. Can be rollbacked/undone
           | 
           | 2. Clearly states exactly what it's going to do in a
           | reviewable way
           | 
           | If those aren't fulfilled you're going to end up with users
           | that are afraid of using your app.
        
       | todsacerdoti wrote:
       | Check out 2500+ MCP servers at https://mcp.pipedream.com
        
       | the_clarence wrote:
       | Been playing with MCP in the last few days and it's basically a
       | more streamlined way to define tools/function calls.
       | 
       | That + the agent SDK of openAI makes creating agentic flow so
       | easy.
       | 
       | On the other hand you're kinda forced to run these tools / MCP
       | servers in their own process which makes no sense to me.
        
         | nilslice wrote:
         | you might like mcp.run, a tool management platform we're
         | working on... totally agree running a process per tool, with
         | all kinds of permissions is nonsensical - and the move to
         | "remote MCP" is a good one!
         | 
         | but, we're taking it a step (or two) further, enabling you to
         | dynamically build up a MCP server from other servers managed in
         | your account with us.
         | 
         | try it out, or let me get you a demo! this goes for any casual
         | comment readers too ;)
         | 
         | https://cal.com/team/dylibso/mcp.run-demo
        
       | kostas_f wrote:
       | Anthropic's strategy seems to go towards "AI as universal glue".
       | They want to tie Claude into all the tools teams already live in
       | (Jira, Confluence, Zapier, etc.). That's a smart move for
       | enterprise adoption, but it also feels like they're compensating
       | for a plateau in core model capabilities.
       | 
       | Both OpenAI and Google continue to push the frontier on
       | reasoning, multimodality, and efficiency whereas Claude's recent
       | releases have felt more iterative. I'd love to see Anthropic push
       | into model research again.
        
         | bl4ckneon wrote:
         | I am sure they are already doing that. To think that an AI
         | researcher is doing essentially api integration work is a bit
         | silly. Multiple efforts can happen at the same time
        
         | freewizard wrote:
         | I would expect Slack do this. Maybe Slack and Claude should
         | merge one day, given MS and Google has their own core models.
        
           | tjsk wrote:
           | Slack is owned by Salesforce which is doing its own
           | Agentforce stuff
        
       | deanc wrote:
       | I find it absolutely astonishing that Atlassian hasn't yet
       | provided an LLM for confluence instances and instead a third
       | party is required. The sheer scale of documentation and
       | information I've seen at some organisations I've worked with is
       | overwhelming. This would be a killer feature. I do not recommend
       | confluence to my clients simply because the search is so
       | appalling .
       | 
       | Keyword search is such a naive approach to information discovery
       | and information sharing - and renders confluence in big orgs
       | useless. Being able to discuss and ask questions is a more
       | natural way of unpacking problems.
        
       | artur_makly wrote:
       | on their announcement page they wrote " In addition to these
       | updates, we're making WEB SEARCH available globally for all
       | Claude users on paid plans."
       | 
       | So I tested a basic prompt:
       | 
       | 1. go to : SOME URL
       | 
       | 2. copy all the content found VERBATIM, and show me all that
       | content as markdown here.
       | 
       | Result : it FAILED miserably with a few basic html pages - it
       | simply is not loading all the page content in its internal
       | browser.
       | 
       | What worked well: - Gemini 2.5Pro (Experimental) - GPT 4o-mini //
       | - Gemini 2.0 Flash ( not verbatim but summarized )
        
       | meander_water wrote:
       | Looks like this is possible due to the relatively recent addition
       | of OAuth2.1 to the MCP spec [0] to allow secure comms to remote
       | servers.
       | 
       | However, there's a major concern that server hosters are on the
       | hook to implement authorization. Ongoing discussion here [1].
       | 
       | [0] https://modelcontextprotocol.io/specification/2025-03-26
       | 
       | [1]
       | https://github.com/modelcontextprotocol/modelcontextprotocol...
        
         | dmarble wrote:
         | Direct link to the spec page on authorization:
         | https://modelcontextprotocol.io/specification/2025-03-26/bas...
         | 
         | Source:
         | https://github.com/modelcontextprotocol/modelcontextprotocol...
        
       ___________________________________________________________________
       (page generated 2025-05-01 23:00 UTC)