[HN Gopher] Claude Integrations
___________________________________________________________________
Claude Integrations
Author : bryanh
Score : 363 points
Date : 2025-05-01 16:02 UTC (6 hours ago)
(HTM) web link (www.anthropic.com)
(TXT) w3m dump (www.anthropic.com)
| behnamoh wrote:
| That "Allow for this chat" pop up should be optional. It ruins
| the entire MCP experience. Maybe make it automatic for non-
| mutating MCP tools.
| pcwelder wrote:
| In the latest update they've replaced "Allow for this chat"
| with "Always Allow".
| avandekleut wrote:
| MCP also has support for "hints" which note whether an action
| is destructive.
| arjie wrote:
| The cookie banner type constant Allow Allow Allow makes their
| client unusable. Are there any alternative desktop MCP clients?
| rahimnathwani wrote:
| https://github.com/patruff/ollama-mcp-bridge
| jarbus wrote:
| Anyone have any data on how effective models are at leveraging
| MCP? Hard to tell if these things are a buggy mess or a game
| changer
| striking wrote:
| Claude Code is doing pretty well in my experience :) I've built
| a tool in our CI environment that reads Jira tickets, files
| GitHub PRs, etc. automatically. Great for one-shotting bugs,
| and it's only getting better.
| xnx wrote:
| Integrations are nice, but the superpower is having an AI smart
| enough to operate a computer/keyboard/mouse so it can do anything
| without the cooperation/consent of the service being used.
|
| Lots of people are making moves in this space (including
| Anthropic), but nothing has broken through to the mainstream.
| WillAdams wrote:
| Or even access multiple files?
|
| Why can't one set up a prompt, test it against a file, then
| once it is working, apply it to each file in a folder in a
| batch process which then provides the output as a single
| collective file?
| xnx wrote:
| You can probably achieve what you want with
| https://github.com/simonw/llm and a little bit of command
| line.
|
| Not sure what OS you're on, but in Windows it might look like
| this:
|
| FOR %%F IN (*.txt) DO (TYPE "%%F" | llm -s "execute this
| prompt" >> "output.txt)
| WillAdams wrote:
| I want to work with PDFs (or JPEGs), but that should be a
| start, I hope.
| xnx wrote:
| llm supports attachments too
|
| FOR %%F IN (*.pdf) DO (llm -a %%F -s "execute this
| prompt" >> output.txt)
| TheOtherHobbes wrote:
| I've just done something similar with Claude Desktop and its
| built-in MCP servers.
|
| The limits are still buggy responses - Claude often gets
| stuck in a useless loop if you overfeed it with files - and
| lack of consistency. Sometimes hand-holding is needed to get
| the result you want. And it's slow.
|
| But when it works it's amazing. If the issues and limitations
| were solved, this would be a complete game changer.
|
| We're starting to get somewhat self-generating automation and
| complex agenting, with access to all of the world's public
| APIs and search resources, controlled by natural language.
|
| I can't see the edges of what could be possible with this.
| It's limited and clunky for now, but the potential is
| astonishing - at least as radical an invention as the web
| was.
| WillAdams wrote:
| I would be fine with storing the output from one run,
| spooling up a new one, then concatenating after multiple
| successive runs.
| arnaudsm wrote:
| I get often ratelimited or blocked from websites because I
| browse them too fast with my keyboard and mouse. The AI would
| be slowed down significantly.
|
| LLM-desktop interfaces make great demos, but they are too slow
| to be usable in practice.
| xnx wrote:
| Good point. Probably makes sense to think of it as an
| assistant you assign a job to and get results back later.
| boh wrote:
| I think all the retail LLM's are working to broaden the available
| context, but in most practical use-cases it's having the ability
| to minimize and filter the context that would produce the most
| value. Even a single PDF with too many similar datapoints leads
| to confusion in output. They need to switch gears from the high
| growth, "every thing is possible and available" narrative, to one
| that narrows the scope. The "hallucination" gap is widening with
| more context, not shrinking.
| mikepurvis wrote:
| That's a tough pill to swallow when your company valuation is a
| $62B based on the premise that you're building a bot capable of
| transcendent thought, ready to disrupt every vertical in
| existence.
|
| Tackling individual use-cases is supposed to be something for
| third party "ecosystem" companies to go after, not the
| mothership itself.
| Etheryte wrote:
| This has been my experience as well. The moment you turn
| internet access on, Kagi Assistant starts outputting garbage.
| Turn it off and you're all good.
| fhd2 wrote:
| Definitely my experience. I manage context like a hawk, be it
| with Claude-as-Google-replacement or LLM integrations into
| systems. Too little and the results are off. Too much and the
| results are off.
|
| Not sure what Anthropic and co can do about that, but
| integrations feel like a step in the wrong direction. Whenever
| I've tried tool use, it was orders of magnitude more expensive
| and generally inferior to a simple model call with curated
| context from SerpApi and such.
| loufe wrote:
| Couldn't agree more. I wish all major model makers would
| build tools into their proprietary UIs to "summarize contents
| and start a new conversation with that base". My biggest
| slowdown with working with LLMs while coding is moving my
| conversation to a new thread because context limit is hit
| (Claude) or the coherent-thought threshold is exceeded
| (Gemini).
| fhd2 wrote:
| I never use any web interfaces, just hooked up gptel (an
| Emacs package) to Claude's API and a few others I regularly
| use, and I just have a buffer with the entire conversation.
| I can modify it as needed, spawn a fresh one quickly etc.
| There's also features to add files and individual snippets,
| but I usually manage it all in a single buffer. It's a
| powerful text editor, so efficient text editing is a given.
|
| I bet there are better / less arcane tools, but I think
| powerful and fast mechanisms for managing context are key
| and for me, that's really just powerful text editing
| features.
| medhir wrote:
| you hit the nail on the head. my experience with prompting LLMs
| is that providing extra context that isn't explicitly needed
| leads to "distracted" outputs
| ketzo wrote:
| I mean, to be honest, they gotta do both to achieve what
| they're aiming for.
|
| A truly useful AI assistant has context on my last 100,000
| emails - and also recalls the details of each individual one
| perfectly, without confusion or hallucination.
|
| Obviously I'm setting a high bar here; I guess what I'm saying
| is "yes, and"
| energy123 wrote:
| There's a niche for the kitchen sink approach. It's a type of
| search engine.
|
| Throw in all context --> ask it what is important for problem
| XYZ --> curate what it tells you, and feed that to another
| model to actually solve XYZ
| roordan wrote:
| This is my concern as well. How successful is it in selecting
| the correct tool out of hundreds or thousands?
|
| Different to what this integration is pushing, the LLMs usage
| in production based products where high accuracy is a
| requirement (99%), you have to give a very limited tool set to
| get any degree of success.
| bredren wrote:
| Had been planning a custom mcp for our orgs' jira.
|
| I'm a bit skeptical that it's gonna work out of the box because
| of the amount of custom fields that seem to be involved to make
| successful API requests in our case.
|
| But I would welcome, not having to solve this problem. Jira's
| interface is among the worst of all the ticket tracking
| applications I have encountered.
|
| But, I have found using a LM conversation paired within enough
| context about what is involved for successful POSTs against the
| API allow me to create update and relate issues via curl.
|
| It's begging for a chat based LLM solution like this. I'd just
| prefer the underlying model not be locked to a vendor.
|
| Atlassian should be solving this for its customers.
| rubenfiszel wrote:
| I feel dumb but how do you actually add Zapier or Confluence or
| custom MCP on the web version of claude? I only see it for
| Drive/Gmail/Github. Is it zoned/slow release?
| throwaway314155 wrote:
| edit: <Incorrect>im fairly certain these additions only work on
| Claude Desktop?</Incorrect>
|
| That or they're pulling an OpenAI and launching a feature that
| isn't actually fully live.
| rubenfiszel wrote:
| But the videos show claude web
| 85392_school wrote:
| This part seems relevant:
|
| > in beta on the Max, Team, and Enterprise plans, and will soon
| be available on Pro
| joshwarwick15 wrote:
| Created a list of remote MCP servers here so people can keep
| track of new releases - https://github.com/jaw9c/awesome-remote-
| mcp-servers
| zhyder wrote:
| Is there any way to access this via the API, after perhaps some
| oauth from the Anthropic user account?
| throwup238 wrote:
| The leap frogging at this point is getting insane (in a good way,
| I guess?). The amount of time each state of the art feature gets
| before it's supplanted is a few weeks at this point.
|
| LLMs were always a fun novelty for me until OpenAI DeepResearch
| which started to actually come up with useful results on more
| complex programming questions (where I needed to write all the
| code by hand but had to pull together lots of different libraries
| and APIs), but it was limited to 10/month for the cheaper plan.
| Then Google Deep Research upgraded to 2.5 Pro and with paid usage
| limits of 20/day, which allowed me to just throw everything at it
| to the point where I'm still working through reports that are a
| week or more old. Oh and it searched up to 400 sources at a time,
| significantly more than OpenAI which made it quite useful in
| historical research like identifying first edition copies of
| books.
|
| Now Claude is releasing the same research feature with
| integrations (excited to check out the Cloudflare MCP auth
| solution and hoping Val.town gets something similar), and a run
| time of up to 45 minutes. The pace of change was overwhelming
| half a year ago, now it's just getting ridiculous.
| user_7832 wrote:
| I agree with your overall message - rapid growth appears to
| encourage competition and forces companies to put their best
| foot forward.
|
| However, unfortunately, I cannot shower much praise on Claude
| 3.7. And if you (or anyone) asks why - 3.7 seems much better
| than 3.5, surely? - Then I'm moderately sure that you use
| Claude much more for coding than for any kind of conversation.
| In my opinion, even 3.5 _Haiku_ (which is available for free
| during high loads) is better than 3.7 Sonnet.
|
| Here's a simple test. Try asking 3.7 to intuitively explain
| anything technical - say, mass dominated vs spring dominated
| oscillations. I'm a mechanical engineer who studied this stuff
| and _I_ could not understand 3.7's analogies.
|
| I understand that coders are the largest single group of
| Claude's users, but Claude went from being my most used app to
| being used only after both chatgpt and Gemini, something that I
| absolutely regret.
| airstrike wrote:
| I too like 3.5 better than 3.7 and I use it pretty often.
| It's like 3.7 is better in 2 metrics but worse in 10
| different ones
| joshstrange wrote:
| I use Claude mostly for coding/technical things and something
| about 3.7 does not feel like an upgrade. I haven't gone back
| to 3.5 (mostly started using Gemini Pro 2.5 instead).
|
| I haven't been able to use Claude research yet (it's not
| rolled out to the Pro tier) but o1 -> o3 deep research was a
| massive jump IMHO. It still isn't perfect but o1 would often
| give me trash results but o3 deep research actually starts to
| be useful.
|
| 3.5->3.7 (even with extended thinking) felt like a
| nothingburger.
| mattlutze wrote:
| The expectation that one model be top marks for all things
| is, imo, asking too much.
| tiberriver256 wrote:
| 3.7 did score higher in coding benchmarks but in practice 3.5
| is much better at coding. 3.7 ignores instructions and does
| things you didn't ask it to do.
| ilrwbwrkhv wrote:
| None of those reports are any good though. Maybe for shallow
| research, but I haven't found them deep. Can you share what
| kind of research you have been trying there where it has done a
| great job of actual deep research.
| Balgair wrote:
| I'm echoing this sentiment.
|
| Deep Research hasn't really been that good for me. Maybe I'm
| just using it wrong?
|
| Example: I want the precipitation in mm and monthly high and
| low temperature in C for the top 250 most populous cities in
| North America.
|
| To me, this prompt seems like a pretty anodyne and obvious
| task for Deep Research. It's long, tedious, but mostly coming
| from well structured data sources (wikipedia) across two
| languages at most.
|
| But when I put this in to any of the various models, I mostly
| get back ways to go and find that data myself. Like, I know
| how to look at Wikipedia, it's that I don't want to comb
| through 250 pages manually or try to write a script to handle
| all the HTML boxes. I want the LLM/model to do this days long
| tedious task for me.
| 85392_school wrote:
| The funny thing is that if your request only needed the top
| 100's temperature or the top 33's precipitation, it could
| just read "List of cities by average temperature" or "List
| of cities by average precipitation" and that would be it,
| but the top 250 requires reading 184x more pages.
|
| My perspective on this is that if Deep Research can't do
| something, you should do it yourself and put the results on
| the internet. It'll help other humans and AIs trying to do
| the same task.
| xrdegen wrote:
| It is because you are just such a genius that already knows
| everything unlike us stupid people that find these tools
| amazingly useful and informative.
| greymalik wrote:
| Out of curiosity - can you give any examples of the programming
| questions you are using deep research on? I'm having a hard
| time thinking of how it would be helpful and could use the
| inspiration.
| WhitneyLand wrote:
| The integrations feel so rag-ish. It talks, tells you it's going
| to use a tool, searches, talks about what it found...
|
| Hope one day it will be practical to do nightly finetunes of a
| model per company with all core corporate data stores.
|
| This could create a seamless native model experience that knows
| about (almost) everything you're doing.
| pyryt wrote:
| I would love to do this on my codebase after every commit
| notgiorgi wrote:
| why is finetuning talked about so much less than RAG? is it not
| viable at all?
| mring33621 wrote:
| i'm not an expert in either, but RAG is like dropping some
| 'useful' info into the prompt context, while fine tuning is
| more like a performing mix of retraining, appending re-
| interpretive model layers and/or brain surgery.
|
| I'll leave it to you to guess which one is harder to do.
| disgruntledphd2 wrote:
| RAG is much cheaper to run.
| computerex wrote:
| It's significantly harder to get right, it's a very big
| stepwise increase in technical complexity over in context
| learning/rag.
|
| There are now some light versions of fine tuning that don't
| update all the model weights but train a small adapter layer
| called Lora which is way more viable commercially atm in my
| opinion.
| ijk wrote:
| There were initial difficulties in finetuning that made it
| less appealing early on, and that's snowballed a bit into
| having more of a focus on RAG.
|
| Some of the issues still exist, of course:
|
| * Finetuning takes time and compute; for one-off queries
| using in-context learning is vastly more efficient (i.e.,
| look it up with RAG).
|
| * Early results with finetuning had trouble reliably
| memorizing information. We've got a much better idea of how
| to add information to a model now, though it takes more
| training data.
|
| * Full finetuning is very VRAM intensive; optimizations like
| LoRA were initially good at transferring style and not
| content. Today, LoRA content training is viable but requires
| training code that supports it [1].
|
| * If you need a very specific memorized result and it's
| costly to get it wrong, good RAG is pretty much always going
| to be more efficient, since it injects the exact text in
| context. (Bad RAG makes the problem worse, of course).
|
| * Finetuning requires more technical knowledge: you've got to
| understand the hyperparameters, avoid underfitting and
| overfitting, evaluate the results, etc.
|
| * Finetuning requires more data. RAG works with a handful
| datapoints; finetuning requires at least three orders of
| magnitude more data.
|
| * Finetuning requires extra effort to avoid forgetting what
| the model already knows.
|
| * RAG works pretty well when the task that you are trying to
| perform is well-represented in the training data.
|
| * RAG works when you don't have direct control over the model
| (i.e., API use).
|
| * You can't finetune most of the closed models.
|
| * Big, general models have outperformed specialized models
| over the past couple of years; if it doesn't work now, just
| wait for OpenAI to make their next model better on your
| particular task.
|
| On the other hand:
|
| * Finetuning generalizes better.
|
| * Finetuning has more influence on token distribution.
|
| * Finetuning is better at learning new tasks that aren't as
| present in the pretraining data.
|
| * Finetuning can change the style of output (e.g.,
| instruction training).
|
| * When finetuning pays off, it gives you a bigger moat (no
| one else has that particular model).
|
| * You control which tasks you are optimizing for, without
| having to wait for other companies to maybe fix your problems
| for you.
|
| * You can run a much smaller, faster specialized model
| because it's been optimized for your tasks.
|
| * Finetuning + RAG outperforms just RAG. Not by a lot,
| admittedly, but there's some advantages.
|
| Plus the RL Training for reasoning has been demonstrating
| unexpectedly effective improvements on relatively small
| amounts of data & compute.
|
| So there's reasons to do both, but the larger investment that
| finetuning requires means that RAG has generally been more
| popular. In general, the past couple of years have been won
| by the bigger models scaling fast, but with finetuning
| difficulty dropping there is a bit more reason to do your own
| finetuning.
|
| That said, for the moment the expertise + expense + time of
| finetuning makes it a tough business proposition if you don't
| have a very well-defined task to perform, a large dataset to
| leverage, or other way to get an advantage over the multi-
| billion dollar investment in the big models.
|
| [1] https://unsloth.ai/blog/contpretraining
| omneity wrote:
| RAG is infinitely more accessible and cheaper than
| finetuning. But it is true that finetuning is getting
| severely overlooked in situations where it would outperform
| alternatives like RAG.
| riku_iki wrote:
| > RAG is infinitely more accessible and cheaper than
| finetuning.
|
| it depends on your data access pattern. If some text goes
| through LLM input many times, it is more efficient for LLM
| to be finetuned on it once.
| omneity wrote:
| This assumes the team deploying the RAG-based solution
| has equal ability to either engineer a RAG-based system
| or to finetune an LLM. Those are different skillsets and
| even selecting which LLM should be finetuned is a complex
| question, let alone aligning it, deploying it, optimizing
| inference etc.
|
| The budget question comes into play as well. Even if text
| is repetitively fed to the LLM, that might happen over a
| long enough time compared to finetuning which is a sort
| of capex that it is financially more accessible.
|
| Now bear in mind, I'm a big proponent of finetuning where
| applicable and I try to raise awareness to the
| possibilities it opens. But one cannot deny RAG is a lot
| more accessible to teams which are likely developers / AI
| engineers compared to ML engineers/researchers.
| riku_iki wrote:
| > But one cannot deny RAG is a lot more accessible to
| teams which are likely developers / AI engineers compared
| to ML engineers/researchers.
|
| It looks like major vendors provide simple API for fine-
| tuning, so you don't need ML engineers/researchers:
| https://platform.openai.com/docs/guides/fine-tuning
|
| Setting RAG infra is likely more complicated than that.
| omneity wrote:
| You are certainly right, managed platforms make
| finetuning much easier. But managed/closed model
| finetuning is pretty limited and in fact should be named
| "distribution modeling" or something.
|
| Results with this method are significantly more limited
| compared to all the power open-weight finetuning gives
| you (and the skillset needed in return).
|
| And in either case don't forget alignment and evals.
| retinaros wrote:
| fine tuning can cost 80$ and a few hours. a good rag doesnt
| exist
| VSerge wrote:
| Ongoing demo of integrations with Claude by a bunch of A-list
| companies: Linear, Stripe, Paypal, Intercom, etc.. It's live now
| on: https://www.youtube.com/watch?v=njBGqr-BU54
|
| In case the above link doesn't work later on, the page for this
| demo day is here: https://demo-day.mcp.cloudflare.com/
| mkagenius wrote:
| are people really doing this mcp thing, yikes. Tomorrow, let me
| reinvent css as model context design (mcd)
| warkdarrior wrote:
| Do you have a better solution to give models on-demand access
| to data sources?
| mkagenius wrote:
| you mean other than writing an api? no
| cruffle_duffle wrote:
| And what is the protocol for the interface between the GPU-
| based LLM and the API? How does the LLM signal to make a
| tool call? What mechanism does it use?
|
| Because MCP isn't an API it's the protocol that defines how
| the LLM even calls the API in the first place. Without it,
| all you've got is a chat interface.
|
| A lot of people misunderstand what is the role of MCP. It's
| the signaling the LLM uses to reach out of its context
| window and do things.
| turblety wrote:
| Is there a reason they went and built some new standard, rather
| than just using a http api?
| imbnwa wrote:
| Feel like middle management is gonna go well before engineers do
| with LLM rate of advancement
| DebtDeflation wrote:
| That started awhile ago. Google "the great flattening".
| 6stringmerc wrote:
| Feed Claude the data willingly to learn more about human behavior
| they can't scrape or obtain otherwise without consent? Hard pass.
| I'm not telling any AI any more about what it means to be a
| creative person because training it how to suffer will only
| further hurt my job prospects. Nice try, no dice.
| n_ary wrote:
| Is this the beginning of the apps for everything era and finally
| the SaaS for your LLM begins? Initially we had internet but value
| came when instead of installed apps, webapps arrived to become
| SaaS. Now if LLMs can use specific remote MCP which is another
| SaaS for your LLM, the remote MCP powered service can charge a
| subscription to do wonderful things and voila! Let the new golden
| age of SaaS for LLMs begin and the old fad(replace job XYZ with
| AI) die already.
| throwaway7783 wrote:
| MCP is yet another interface for an existing SaaS (like UI and
| APIs), but now magically "agent enabled". And $$$ of course
| clvx wrote:
| I'm more excited I can run now a custom site, hook an MCP for
| it, and have all the cool intelligence I had to pay for _SaaS_
| without having to integrate to them plus govern my data, it 's
| a massive win. I just see AI assistant coding replicating
| current SaaS services that I can run internally. If my shop was
| a specific stack, I could aim to have all my supporting apps in
| that specific stack using AI assistant coding, simplifying
| operations, and being able to hook up MCP's to get intelligence
| from all of them.
|
| Truly, OSS should be more interesting in the next decade for
| this alone.
| heyheyhouhou wrote:
| We should all thank the chinese companies for releasing so
| many incredible open weight models. I hope they keep doing
| it, I dont want to rely on OpenAI, Anthropic or Google for
| all my future computer interactions.
| achierius wrote:
| Don't forget Meta, without them we probably wouldn't have
| half the publicly available models we do today.
| naravara wrote:
| On one hand, yes this is very cool for a whole host of personal
| uses. On the other hand giving any company this level of access
| to as many different personal data sources as are out there
| scares the shit out of me.
|
| I'd feel a lot better if we had something resembling a
| comprehensive data privacy law in the United States because I
| don't want it to basically be the Wild West for anyone handling
| whatever personal info doesn't get covered under HIPAA.
| falcor84 wrote:
| Absolutely agreed, but just wanted to mention that it's
| essentially the same level of access you would give to
| Zapier, which is one of their top examples of MCP
| integrations.
| OtherShrezzing wrote:
| I'd love a _tip jar_ MCP, where the LLM vendor can
| automatically tip my website for using its
| content/feature/service in a query's response. Even if the
| amount is absolutely minuscule, in aggregate, this might make
| up for ad revenue losses.
| fredoliveira wrote:
| Not that exactly, but I just saw this on twitter a few
| minutes ago from Stripe:
| https://x.com/jeff_weinstein/status/1918029261430255626
| insin wrote:
| It's perfect, nobody will have time to care about how many 9s
| your service has because the nondeterministic failure mode now
| sitting slap-bang in the middle is their problem!
| donmcronald wrote:
| > Now if LLMs can use specific remote MCP which is another SaaS
| for your LLM, the remote MCP powered service can charge a
| subscription to do wonderful things and voila!
|
| I've always worked under the assumption the best employees make
| themselves replaceable via well defined processes and high
| quality documentation. I have such a hard time understanding
| why there's so much willingness to integrate irreplaceable SaaS
| solutions into business processes.
|
| I haven't used AI a ton, but everything I've done has focused
| on owning my own context, config, etc.. How much are people
| going to be willing to pay if someone else owns 10+ years of
| their AI context?
|
| Am I crazy or is owning the context massively valuable?
| brumar wrote:
| Hello fellow context owner. I like my modules with their
| context.sh at their root level. If crafted with care, magic
| happens. Reciprocally, when AI derails, it's most often due
| to bad context management and fixed by improving it.
| drivingmenuts wrote:
| Is each Claude instance a separate individual or is a shared AI?
| Because I'm not sure I would want an AI that learned about my
| confidential business information sharing that with anyone else,
| without my express permission.
|
| This does not sound like it would be learning general information
| helpful across an industry, but specific, actionable information.
|
| If not available now, is that something that AI vendors are
| working toward? If so, what is to keep them from using that
| knowledge to benefit themselves or others of their choosing,
| rather than the people they are learning from?
|
| While people understand ethics, morals and legality (and ignore
| them), that does not seem like something that an AI understands
| in a way that might give them pause before doing an action.
| zoogeny wrote:
| I'm curious what kind of research people are doing that takes 45
| minutes of LLM time. Is this a poke at the McKinsey consultant
| domain?
|
| Perhaps I am just frivolous with my own time, but I tend to use
| LLMs in a more iterative way for research. I get partial answers,
| probe for more information, direct the attention of the LLM away
| from areas I am familiar and towards areas I am less familiar. I
| feel if I just let it loose for 45 minutes it would spend too
| much time on areas I do not find valuable.
|
| This seems more like a play for "replacement" rather than
| "augmentation". Although, I suppose if I had infinite wealth, I
| could kick of 10+ research agents each taking 45 minutes and then
| review their output as it became available, then kick off round
| 2, etc. That is, I could do my process but instead of
| interactively I could do it asynchronously.
| throwup238 wrote:
| That iterative research process is exactly how I use Google
| Deep Research since it has a 20/day rate limit. Research a
| problem, notice some off hand assumption or remark the report
| made, and fire off another research run asking about it. It
| depends on what you work on; in my use case I often have to do
| hours of research for 30 minutes of work like when integrating
| a bunch of different vendors' APIs or pouring over datasheets
| for EE, so it's worth firing off research and then working on
| something else for 10-20 minutes (it helps that the Gemini app
| fires off a push notification when the report is done -
| Anthropic please do this! Even for requests made from the web
| app).
|
| As for long research times, one thing I've been using it for is
| historical research on old books. Gemini DeepResearch was the
| first one able to properly explain the nuances of identifying a
| chimeral first edition Origin of Species after taking half an
| hour and reading 400 sources. It went into all the important
| details like spelling errors and the properties of chimeral
| FY2** copies found in various libraries around the world.
| abhisek wrote:
| Where is Skynet and when is judgement day?
| pton_xd wrote:
| "To start, you can choose from Integrations for 10 popular
| services, including Atlassian's Jira and Confluence, Zapier,
| Cloudflare, Intercom, Asana, Square, Sentry, PayPal, Linear, and
| Plaid. ... Each integration drastically expands what Claude can
| do."
|
| Give us an LLM with better reasoning capabilities, please! All
| this other stuff just feels like a distraction.
| Centigonal wrote:
| Building integrations is a more predictable way of developing a
| smaller competitive advantage versus research. I think most of
| the leading AI companies are adopting a multi-arm strategy of
| research + product/ecosystem development to balance their
| risks.
| atonse wrote:
| I disagree. They can walk and chew gum, do both things at once.
| And this practical stuff is very important.
|
| I've been using the Atlassian MCP for nearly a month now, and
| it's completely changed (and eliminated) the feeling of having
| an overwhelming backlog.
|
| I can have it do things like "find all the tickets related to
| profile editing and combine them into one epic" where it works
| perfectly. Or "help me prioritize the 15 tickets assigned to me
| this sprint" and it'll actually go through and suggest "maybe
| you can do these two tickets first since they seem smaller,
| then do this big one" - i haven't hooked it up to my calendar
| yet.
|
| But I'd love for it to suggest things like "do this one ticket
| that requires a lot of heads down time on wednesday since you
| don't have any meetings. I can create a block on your calendar
| so that nobody will schedule a meeting then"
|
| Those are all superhuman things that can be done with MCP and a
| smart model.
|
| I've defined rules in cursor that say "when I ask you to mark
| something ready for test, change the status and assign it to <x
| person>, and leave a comment summarizing the changes"
|
| If you look at my JIRA comments now, you'd wonder how I had so
| much time to write such thorough comments. I don't, Cursor and
| whatever model is doing it for me.
|
| It's been an absolute game changer. MCP is going to be what the
| App store was to mobile. Yes you can get by without it, but
| actually hooking into all your daily tool is when this stuff
| gets insanely valuable in a practical sense.
| OJFord wrote:
| > If you look at my JIRA comments now, you'd wonder how I had
| so much time to write such thorough comments. I don't, Cursor
| and whatever model is doing it for me.
|
| How do your colleagues feel about it?
| warkdarrior wrote:
| My colleagues' LLM assistants think that my LLM assistant
| leaves great JIRA comments.
| atonse wrote:
| haha! Funny enough I do have to tell the LLMs to leave
| concise comments.
|
| I also don't want to read too many unnecessary words.
| sdesol wrote:
| Joking aside, I do believe we are moving into a era where
| we have LLMs write for each other and humans have a
| dedicated TL;DR. This includes code with a lot of
| comments or design styles that might seem obvious or
| stupid but can help another LLM.
| eknkc wrote:
| Why use JIRA at this point then?
|
| Can't we point an LLM to a sqlite db and tell it to treat
| it as an issue tracking db and have everyone do the same.
|
| The service (jira) would materialize inside the LLMs
| then.
|
| Why even use abstractions like tickets etc. Ask LLM what
| to do.
| zoogeny wrote:
| JIRA is more than just ticket management for most big
| orgs. It provides a reporting interface for business with
| long-term planning capabilities. A lot of the annoying
| things that devs have to do in JIRA is often there to
| make those functions more valuable. In other cases it is
| a compliance thing as well. Some certifications necessary
| for enterprise sales require audit trails for all code
| changes, from the bug report to the code commit. JIRA
| provides the integration and reporting necessary for
| that.
|
| Unless you can provide the same visibility, long-term
| planning features and compliance aspects of JIRA on top
| of you sqlite db, you won't compete with JIRA. But if you
| do add those things on top of SQLite and LLMs, you
| probably have a solid business idea. But you'd first need
| to understand JIRA well enough to know why they are there
| in the first place.
| falcor84 wrote:
| Exactly, applying the principle of Chesterton's Fence
| [0].
|
| [0] https://en.wikipedia.org/w/index.php?title=Wikipedia:
| FENCE
| atonse wrote:
| Well I had half a mind to not tell them to see what they'd
| say, but I also was excited to show everyone so they can
| also be empowered with it.
|
| One of them said "yeah I was wondering cuz you never write
| that much" - as a leader, I actually don't set a good
| example of how to leave quality JIRA comments. And my view
| with all these things is that I have to lead by example,
| not by orders.
|
| With the help of these kinds of tools, we can improve the
| quality of these comments. And I wouldn't expect others to
| write them manually, more that I wanted to show that
| everyone's use of JIRA on the team can improve.
| OJFord wrote:
| Notice they commented on the quantity, not the quality?
|
| I don't think it's good leadership to unleash drivel on
| an organisation, have people waste time reading and
| perhaps replying to it, thinking it's something important
| and thoughtful coming from atonse.
|
| Good thing you told them though, now they can ignore it.
| stefan_ wrote:
| It sure seems like the next evolution of Jira though.
| Designed to waste everyones time, picked by "leaders"
| that don't use it. Why not spam tickets with LLM drivel?
| They are perfect to pick up on all the inconsistency in
| the PM insanity driven custom designed workflow - and
| comment on it tagging a bunch of stray people seen in the
| ticket history, the universal exit hatch.
| sensanaty wrote:
| Someone please shoot me if my PM ever gets this idea in
| his head of using LLM slop to spam tickets with en masse.
|
| There's nothing I hate more than people sending me their
| AI messages, be it in a ticket or a PR or even on Slack.
| I'm forced to engage and spend effort on something it
| took them all of 3 seconds to generate without even
| proofreading what they're sending me says. The amount of
| times I've had to ask 11 clarifying questions because
| their message has 11 contradictions within itself is
| maddening to the highest degree.
|
| The worst is when I call out one of these numerous
| contradictions, and the reply is "oh haha, stupid Claude
| :)", makes my blood boil and at the same time amazes me
| that someone has so little pride and respect for their
| fellow humans to do crap like that.
| zoogeny wrote:
| Honestly, that backlog management idea is probably the first
| time an MCP actually sounded appealing to me.
|
| I'm not in that world at the moment, but I've been the lead
| on several projects where the backlog has became a dumping
| ground of years of neglect. You end up with this tiered
| backlog thing where one level of backlog gets too big so you
| create a second tier of backlog for the stuff you are
| actually going to work on. Pretty soon you end up with
| duplicates in the second tier backlog for items already in
| the base level backlog since no one even looks at that old
| backlog anymore.
|
| I've done a lot of tidy up myself when I inherit this kind of
| mess, just closing tickets we definitely will never get to,
| de-duping, adding context when available, grouping into
| epics, tagging with relevant "tech-debt", "security", "bug",
| "automation", etc. But when there are 100s of tickets it is a
| slog. Having an LLM do this makes so much sense.
| organsnyder wrote:
| I have Claude hooked up to our project management system,
| GitHub, and my calendar (among other things). It's already
| proving extremely useful for various project management
| tasks.
| edaemon wrote:
| Lots of reported security issues with MCP servers seemed to be
| mitigated by their local-only setup. These MCP implementations
| are remotely accessible, do they address security differently?
| paulgb wrote:
| Largely, yes -- one of the big issues with using other people's
| random MCP servers is that they are run by default as a system
| process, even if they only need to speak over an API. Remote
| MCP mitigates this by not running any untrusted code locally.
|
| What it _doesn't_ seem to yet mitigate is prompt injection
| attacks, where a tool call description of one tool convinces
| the model to do something it shouldn't (like send sensitive
| data to a server owned by the attacker.) I think these concerns
| are a little bit overblown though; things like pypi and the
| Chrome Extension store scare me more and it doesn't stop them
| from mostly working.
| zoogeny wrote:
| They offhand mention OAuth integration in their discussion of
| Cloudflare integrated solutions. I can't see how that would be
| any less secure than any other OAuth protected API offering.
| Nijikokun wrote:
| context windows are too small and conversely larger windows are
| not accurate enough its annoying
| indigodaddy wrote:
| So any chat to Claude will now just auto-activate web search to
| be included? What if I try to use it just as a search engine
| exclusively? Also will proxies like Openrouter have access to the
| web search capabilities?
| gianpaj wrote:
| > Web search is now globally available to all Claude.ai paid
| plans.
| surfingdino wrote:
| I don't know why web search is such a big deal. You can
| implement it with any LLM that offers an API and function
| calling.
| ChicagoDave wrote:
| There is targeted value in integrations, but everything still
| leads back to larger context windows.
|
| I love MCP (it's way better than plain Claude) but even that runs
| into context walls.
| davee5 wrote:
| I'm quite struck by the title of this announcement. The box being
| drawn around "your world" shows how narrow the AI builder's
| window into reality tends to be.
|
| > a new way to connect your apps and tools to Claude. We're also
| expanding... with an advanced mode that searches the web.
|
| The notion of software eating the world, and AI accelerating that
| trend, always seems to forget that The World is a vast thing, a
| physical thing, a thing that by its very nature can never be
| fully consumed by the relentless expansion of our digital
| experiences. Your worldview /= the world.
|
| The cynic would suggest that the teams that build these tools
| should go touch grass, but I think that misses the mark. The real
| indictment is of the sort of thinking that improvements to
| digital tools [intelligences?] in and of themselves can
| constitute truly substantial and far reaching changes.
|
| The reach of any digital substrate inherently limited, and this
| post unintentionally lays that bare. And while I hear
| accelerationists invoking "robots" as the means for digital
| agents to expand their potent impact deeper into the real world I
| suggest this is the retort of those who spend all day in apps,
| tools, and the web. The impacts and potential of AI is indeed
| enormous, but some perspective remains warranted and occasional
| injections of humility and context would probably do these teams
| some good.
| dang wrote:
| (Just for context: we've since changed the title above.
| Corporate press release titles are rarely a good fit for HN and
| we usually change them.
|
| https://hn.algolia.com/?dateRange=all&page=0&prefix=true&sor...
| )
| atonse wrote:
| I think with MCPs and related tech, if Apple just internally went
| back to the drawing board and integrated the concept of MCPs
| directly into iOS (via the "Apple Intelligence" umbrella) and
| seamlessly integrated it into the App Store and apps, they will
| win the mobile race for this.
|
| Being Apple, they would have to come up with something novel like
| they did with push (where you have _one_ OS process running that
| delegates to apps rather than every app trying to handle push
| themselves) rather than having 20 MCP servers running. But I
| think if they did this properly, it would be so amazing.
|
| I hope Apple is really re-thinking their absolutely comical start
| with AI. I hope they regroup and hit it out of the park (like how
| Google initially stumbled with Bard, but are now hitting it out
| of the park with Gemini)
| mattlondon wrote:
| Do you really think Apple can catch up with and then surpass
| all these SOTA AI labs?
|
| They bet big and got distracted on VR. It was obviously the
| wrong choice at the time, and even more so now. They're going
| to have to abandon all that VR crap and pivot hard to AI to try
| and catch up. I think the more likely case is they _can 't_
| catch up now and will just have to end up licensing Gemini from
| Google/Google paying them to use Gemini as the default AI.
| _pdp_ wrote:
| Apple already has the equivalent of MCP.
| https://developer.apple.com/documentation/appintents.
| bloomca wrote:
| That's just App Intents. I don't think they lack data at this
| point, they just struggle how to use that data on the OS level
| cruffle_duffle wrote:
| The video demos never really showed the auth "story" but I assume
| that there is some oauth step to connect Claude with your MCP
| service, right?
| belter wrote:
| All these integrations are likely to cause a massive security
| leak sooner or later.
| OJFord wrote:
| Where's the permissioning, the data protection?
|
| People will say 'aaah ad company' (me too sometimes) but I'd
| honestly trust a Google AI tool with this way more. Not just
| because it already has access to my Google Workspace obviously,
| but just because it's a huge established tech firm with decades
| of experience in trying not to lose (or have taken) user data.
|
| Even if they get the permissions right and it can only read my
| stuff if I'm just asking it to 'research', now Anthropic has all
| that and a target on their backs. And I don't even know what 'all
| that' is, whatever it explored deeming it maybe useful.
|
| Maybe I'm just transitioning into old guy not savvy with latest
| tech, but I just can't trust any of this 'go off and do whatever
| seems correct or helpful with access to my filesystem/Google
| account/codebase/terminal' stuff.
|
| I like chat-only (well, +web) interactions where I control the
| input and taking the output, but even that is not an experience
| that gives me any confidence in giving uncontrolled access to
| stuff and it always doing something correct and reasonable. It's
| often confidently incorrect too! I wouldn't give an intern free
| reign in my shell either!
| joshwarwick15 wrote:
| Permissoning: OAuth Data protection: Local LLMs
| weinzierl wrote:
| If you do not enable "Web Search" are you guaranteed it does not
| access the web anyway?
|
| Sometimes I want a pure model answer and I used to use Claude for
| that. For research tasks I preferred ChatGPT, but I found that
| you cannot reliably deny it web access. If you are asking it a
| research question, I am pretty sure it uses web search, even when
| _" Search"_ and _" Deep Research"_ are off.
| rafram wrote:
| Oh no, remote MCP servers. Security was nice while it lasted!
| rvz wrote:
| This is a fantastic time to get into the security space and
| trick all these LLMs into leaking sensitive data and make a lot
| of money out of that.
|
| MCP is a flawed spec and quite frankly a scam.
| rvz wrote:
| Can't wait for the first security incident relating to the
| fundamentally flawed MCP specification which an LLM will
| inadvertently be tricked to leak sensitive data.
|
| Increasing the amount of "connections" to the LLM increases the
| risk in a leak and it gives your more rope to hang yourself with
| when at least one connection becomes problematic.
|
| Now is a _great_ time to be a LLM security consultant.
| dimgl wrote:
| This is great, but can you fix Claude 3.7 and make it more like
| 3.5? I'm seriously disappointed with 3.7. It seems to be
| performing significantly worse for me on all tasks.
|
| Even my wife, who normally used Claude to create interesting
| recipes to bake cookies, has noticed a huge downgrade in 3.7.
| bjornsing wrote:
| The strategic business dynamic here is very interesting. We used
| to have "GPT-wrapper SaaS". I guess what we're about to see now
| is the opposite: "SaaS/MCP-wrapper GPTs".
| hdjjhhvvhga wrote:
| The people who connect a LLM to their Paypal and CLoudflare
| accounts perfectly deserve the consequences, both positive and
| negative.
| conroy wrote:
| Remote MCP servers are still in a strange space. Anthropic
| updated the MCP spec about a month ago with a new Streamable HTTP
| transport, but it doesn't appear that Claude supports that
| transport yet.
|
| When I hooked up our remote MCP server, Claude sends a GET
| request to the endpoint. According to the spec, clients that want
| to support both transports should first attempt to POST an
| InitializeRequest to the server URL. If that returns a 4xx, it
| should then assume the SSE integration.
| gonzan wrote:
| So there are going to be companies built on just an MCP server I
| guess, wonder what the first big one will be, just a matter of
| time I think
| worldsayshi wrote:
| Is it just me that would like to see more of confirmations before
| making opaque changes to remote systems?
|
| I might not dare to add an integration if it can potentially add
| a bunch of stuff to the backing systems without my approval.
| Confirmations and review should be part of the protocol.
| sepositus wrote:
| Yeah, this was my first thought. I was watching the video of it
| creating all of these Jira tickets just thinking in my head: "I
| hope it just did all that correctly." I think the level of
| patience with my team would be very low if I started running an
| LLM that accidentally deleted a bunch of really important
| tickets.
| worldsayshi wrote:
| Yeah. Feels like it's breaking some fundamental UX principle.
| If an action is going to make any significant change make
| sure that it fulfills _at least_ one of these:
|
| 1. Can be rollbacked/undone
|
| 2. Clearly states exactly what it's going to do in a
| reviewable way
|
| If those aren't fulfilled you're going to end up with users
| that are afraid of using your app.
| todsacerdoti wrote:
| Check out 2500+ MCP servers at https://mcp.pipedream.com
| the_clarence wrote:
| Been playing with MCP in the last few days and it's basically a
| more streamlined way to define tools/function calls.
|
| That + the agent SDK of openAI makes creating agentic flow so
| easy.
|
| On the other hand you're kinda forced to run these tools / MCP
| servers in their own process which makes no sense to me.
| nilslice wrote:
| you might like mcp.run, a tool management platform we're
| working on... totally agree running a process per tool, with
| all kinds of permissions is nonsensical - and the move to
| "remote MCP" is a good one!
|
| but, we're taking it a step (or two) further, enabling you to
| dynamically build up a MCP server from other servers managed in
| your account with us.
|
| try it out, or let me get you a demo! this goes for any casual
| comment readers too ;)
|
| https://cal.com/team/dylibso/mcp.run-demo
| kostas_f wrote:
| Anthropic's strategy seems to go towards "AI as universal glue".
| They want to tie Claude into all the tools teams already live in
| (Jira, Confluence, Zapier, etc.). That's a smart move for
| enterprise adoption, but it also feels like they're compensating
| for a plateau in core model capabilities.
|
| Both OpenAI and Google continue to push the frontier on
| reasoning, multimodality, and efficiency whereas Claude's recent
| releases have felt more iterative. I'd love to see Anthropic push
| into model research again.
| bl4ckneon wrote:
| I am sure they are already doing that. To think that an AI
| researcher is doing essentially api integration work is a bit
| silly. Multiple efforts can happen at the same time
| freewizard wrote:
| I would expect Slack do this. Maybe Slack and Claude should
| merge one day, given MS and Google has their own core models.
| tjsk wrote:
| Slack is owned by Salesforce which is doing its own
| Agentforce stuff
| deanc wrote:
| I find it absolutely astonishing that Atlassian hasn't yet
| provided an LLM for confluence instances and instead a third
| party is required. The sheer scale of documentation and
| information I've seen at some organisations I've worked with is
| overwhelming. This would be a killer feature. I do not recommend
| confluence to my clients simply because the search is so
| appalling .
|
| Keyword search is such a naive approach to information discovery
| and information sharing - and renders confluence in big orgs
| useless. Being able to discuss and ask questions is a more
| natural way of unpacking problems.
| artur_makly wrote:
| on their announcement page they wrote " In addition to these
| updates, we're making WEB SEARCH available globally for all
| Claude users on paid plans."
|
| So I tested a basic prompt:
|
| 1. go to : SOME URL
|
| 2. copy all the content found VERBATIM, and show me all that
| content as markdown here.
|
| Result : it FAILED miserably with a few basic html pages - it
| simply is not loading all the page content in its internal
| browser.
|
| What worked well: - Gemini 2.5Pro (Experimental) - GPT 4o-mini //
| - Gemini 2.0 Flash ( not verbatim but summarized )
| meander_water wrote:
| Looks like this is possible due to the relatively recent addition
| of OAuth2.1 to the MCP spec [0] to allow secure comms to remote
| servers.
|
| However, there's a major concern that server hosters are on the
| hook to implement authorization. Ongoing discussion here [1].
|
| [0] https://modelcontextprotocol.io/specification/2025-03-26
|
| [1]
| https://github.com/modelcontextprotocol/modelcontextprotocol...
| dmarble wrote:
| Direct link to the spec page on authorization:
| https://modelcontextprotocol.io/specification/2025-03-26/bas...
|
| Source:
| https://github.com/modelcontextprotocol/modelcontextprotocol...
___________________________________________________________________
(page generated 2025-05-01 23:00 UTC)