[HN Gopher] Show HN: Browser MCP - Automate your browser using C...
___________________________________________________________________
Show HN: Browser MCP - Automate your browser using Cursor, Claude,
VS Code
Author : namukang
Score : 582 points
Date : 2025-04-07 16:25 UTC (1 day ago)
(HTM) web link (browsermcp.io)
(TXT) w3m dump (browsermcp.io)
| amendegree wrote:
| So is MCP the new RPA (Robotic Process Automation)? Like generic
| Yahoo Pipes?
| ajcp wrote:
| No, since MCP is just an interface layer it is to AI what REST
| API is to DPA and COM/App DLLs are to RPA.
|
| APA (Agentic Process Automation) is the new RPA, and this is
| definitely one example of it.
| XCSme wrote:
| But AI already supported function calling, and you could
| describe them in various ways. Isn't this just a different
| way to define function calling?
| spmurrayzzz wrote:
| I just view it as a relatively minor convenience, but it's not
| some game-changer IMO.
|
| The tool use / function calling thing far predates Anthropic
| releasing the MCP specification and it really wasn't that
| onerous to do before either. You could provide a json schema
| spec and tell the model to generate compliant json to pass to
| the API in question. MCP doesn't inherently solve any of the
| problems that come up in that sort of workflow, but it does
| provide an idiomatic approach for it (so there's a non-zero
| value there, but not much).
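As a rough illustration of the pre-MCP workflow described above (hand the model a JSON schema, then check the JSON it produces before calling the API), here is a minimal Python sketch. The tool name `get_weather`, its fields, and the tiny hand-rolled validator are all assumptions made up for this example; a real project would more likely use a full JSON Schema validator.

```python
import json

# Hypothetical tool description in JSON-Schema style (not from any real API).
GET_WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "units": {"type": "string", "enum": ["metric", "imperial"]},
        },
        "required": ["city"],
    },
}

_PY_TYPES = {"string": str, "number": (int, float), "boolean": bool, "object": dict}


def validate_arguments(schema: dict, args: dict) -> list[str]:
    """Return a list of problems; an empty list means the call looks valid.

    Only checks required keys, unknown keys, basic types and enums.
    """
    problems = []
    props = schema.get("properties", {})
    for key in schema.get("required", []):
        if key not in args:
            problems.append(f"missing required field: {key}")
    for key, value in args.items():
        if key not in props:
            problems.append(f"unknown field: {key}")
            continue
        expected = _PY_TYPES.get(props[key].get("type"))
        if expected is not None and not isinstance(value, expected):
            problems.append(f"wrong type for field: {key}")
        elif "enum" in props[key] and value not in props[key]["enum"]:
            problems.append(f"value not allowed for field: {key}")
    return problems


def parse_tool_call(model_output: str) -> tuple[dict, list[str]]:
    """Parse the JSON text a model produced and validate it against the tool."""
    call = json.loads(model_output)
    return call, validate_arguments(GET_WEATHER_TOOL["parameters"], call)
```

The point of the sketch is that nothing here needs a protocol: the schema, the validation and the dispatch all live in application code, which is the part MCP standardizes rather than invents.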
| PantaloonFlames wrote:
| It seems the benefit of MCP is for Anthropic to enlist the
| community in building integrations for Claude desktop, no?
|
| And if other vendors sign on to support MCP, then it becomes
| a self reinforcing cycle of adoption.
| JackYoustra wrote:
| MCP is useful because anthropic has a disproportionate
| share of API traffic relative to its valuation and a tiny
| share of first-party client traffic. The best way around
| this is to shift as much traffic to API as possible.
| PantaloonFlames wrote:
| First-party client, meaning browser? User agent or ...
| Electron app, or any mobile app?
| JackYoustra wrote:
| first party client as in a claude subscription will give
| you access (mostly app + web)
| spmurrayzzz wrote:
| Yea it certainly does benefit Claude Desktop to some
| degree, but most MCP servers are a few hundred SLOC and the
| protocol schema itself is only ~400 SLOC. If that was the
| only major obstacle standing in the way of adoption, I'd be
| very surprised.
|
| Coupled with the fact that any LLM trained for tool use can
| utilize the protocol, it doesn't feel like much of a moat
| that uniquely positions Claude Desktop in a meaningful way.
| asabla wrote:
| > And if other vendors sign on to support MCP, then it
| becomes a self reinforcing cycle of adoption
|
| This is exactly what's happening now. A good portion of
| applications, frameworks and actors are starting to support
| it.
|
| I've been reluctant to adopt MCP in applications until
| there was enough adoption.
|
| However, depending on your use case, it may also be more
| complexity than you need.
| kmangutov wrote:
| The interesting thing about MCP as a tool use protocol is the
| traction that it has garnered in terms of clients and servers
| supporting it.
| wonderwhyer wrote:
| I would probably call it shipping containers for LLM tool
| integrations.
|
| Containers are not a big deal when viewed in isolation. But
| when it's a common size/standard for all kinds of ships, cranes
| and trucks, then it is a big deal.
|
| In that sense it's more about gathering a community around one way
| to do things.
|
| In theory there are REST APIs and the OpenAPI standard, but those
| were made for code, not for LLMs. So you usually need some kind
| of friendly wrapper (like for candy) on top of a REST API.
|
| It really starts to feel like a big deal when you work on
| integrating LLMs with tools.
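To make the "friendly wrapper" idea concrete, here is a minimal Python sketch of describing a REST endpoint as an MCP-style tool and building the JSON-RPC request a client would send to invoke it. The `name`/`description`/`inputSchema` fields and the `tools/call` method follow the MCP specification as I understand it; the `search_issues` tool and its arguments are invented for illustration.

```python
import itertools
import json

# Hypothetical tool wrapping a REST search endpoint. The description is written
# for a model to read, which is the part plain OpenAPI specs tend to lack.
SEARCH_ISSUES_TOOL = {
    "name": "search_issues",
    "description": "Search the issue tracker. Use short keyword queries.",
    "inputSchema": {
        "type": "object",
        "properties": {"query": {"type": "string"}, "limit": {"type": "integer"}},
        "required": ["query"],
    },
}

_request_ids = itertools.count(1)


def build_tool_call(tool: dict, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 request asking a server to run `tool`."""
    missing = [k for k in tool["inputSchema"].get("required", []) if k not in arguments]
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool["name"], "arguments": arguments},
    }
    return json.dumps(request)
```

For example, `build_tool_call(SEARCH_ISSUES_TOOL, {"query": "login bug"})` yields a single JSON string that any server speaking the same convention could route, which is the "standard container" part of the analogy.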
| tmvphil wrote:
| I'm a bit stuck on this, maybe you can explain why an LLM
| would have any difficulty writing REST API calls? Seems like
| it should be no problem.
| buttofthejoke wrote:
| Why use this over Puppeteer or Playwright extensions?
| namukang wrote:
| The Puppeteer MCP server doesn't work well because it requires
| CSS selectors to interact with elements. It makes up CSS
| selectors rather than reading the page and generating working
| selectors.
|
| The Playwright MCP server is great! Currently Browser MCP is
| largely an adaptation of the Playwright MCP server to use with
| your actual browser rather than creating a new one each time.
| This allows you to reuse your existing Chrome profile so that
| you don't need to log in to each service all over again and
| avoids bot detection which often triggers when using the fresh
| browser instances created by Playwright.
|
| I also plan to add other useful tools (e.g. Browser MCP
| currently supports a tool to get the console logs which is
| useful for automated debugging) which will likely diverge from
| the Playwright MCP server features.
| buttofthejoke wrote:
| Ooo, i like that. one of the most annoying points has been
| 'not sharing' the browser context. i'll def check it out
| cAtte_ wrote:
| by the way, you can indeed access your personal context with
| Playwright. just `launchPersistentContext()` and set the
| userDataDir to that of your existing Chrome install:
|
| https://playwright.dev/docs/api/class-browsertype#browser-
| ty...
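A minimal sketch of that approach using Playwright's Python bindings, where the same call is spelled `launch_persistent_context`. It assumes the `playwright` package and Google Chrome are installed; the default profile locations below are the usual ones but may differ on your machine, and Chrome generally has to be closed first because a profile directory is normally locked by the browser instance using it.

```python
from pathlib import PurePosixPath, PureWindowsPath


def default_chrome_profile_dir(platform: str, home: str) -> str:
    """Return the usual Chrome user data directory for a platform.

    `platform` uses sys.platform-style names; `home` is the user's home directory.
    These are conventional defaults, not guaranteed locations.
    """
    if platform.startswith("win"):
        return str(PureWindowsPath(home, "AppData", "Local", "Google", "Chrome", "User Data"))
    if platform == "darwin":
        return str(PurePosixPath(home, "Library", "Application Support", "Google", "Chrome"))
    return str(PurePosixPath(home, ".config", "google-chrome"))


def open_with_existing_profile(user_data_dir: str, url: str) -> str:
    """Open `url` in Chrome using an existing profile and return the page title."""
    # Imported here so the helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        context = p.chromium.launch_persistent_context(
            user_data_dir,
            channel="chrome",  # use the installed Chrome rather than bundled Chromium
            headless=False,
        )
        page = context.new_page()
        page.goto(url)
        title = page.title()
        context.close()
        return title
```

Unlike the extension approach, this still launches a new browser process on top of the profile rather than attaching to the window you already have open.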
| rahimnathwani wrote:
| This is cool. I'm curious why you chose to use an extension,
| rather than getting the user to run Chrome with remote debugging
| turned on?
| hannofcart wrote:
| Not OP but I suspect it is because of this (mentioned on their
| page):
|
| 'Avoids bot detection and CAPTCHAs by using your real browser
| fingerprint.'
| tylergetsay wrote:
| I don't think remote debugging by itself on a normal chrome
| profile is detectable
| parhamn wrote:
| I'm sure it's about the cookies/sessions but I do recall you
| can load cookies from another browser?
| omneity wrote:
| Exposing Chrome CDP is a terrible idea from a security and
| privacy perspective. You get the keys to the whole kingdom
| (and expose them on a standard port with a well documented
| API). All security features of the web can be bypassed, and
| then some, as CDP exposes even more capabilities than
| chrome extensions and without any form of supervision.
| redblacktree wrote:
| You're talking about exposing Chrome CDP to the wider
| internet, right? Or are you highlighting these dangers in
| the local context?
| omneity wrote:
| In the local context as well. Unlike say the docker
| socket which is protected by default using unix
| permissions, the CDP protocol has no authorization,
| authentication or permission mechanism.
|
| Anything on your machine (such as a rogue browser
| extension or a malicious npm/pypi package) could scan for
| this and just get all your cookies - and that's only the
| beginning of your problems.
|
| CDP can access any origin, any data stored (localStorage,
| indexedDB ...), any javascript heap, cross iframe and
| origin boundaries, run almost undetectable code that uses
| your sessions without you knowing, and the list is very
| long. CDP was never meant to expose a real browser in an
| untrusted context.
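To illustrate why an open debugging port is risky even on localhost, here is a minimal Python sketch of how any local process could probe for one. Port 9222 is the conventional `--remote-debugging-port` value and `/json/version` is the DevTools HTTP endpoint that reports browser details; note that no credentials are involved at any step. The function names are my own.

```python
import http.client
import json
import socket
import urllib.request


def is_port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe_devtools(host: str = "127.0.0.1", port: int = 9222):
    """Return the browser's /json/version info if a debugging port is exposed, else None."""
    if not is_port_open(host, port):
        return None
    try:
        with urllib.request.urlopen(f"http://{host}:{port}/json/version", timeout=1) as resp:
            return json.load(resp)
    except (OSError, ValueError, http.client.HTTPException):
        # Port is open but is not speaking the DevTools HTTP protocol.
        return None
```

If `probe_devtools()` returns a dictionary on your machine, any other local program could have made the same request.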
| namukang wrote:
| An extension is more user-friendly! I leave Chrome open
| basically 24/7 and having to create a new Chrome instance via
| the command line just to use Browser MCP just felt like too
| high of a barrier.
| neilellis wrote:
| Well done, just tested on Claude Desktop and it worked smoothly
| and a lot less clunky than playwright. This is the right
| direction to go in.
|
| I don't know if you've done it already, but it would be great to
| pause automation when you detect a captcha on the page and then
| notify the user that the automation needs attention. Playwright
| keeps trying to plough through captchas.
| johnpaulkiser wrote:
| > Private > Since automation happens locally, your browser
| activity stays on your device and isn't sent to remote servers.
|
| I think this is bullshit. Isn't the dom or whatever sent to the
| model api?
| namukang wrote:
| Of course, you're sending data to the AI model, but the
| "private" aspect is contrasting automating using a local
| browser vs. automating using a remote browser.
|
| When you automate using a remote browser, another service (not
| the AI model) gets all of the browsing activity and any
| information you send (e.g. usernames and passwords) that's
| required for the automation.
|
| With Browser MCP, since you're automating locally, your
| sensitive data and browser activity (apart from the results of
| MCP tool calls that's sent to the AI model) stay on your
| device.
| johnpaulkiser wrote:
| I think we need to be very careful & intentional about the
| language we use with these kinds of tools, especially now
| that the MCP floodgates have been opened. You aren't just
| exposing the user's browsing data to whichever model they are
| using, you are also exposing it to any tools they may be
| allowing as well.
|
| A lot of non technical people are using these tools to "vibe"
| their way to productivity. I would explicitly tell them that
| potentially "all" of their browsing data is going to be
| exposed to their LLM client and they need to use this at
| their own risk.
| Fernicia wrote:
| Any plans to make a Firefox version?
| namukang wrote:
| Browser MCP uses the Chrome DevTools Protocol (CDP) to automate
| the browser so it currently only works for Chromium-based
| browsers.
|
| Unfortunately, Firefox doesn't expose WebDriver BiDi (the
| standardized version of CDP) to browser extensions AFAIK
| (someone please correct me if I'm mistaken!), so I don't think
| I can support it even if I tried.
| krono wrote:
| Just found this[0] implementation roadmap on Mozilla's wiki,
| recently updated too! At least it's actively being worked on.
|
| Not going to lie, this makes me happy.
|
| [0]: https://wiki.mozilla.org/WebDriver/RemoteProtocol/WebDri
| ver_...
| 101008 wrote:
| Good, just what we needed. More bots browsing the internet.
| Somedays I think I am not 100% against of every website having a
| captcha...
| mgraczyk wrote:
| It's a developer tool
| 101008 wrote:
| Then it should be limited to localhost or something similar.
| mgraczyk wrote:
| It can be, just do that when you install it
| dalemhurley wrote:
| What if you are using domain names for your local
| environment or a cloud environment like IDX or you want to
| automate the testing of the UAT environment?
| handfuloflight wrote:
| Not out of the realm of possibility that this very comment was
| written by a bot prompted to write a negative response to a
| given piece of content.
| 101008 wrote:
| Not, human tired of creating content to put online and being
| consumed not by people but by bots or any other form of
| mechanical consumption that I don't like. As the owner of the
| content I think I have the right to set that preference,
| don't you think?
| brandensilva wrote:
| Yeah this is definitely a bad English bot
| DebtDeflation wrote:
| In the Task Automation demo, how does it know all of the
| attributes of the motorcycle he is trying to sell? Is it relying
| on the underlying LLM's embedded knowledge? But then how would it
| know the price and mileage? Is there some underlying document not
| referenced in the demo? Because that information is not in the
| prompt.
| behnamoh wrote:
| What I don't like about LLMs is that people keep re-inventing the
| wheel over and over. For example, we've been able to control
| browsers using GPT for about 2 years now:
|
| - https://github.com/mayt/BrowserGPT
|
| - https://github.com/TaxyAI/browser-extension
|
| - https://github.com/browser-use/browser-use
|
| - https://github.com/Skyvern-AI/skyvern
|
| - https://github.com/m1guelpf/browser-agent
|
| - https://github.com/richardyc/Chrome-GPT
|
| - https://github.com/handrew/browserpilot
|
| - https://github.com/ishan0102/vimGPT
|
| - https://github.com/Jiayi-Pan/GPT-V-on-Web
| ajcp wrote:
| I think this is noteworthy in that it is using what is
| increasingly becoming the dominant API protocol for LLM.
|
| Just because the wheel exists doesn't mean we shouldn't strive
| to make it better by applying new knowledge and technologies to
| it.
| darepublic wrote:
| none of these have stuck, right? And none of them work well
| enough that all web dev agencies no longer have to worry about
| e2e testing. (or do some of them? Maybe the market is simply
| that inefficient).
| dvngnt_ wrote:
| I don't see this being a solution for full e2e regression
| testing. Having to run inference for each command/test seems
| expensive. I do think there's room for self-healing tests
| after failure.
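One way to read the "self-healing" idea: run the cheap, deterministic selector first and only pay for inference when it fails. A minimal Python sketch follows; `find` and `ask_model` are hypothetical callables standing in for a browser driver and an LLM call, not real library APIs.

```python
from typing import Callable, Optional


class SelfHealingLocator:
    """Resolve a named element, asking a model for a new selector only on failure."""

    def __init__(self, selectors: dict[str, str], ask_model: Callable[[str], str]):
        self.selectors = dict(selectors)  # element name -> last known-good selector
        self.ask_model = ask_model        # hypothetical: element name -> suggested selector
        self.model_calls = 0

    def locate(self, name: str, find: Callable[[str], Optional[object]]):
        """Return the element for `name`, healing the stored selector if needed."""
        element = find(self.selectors[name])
        if element is not None:
            return element                # fast path: no inference needed
        self.model_calls += 1
        candidate = self.ask_model(name)
        element = find(candidate)
        if element is None:
            raise LookupError(f"could not heal selector for {name!r}")
        self.selectors[name] = candidate  # remember the fix for later runs
        return element
```

With this shape the inference cost scales with how often the page changes, not with how many test steps run.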
| dumansizsercan wrote:
| Competitors don't just challenge you, they push you to deliver
| your best work.
| dimgl wrote:
| This is a bit disingenuous, no? None of these have actually
| taken off.
| BrandiATMuhkuh wrote:
| This is really well done! Very cool.
|
| I wonder if it's possible to add such plugins to election apps
| (e.g.: Slack). It would be such a nice experience if I could just
| connect my AI of choice to a local app.
| decayiscreation wrote:
| Good idea! I'm sure this is possible since it looks like
| playwright can control electron apps.
| https://playwright.dev/docs/api/class-electronapplication
| chrisweekly wrote:
| election -> Electron
| icelancer wrote:
| I just run into a bunch of errors on my Windows machine + Chrome
| when connected over remote-ssh. Extension installed, tab enabled,
| npx updated/installed, etc.
|
| 2025-04-07 10:57:11.606 [info] rmcp: Starting new stdio process
| with command: npx @browsermcp/mcp@latest
|
| 2025-04-07 10:57:11.606 [error] rmcp: Client error for command
| spawn npx ENOENT
|
| 2025-04-07 10:57:11.606 [error] rmcp: Error in MCP: spawn npx
| ENOENT
|
| 2025-04-07 10:57:11.606 [info] rmcp: Client closed for command
|
| 2025-04-07 10:57:11.606 [error] rmcp: Error in MCP: Client closed
|
| 2025-04-07 10:57:11.606 [info] rmcp: Handling ListOfferings
| action
|
| 2025-04-07 10:57:11.606 [error] rmcp: No server info found
|
| ---
|
| EDIT: Ended up fixing it by patching index.js.
| killProcessOnPort() was the problem. Can hit me up if you have
| questions, I cannot figure out how to put readable code in HN
| after all these years with the fake markdown syntax they use.
| namukang wrote:
| Thanks for the report and the update! I'd love to hear about
| what you changed -- how can I get in touch? I didn't see
| anything in your HN profile. Feel free to email me at
| admin@browsermcp.io
| deathanatos wrote:
| > _I cannot figure out how to put readable code in HN after all
| these years with the fake markdown syntax they use._
|
| Not that HN supports much in the way of markup, but code blocks
| are actually the same as Markdown: indent (by 2 spaces or more,
| in HN's syntax; Markdown calls for 4 or more, so they're
| compatible).
|
|     print("Hello, world.")
| serverlessmania wrote:
| Did something similar but controls a hardware synth, allowing me
| to do sound design without touching the physical knobs:
| https://github.com/zerubeus/elektron-mcp
| dmix wrote:
| Oh good idea.
|
| Imagine it controlling plugins remotely, have an LLM do
| mastering and sound shaping with existing tools. The complex
| overly-graphical UIs of VSTs might be a barrier to performance
| there, but you could hook into those labeled midi mapping
| interfaces to control the knobs and levels.
| picardo wrote:
| I like this. It would be interesting to use it for when I need to
| use authenticated browser sessions.
| washedDeveloper wrote:
| Can you add a license to your code along with open sourcing the
| chrome extension?
| jngiam1 wrote:
| Pretty cool, do you know of a version of this that supports the
| new remote MCP protocol?
| omneity wrote:
| We work on something similar and aim to be the huggingface hub
| for automations you can run in your browser[0], with built-in
| support for MCP SSE.
|
| Use the pre-built Trails[1][2] as MCP servers or create and
| publish your own with a familiar Puppeteer-like API, powered by
| your own or your friends' browsers.
|
| 0: https://herd.garden
|
| 1: https://herd.garden/trails/@herd/browser
|
| 2: https://herd.garden/trails/@omneity/serp
| qwertox wrote:
| MCP seems to be JavaScript's trojan horse into AI.
| ketzo wrote:
| "Trojan horse"? 95% of people currently access AI via web or
| mobile app; those are pretty JS-dominated, no?
| tigrezno wrote:
| this is the way
| nonethewiser wrote:
| Stuff like this makes me giddy for manual tasks like
| reimbursement requests. It's such a chore (and it doesn't help
| that our process isn't great).
|
| Every month: go to service providers, log in, find and download
| the statement, create a Google Doc with details filled in,
| download it, write a new email and upload all the files. Maybe
| double-check that the attachments are right (but that requires
| downloading them again instead of being able to view them in
| the email).
|
| Automating this is already possible (and a real expense tracking
| app can eliminate about half of this work) but I think AI tools
| have the potential to eliminate a lot of the nittier-grittier
| specification of it. This is especially important because these
| sorts of workflows are often subject to little changes.
| bhouston wrote:
| So the website claims:
|
| "Avoids bot detection and CAPTCHAs by using your real browser
| fingerprint."
|
| Yeah, not really.
|
| I've used a similar system a few weeks back (one I wrote myself),
| having AI control my browser using my logged-in session, and I
| started to get CAPTCHAs during my human sessions in the browser
| and eventually I got blocked from a bunch of websites. Now that
| I've stopped using my browser session in that way, the blocks
| have eventually gone away. But be warned: you can lose access
| to websites yourself by doing this. It isn't a silver bullet.
| DeathArrow wrote:
| It might depend on the speed with which you click on the
| elements on the website.
| SSLy wrote:
| it does, CF bans my own honest to God clicks if I do them too
| fast.
| omgwtfbyobbq wrote:
| About five years ago, maybe more, Google started sending me
| captchas if I ran too many repetitive searches. I could be
| wrong, but it feels like most large platforms have fairly
| sophisticated anti-bot/scraping stuff in place.
| SubiculumCode wrote:
| Google does the same to me: Don't they know, I keep
| modifying my searches because their results sucked so bad
| I had to try 30 times to find the piece of information I
| needed?
| clown_strike wrote:
| Yandex does the same.
| what wrote:
| GitHub regularly blocks me for some reason. They tell me
| to slow down and I'm blocked for hours. I don't get it.
| rcakebread wrote:
| Make sure you are logged in. It was blocking me after
| just a couple searches if not logged in.
| Tepix wrote:
| Remember when github disabled searches for users who
| aren't logged in? Well, they just set the threshold for
| searches to 0 these days so they have de-facto disabled
| them again, this time avoiding the shitstorm.
| PantaloonFlames wrote:
| SSLy the speed clicker
| michaelbuckbee wrote:
| I use Vimium (Chrome extension for using keyboard control
| of the browser) and this happens to me as well since the
| behavior looks "unnatural".
| sitkack wrote:
| Must suck for people with assistive software. I get
| blocked on CF for no damn reason.
| verve_rat wrote:
| Yeah, I do wonder if there are any ADA implications with
| that?
| TeMPOraL wrote:
| I really really hope there are. Not just because of
| people who _need_ these provisions, but also for everyone
| else, as accessibility is the last line of defense for
| preserving end-user interoperability.
|
| Screen readers need to see a de-bullshittified, machine-
| readable version of the site + this is required by law
| sometimes, and generally considered a nice thing to
| enable -> the site becomes not just screen-reader
| friendly, but end user automation-friendly in general.
|
| (I don't know how long this will hold, though. LLMs are
| already capable of becoming a screen reader _without_ any
| special provisions - they can make sense of the UI the
| same way a sighted person can. I wouldn't trust them
| much now, but they'll only get better.)
| bombela wrote:
| Same here. And I am also using vimium.
| wordofx wrote:
| I wish people would stop using CF. It's just making the
| internet worse.
| fastball wrote:
| How so?
| tempest_ wrote:
| The caveat with these things is usually "when used with high
| quality proxies".
|
| Also I assume this extension is pretty obvious so it won't take
| long for CF bot detection to see it the same as Playwright or
| whatever else.
| SkyBelow wrote:
| What do you think they might be looking for that could be
| detected pretty quickly? I'm wondering if it is something like
| they can track mouse movement and calculate when a mouse is
| moving too cleanly, so adding some more human like noise to the
| mouse movement can better bypass the system. Others have
| mentioned doing too many actions too fast, but what about
| potential timing between actions. Even if every click isn't
| that fast, if they have a very consistent delay that would be
| another non-human sign.
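As a toy illustration of the timing signal speculated about above, the Python sketch below flags click sequences whose gaps are suspiciously uniform, using the coefficient of variation of the inter-click intervals. The 0.1 threshold and the minimum of five clicks are arbitrary assumptions, and real bot-detection systems presumably combine many more signals than this.

```python
import statistics


def interval_variation(timestamps: list[float]) -> float:
    """Coefficient of variation (stdev / mean) of gaps between sorted timestamps."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    mean = statistics.fmean(gaps)
    return statistics.pstdev(gaps) / mean if mean > 0 else 0.0


def looks_scripted(timestamps: list[float], threshold: float = 0.1) -> bool:
    """Return True when click timing is too regular to look human.

    Sequences shorter than five clicks are never flagged: too little evidence.
    """
    if len(timestamps) < 5:
        return False
    return interval_variation(timestamps) < threshold
```

Clicks exactly one second apart give a variation of zero and are flagged, while irregular gaps are not. A fast but irregular human would pass this particular check, which is why speed and consistency are separate signals.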
| tempoponet wrote:
| Modern captchas use a number of tools including many of the
| approaches you mentioned. This is why you might sometimes see a
| CloudFlare "I am not a robot" checkbox that checks itself and
| moves along before you have much time to even react. It's
| looking at a number of signals to determine that you're
| probably human before you've even checked the box.
| dalemhurley wrote:
| When I am using keyboard navigation, shortcuts and
| autofills, I seem to get mistaken for a bot a lot. These
| Captchas are really bad at detecting bots and really good
| at falsely labelling humans as bots.
| willsmith72 wrote:
| Well you have to have false positives or negatives. Maybe
| they prefer positives
| Quarrel wrote:
| With AI feeding / scraping traffic to sites growing
| ridiculously fast, I think captchas & their equivalent
| are only going to be on the rise, and given the rise in
| so many people selling residential proxies I see, I don't
| doubt that measures and counter-measures on both sides
| are getting more and more sophisticated.
|
| > These Captchas are really bad at detecting bots and
| really good at falsely labelling humans as bots.
|
| As a human it feels that way to you. I suspect their
| false-positive rate is very low.
|
| Of course, you may well be right that you get pinged more
| because of your style of browsing, which sux.
| magicalhippo wrote:
| They're detecting patterns predominantly bots use. The
| fact that some humans also use them doesn't change that.
|
| Back when I was playing Call of Duty 4, I got routinely
| accused of cheating because some people didn't think it
| was possible to click the mouse button as fast as I did.
|
| To them it looked like I had some auto-trigger bot or
| Xbox controller.
|
| I did in fact just have a good mouse and a quick finger.
| animuchan wrote:
| What's different is the badness of the outcome: if
| children mislabel you as a cheater in CoD, you may get
| kicked from the server.
|
| If CloudFlare mislabels you as a bot, however, you may be
| unable to access medical services, or your bank account,
| or unable to check in for a flight, stuff like that.
| Actual important things.
|
| So yes, I think it's not unreasonable to expect more from
| CF. The fact that some humans are routinely
| mischaracterized as bots should be a blocker level issue.
| magicalhippo wrote:
| Does it suck? Yes, absolutely. Should CF continuously
| work to reduce false positives? Yes, absolutely.
|
| I've never failed the CF bot test so don't know how that
| feels. Though I have managed to get to level 8 or 9 on
| Google's ReCaptcha in recent times, and actually given up
| a couple of times.
|
| Though my point was just it's gonna boil down to a duck
| test, so if you walk like a duck and quack like a duck,
| CF might just think you're a duck.
| diatone wrote:
| Given the volume of bots they tend to be remarkably good
| at detecting bots
|
| source: I work in a team that uses this kind of bot
| detection and yes, it works. And yes we do our best to
| keep false positives down
| kmacdough wrote:
| > I'm wondering if it is something like they can track mouse
| movement
|
| Yes, this is a big signal they use.
|
| > adding some more human like noise to the mouse
|
| Yes, this is a standard avoidance strategy. Easier said than
| done. For every new noise generation method, they work on
| detection. They also detect more global usage patterns and
| other signals, so you'd need to imitate the entire workflow
| of being human. At least within the noise of their current
| models.
| econ wrote:
| Have a lot of small things count towards the result. Users
| behave quite linearly, extra points if they act differently
| all of a sudden.
| mrweasel wrote:
| There's also the whole issue of captchas being in place because
| people cannot be trusted to behave appropriately with
| automation tools.
|
| "Avoids bot detection and CAPTCHAs" - Sure asshole, but
| understand that's only in place because of people like you. If
| you truly need access to something, ask for an API; maybe you
| need to pay for it, maybe you don't. Maybe you get it, maybe the
| site owner tells you to go pound sand and you should take that
| as your behaviour and/or use case is not wanted.
| TeMPOraL wrote:
| Actually, the CAPTCHAs are in place mostly because of
| assholes like _you_ abusing _other assholes like you_ [0].
|
| Most of the automated misbehavior is businesses doing it to
| other businesses - in many cases, it's direct competition, or
| a third party the competition outsources it to. Hell, your
| business is probably doing it to them too (ask the marketing
| agency you're outsourcing to).
|
| > _If you truly need access to something, ask for an API, may
| you need to pay for it, maybe you don't._
|
| Like you'd give it to me when you know I want it to skip your
| ads, or plug it to some automation or a streamlined UI, so I
| don't have to waste minutes of my life navigating your
| bloated, dog-slow SPA? But no, can't have users be invisible
| in analytics and operate outside your carefully designed
| sales funnel.
|
| > _May you get it, maybe the site owner tells you to go pound
| sand and you should take that as your behaviour and/or use
| case is not wanted._
|
| Like they have a final say in this.
|
| This is an evergreen discussion, and well-trodden ground.
| There is a reason the browser is also called "user agent";
| there is a well-established separation between user's and
| server's zone of controls, so as a site owner, stop poking
| your nose where it doesn't belong.
|
| --
|
| [0] - Not "you" 'mrweasel personally, but "you" the imaginary
| speaker of your second paragraph.
| mrweasel wrote:
| It seems that we have very different types of businesses in
| mind. I really didn't consider tracking users and
| displaying ads, but I also don't think this is where these
| types of tools would be used. Well, they might, but that's
| as part of some content farm, undesirable bots and
| downright scams, so nothing of value is really lost if this
| didn't exist.
|
| If you have a sales funnel, as in you take orders and ship
| something to a customer, consumer or business, I almost
| guarantee you that you can request an API, if the company
| you want to purchase from is large enough. They'll probably
| give you the API access for free, or as part of a signup
| fee and give you access to discounts. Sometimes that API
| might be an email, or a monthly Excel dump, but it's an
| API.
|
| When we're talking about sites that purely survive on tracking
| users and reselling their data, then yes, they aren't going
| to give you API access. Some sites, like Reddit, do offer
| it I think, but the price is going to be insane, reflecting
| their unwillingness to interact with users in this way.
|
| > Not "you" 'mrweasel personally
|
| Understood, but thank you :-)
| TeMPOraL wrote:
| > _It seems that we have very different types of
| businesses in mind. I really didn't consider tracking
| users and displaying ads, but I also don't think this is
| where these types of tools would be used._
|
| I wasn't thinking primarily about tracking and ads here
| either, when it comes to B2B automation. What I meant was
| e.g. shops automatically scraping competing stores on a
| continued basis, to adjust their own prices - a modern
| version of the old "send your employees incognito to the
| nearby stores and have them secretly note down prices".
| Then you also have comparison-shopping (pricing
| aggregators) sites that are after the same data, too.
|
| And then of course there's automated reviews (reading and
| writing), trying to improve your standing and/or sabotage
| competition. There's all kinds of more or less legit
| business intelligence happening, etc. Then there's
| wholesale copying of sites (or just their data) for SEO
| content farms, and... I could go on.
|
| Point being, it's not the people who want to streamline
| their own work, make access more convenient for
| themselves, etc. that are the badly-behaving actors and
| reasons for anti-bot defenses.
|
| > _If you have a sales funnel, as in you take orders and
| ship something to a customer, consumer or business, I
| almost guarantee you that you can request an API, if the
| company you want to purchase from is large enough. They'll
| probably give you the API access for free, or as part
| of a signup fee and give you access to discounts.
| Sometimes that API might be an email, or a monthly Excel
| dump, but it's an API._
|
| The problem from the POV of a regular user like me is, I'm
| not in this for business directly; the services I use are
| either too small to bother providing me special APIs, or
| I am too small for them to care. All I need is to
| streamline my access patterns to services I already use,
| perhaps consolidate it with other services (that's what
| MCP is doing, with LLM being the glue), but otherwise not
| doing anything disruptive to their operations. And I'm
| denied that, because... Bots Bad, AI Bad, Also Pay Us For
| Privilege?
|
| > _When we're talking about sites that purely survive on
| tracking users and reselling their data, then yes, they
| aren't going to give you API access. Some sites, like
| Reddit, do offer it I think, but the price is going to
| be insane, reflecting their unwillingness to interact
| with users in this way._
|
| Reddit is an interesting case because the changes to
| their API and 3rd-party client policies happened
| recently, and clearly in response to the rise of LLMs. A
| lot of companies suddenly realized the vast troves of
| user-generated content they host are valuable beyond just
| building marketing profiles, and now they try to lock it
| all up in order to extort rent for it.
| unixfox wrote:
| The extension enables debugging in your browser (a banner
| appears telling you about automation). It's possible to detect
| that in JavaScript.
|
| Hence why projects like this exist:
| https://github.com/Kaliiiiiiiiii-Vinyzu/patchright. They hide
| the debugging part from JavaScript.
| tntpreneur wrote:
| Thanks, the idea is OK but it is not working smoothly.
| cadence- wrote:
| Doesn't work on Windows:
|
| 2025-04-07T18:43:26.537Z [browsermcp] [info] Initializing server...
| 2025-04-07T18:43:26.603Z [browsermcp] [info] Server started and connected successfully
| 2025-04-07T18:43:26.610Z [browsermcp] [info] Message from client: {"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"claude-ai","version":"0.1.0"}},"jsonrpc":"2.0","id":0}
| node:internal/errors:983
|   const err = new Error(message);
|               ^
|
| Error: Command failed: FOR /F "tokens=5" %a in ('netstat -ano ^| findstr :9009') do taskkill /F /PID %a
|     at genericNodeError (node:internal/errors:983:15)
|     at wrappedFn (node:internal/errors:537:14)
|     at checkExecSyncError (node:child_process:882:11)
|     at execSync (node:child_process:954:15)
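
The "Message from client" entry in this log is a JSON-RPC 2.0 request, which is the framing MCP uses for `initialize` and later calls. A minimal Python sketch of building and checking such a message might look like this; the helper names `make_request` and `is_request` are hypothetical, and the field values are copied from the log.

```python
import json


def make_request(method: str, params: dict, request_id: int) -> str:
    """Serialize a JSON-RPC 2.0 request (hypothetical helper, not from any SDK)."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    )


def is_request(raw: str) -> bool:
    """Return True if raw parses as a JSON-RPC 2.0 request object."""
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return (
        isinstance(msg, dict)
        and msg.get("jsonrpc") == "2.0"
        and isinstance(msg.get("method"), str)
        and "id" in msg
    )


# Same shape as the logged client message.
initialize = make_request(
    "initialize",
    {
        "protocolVersion": "2024-11-05",
        "capabilities": {},
        "clientInfo": {"name": "claude-ai", "version": "0.1.0"},
    },
    request_id=0,
)
```

The server answers with a response object carrying the same `id`, which is how the client pairs replies with requests.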
| cadence- wrote:
| I was able to make it work like this:
|
| 1. Kill your Claude Desktop app
|
| 2. Click "Connect" in the browser extension.
|
|   3. Quickly start your Claude Desktop app.
|
| It will work 50% of the time - I guess the timing must be just
| right for it to work. Hopefully, the developers can improve
| this.
|
| Now on to testing :)
| namukang wrote:
| Can you try again?
|
| There was another comment that mentioned that there's an issue
| with port killing code on Windows:
| https://news.ycombinator.com/item?id=43614145
|
| I just published a new version of the @browsermcp/mcp library
| (version 0.1.1) that handles the error better until I can
| investigate further so it should hopefully work now if you're
| using @browsermcp/mcp@latest.
|
| FWIW, Claude Desktop currently has a bug where it tries to
| start the server twice, which is why the MCP server tries to
| kill the process from a previous invocation:
| https://github.com/modelcontextprotocol/servers/issues/812
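
The stack trace reported earlier in this thread comes from Node's `execSync`, which throws when the shell command exits with a non-zero status. One way to make that kind of cleanup non-fatal is to treat a failed command as "nothing to clean up". The sketch below shows the pattern in Python with `subprocess`; it is not the actual @browsermcp/mcp code, and `run_best_effort` is a hypothetical name.

```python
import subprocess
import sys


def run_best_effort(command: list[str]) -> bool:
    """Run a cleanup command; report failure instead of raising."""
    try:
        subprocess.run(command, check=True, capture_output=True)
        return True
    except (subprocess.CalledProcessError, OSError):
        # Non-zero exit or a missing executable: the stale process may
        # simply not exist, so the caller can carry on.
        return False


# Stand-ins for a real port-killing command: one exits 0, one exits 1.
ok = run_best_effort([sys.executable, "-c", "raise SystemExit(0)"])
failed = run_best_effort([sys.executable, "-c", "raise SystemExit(1)"])
```

Whether swallowing the error is acceptable depends on what happens next: if the port really is still in use, the later bind will fail and should be reported there.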
| cadence- wrote:
| It's working now with the 0.1.0 for me. But I will let you
| know if I experience any issues once I get updated to 0.1.1.
|
| Thanks, great job! I like it overall, but I noticed it has
| some issues entering text in forms, even on google.com. It's
| able to find a workaround and insert the searched text in the
| URL, but it would be nice if the entry into forms worked well
| for UI testing.
| graiz wrote:
| works better than puppet mcp for me but having issues with
| keyboard events and actions on some websites.
| xena wrote:
| Do you respect robots.txt so administrators can block this tool?
| randunel wrote:
| Do user agents doing work for users need to respect robots.txt?
| If yes, does chrome?
| what wrote:
| Any scraper is also a "user agent doing work for users".
|     Which ones should respect robots.txt?
| randunel wrote:
| Does the user agent fit the definition of a web crawler? If
| so, then observe robots.txt. This one does not, see
| https://en.m.wikipedia.org/wiki/Web_crawler
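
For reference, "observing robots.txt" in a crawler usually amounts to a check like the one below before each fetch. This sketch uses Python's standard `urllib.robotparser`; the rules and the `ExampleCrawler` agent name are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Made-up rules: block one named crawler from /private/, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleCrawler
Disallow: /private/

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(user_agent: str, url: str) -> bool:
    """Return True if the parsed rules let this agent fetch the URL."""
    return parser.can_fetch(user_agent, url)
```

The check is voluntary and keyed on a self-declared agent name, which is why the thread's question is about who ought to run it, not about enforcement.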
| canogat wrote:
| Should I be blocked if I ask Claude Desktop to lower the prices
| in all of my Craigslist ads by 10%?
| wifipunk wrote:
| Setting this up for claude desktop and cursor was alright. Works
| well out of the box with little setup, and I like that it
| attached to my active browser tab. Keep up the good work.
| ndr wrote:
| WARNING for Cursor users:
|
| Cursor is currently stuck using an outdated snapshot of the
| VSCode Marketplace, meaning several extensions within Cursor
| remain affected by high-severity CVEs that have already been
| patched upstream in VSCode. As a result, Cursor users unknowingly
| remain vulnerable to known security issues. This issue has been
| acknowledged but remains unresolved:
| https://github.com/getcursor/cursor/issues/1602#issuecomment...
|
| Given Cursor's rising popularity, users should be aware of this
| gap in security updates. Until the Cursor team resolves the
| marketplace sync issue, caution is advised when using certain
| extensions.
|
| I've flagged it here, apologies for the repost:
| https://news.ycombinator.com/item?id=43609572
| rs186 wrote:
| I am surprised that the VSCode team hasn't gone after them for
| mirroring the marketplace, as the Visual Studio team made it
| very clear that they don't want anybody to do that -- it is
| _their_ marketplace.
| SSLy wrote:
| It seems that there is one sane PM left at VScode who knows
|     that such a move would only lead to MSFT losing more PR. And
| anti-trust scrutiny?
| JackYoustra wrote:
| Why? This seems fine.
| pknerd wrote:
| So why do I need an editor (Cursor)? How does a non-coder use
| it?
| rahimnathwani wrote:
| If you're a non-coder, use it with Claude Desktop.
| cadence- wrote:
| How does this compare to Anthropic's Computer Use?
| thenaturalist wrote:
| Crazy, in looking up some info on the web and creating a
| Spreadsheet on Google Sheets to insert the results, it worked
| almost perfectly the first time and completely failed
| subsequently on 8-10 different tries.
|
| Is there an issue with the lag between what is happening in the
| browser and the MCP app (in my case Claude Desktop)?
|
| I have a feeling the first time I tried it, I was fast enough
| clicking the "Allow for this chat" permissions, whereas by the
| time I clicked the permission on subsequent chats, the LLM just
| reports "It seems we had an issue with the click. Let me try
| again with a different reference.".
|
| Actions which worked flawlessly the first time (rename a Google
| spreadsheet by clicking on the title and inputting the name) fail
| 100% of subsequent attempts.
|
| Same with identifying cells A1, B1, etc. and inserting into the
| rows.
|
| Almost perfect on 1st try, not reproducible in 100% of attempts
| afterwards.
|
| Kudos to how smooth this experience is though, very nice setup &
| execution!
|
| EDIT 2: The lag & speed to click the allow action make it
| seemingly unusable in Claude Desktop. :(
| otherayden wrote:
| Such a rich UI like google sheets seems like a bad use case for
| such a general "browser automation" MCP server. Would be cool
| to see an MCP server like this, but with specific tools that
| let the LLM read and write to google sheets cells. I'm sure it
| would knock these tasks out of the park if it had a more
| specific abstraction instead of generally interacting with a
| webpage
| mkummer wrote:
| Agreed, I'd been working on a Google Sheets specific MCP last
| week - just got it published here:
| https://github.com/mkummer225/google-sheets-mcp
| rahimnathwani wrote:
| This is cool. You should submit this as a 'Show HN'.
|
| Also consider publishing it so people can use it without
| having to use git.
| freeone3000 wrote:
| Publishing it _where_? It can't be a github page, it's
| too complex; anything else incurs real costs.
| rahimnathwani wrote:
| I mean publish it on the npm registry
| (https://www.npmjs.com/signup). That way, it would be
| easy to install, just by adding some lines to
|         claude_desktop_config.json:
|
|           {
|             "mcpServers": {
|               "ragdocs": {
|                 "command": "npx",
|                 "args": ["-y", "@qpd-v/mcp-server-ragdocs"],
|                 "env": {
|                   "QDRANT_URL": "http://127.0.0.1:6333",
|                   "EMBEDDING_PROVIDER": "ollama",
|                   "OLLAMA_URL": "http://localhost:11434"
|                 }
|               }
|             }
|           }
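
The same pattern would presumably apply to Browser MCP itself. Here is a sketch in Python that merges one more entry into the `mcpServers` section of a config dictionary. The `browsermcp` key is an arbitrary label chosen here, and the `npx @browsermcp/mcp@latest` command is inferred from the package name mentioned elsewhere in this thread, so check the project's setup docs before relying on it.

```python
import json


def add_server(config: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of config with one more entry under "mcpServers"."""
    servers = dict(config.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    return {**config, "mcpServers": servers}


existing = {
    "mcpServers": {
        "ragdocs": {"command": "npx", "args": ["-y", "@qpd-v/mcp-server-ragdocs"]}
    }
}
# Assumed entry for Browser MCP; the key name is illustrative only.
updated = add_server(existing, "browsermcp", "npx", ["@browsermcp/mcp@latest"])
print(json.dumps(updated, indent=2))
```

Building the structure in code and serializing it avoids the trailing-comma and unbalanced-brace mistakes that are easy to make when editing the JSON by hand.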
| throwaway314155 wrote:
| What you're experiencing is commonly referred to as "luck".
| It's the same reason people consistently think newer versions
| of ChatGPT are nerfed in some way. In reality, people just got
| lucky originally and have unrealistic expectations based on
| this originally positive outcome.
|
| There's no bug or glitch happening. It's just statistically
| unlikely to perform the action you wanted and you landed a good
| dice roll on your first turn.
| lizardking wrote:
| For me it can't click anywhere on google sheets. I get the
| following error
|
| --Error: Cannot access a chrome-extension:// URL of different
| extension
| Gehinnn wrote:
| Would be nice if it could use the Accessibility Tree from chrome
| dev tools to navigate the page instead of relying on screenshots
| (https://developer.chrome.com/blog/full-accessibility-tree)
| mgraczyk wrote:
| In fact you have it backwards. It has no screenshots at the
| moment, only the accessibility tree
| throwaway81523 wrote:
| Can these things automatically solve recaptcha? That's the only
| AI browser feature that I have a real use for.
| SparkyMcUnicorn wrote:
| https://github.com/dessant/buster
| jayunit wrote:
| awesome! For the Cursor / React / Click to Add 2 example, can we
| also have it write a unit/e2e regression test?
| jayunit wrote:
| author replied on Twitter:
|
| > that's a great use case! the aria snapshot that browser mcp
| generates is enough to write tests for playwright using its
| role-based locators, but i may add a get_page_html tool in the
| same way that they're considering:
| https://github.com/microsoft/playwright-mcp/issues/103
|
| https://x.com/roadtoramen/status/1909356255866733044
| otherayden wrote:
| I literally started working on the same exact idea last night
| haha. Great work OP. I'm curious, how are you feeding the web
| data to the LLM? Are you just passing the entire page contents to
| it and then having it interact with the page based on CSS
| selectors/xpath? Also, what are your thoughts on letting it do
| its own scripting to automate certain tasks?
| andy_ppp wrote:
| When I go to a shopping website I want to be able to tell my
| browser "hey please go through all the sideboards on this list
| and filter out for the ones that are larger than 155cm and
| smaller than 100cm, prioritise the ones with dark wood and space
| for vinyl records which are 31.43cm tall" for example.
|
| Is there any browser that can do this yet as it seems extremely
| useful to be able to extract details from the page!
| mfkhalil wrote:
| Hey, we're working on MatterRank which is pretty similar to
| this but currently works on web search. (e.g. I want to
| prioritize results that talk about X and have Y bias and I want
| to deprioritize those that are trying to sell me something).
| Feel free to try it out at https://matterrank.ai
|
| Would also be interested in hearing more about what you're
| envisioning for your use case. Are you thinking a browser
| extension that acts on sites you're already on, or some sort of
| shopping aggregator that lets you do this, or something else
| entirely?
| Niksko wrote:
| Not OP but I definitely sympathise with them. I don't know
| how practical it is to implement or how profitable it would
|     be, but the problem I often have is this:
|
|     * I have something I want to buy and have specific needs for
|     it (height, color, shape, other properties)
|
|     * I know that there's a good chance the website I'm on sells
|     a product that meets those needs (or possibly several such
|     that I'd want to choose from)
|
|     * my criteria are more specific than the filters available on
|     the site e.g. I want a specific length down to a few cm
|     because I want the biggest thing that will fit in a fixed
|     space
|
|     * crucially for an AI use case: the information exists on the
|     individual product pages. They all list dimensions and
|     specifications. I just don't want to have to go through them
|     all.
|
| Example: find me all of the desks on IKEA that come in light
| coloured wood, are 55 inches wide, and rank them from deepest
| to shallowest. Oh, and make sure they're in stock at my
| nearest IKEA, or are delivering within the next week.
| bravura wrote:
| When doing interior decoration, I am definitely interested in
| finding objects that fit very specific prompts.
| unixfox wrote:
| You could do that with browser-use: https://browser-use.com/
| justanotheratom wrote:
| neat, but instead of asking me to install browser extension, can
| you just bundle a browser in the MCP server?
| StevenNunez wrote:
| I feel like I slept for a day and now MCPs are everywhere... I
| don't know what MCPs are and at this point I'm too afraid to ask.
| jastuk wrote:
| And the worst part is that it opens a pandora's box of
| potential exploits;
| https://elenacross7.medium.com/%EF%B8%8F-the-s-in-mcp-stands...
| halJordan wrote:
|     At the risk of it sounding like I support theft; the
|     automobile, you know, enabled the likes of Bonnie and Clyde
|     and that whole era of lawlessness. Until the FBI and crossing
|     county lines became a thing.
|
|     So I'm not sure I'd give up the sum total progress of the
|     automobile just because the first decade was a bad one
| joshwarwick15 wrote:
|     Most of these are not a real concern with remote servers with
|     OAuth. If you install the PayPal MCP server from
|     im-deffo-not-hacking-you.com rather than
|     https://mcp.paypal.com/sse, it's the same security model as
|     anything else online...
|
| The article also reeks of LLM ironically
| tuananh wrote:
| it still is. if user has 1 bad tool, it's done!
|
| https://invariantlabs.ai/blog/mcp-security-notification-
| tool...
| joshwarwick15 wrote:
|         It's the same security model as NPM/left pad yep, but
|         consumers still use electron apps? It's a novel attack
|         method, but it's not a novel attack surface
| TeMPOraL wrote:
| That's not fault of MCP though, that's the fault of vendors
| peddling their MCPs while clinging to the SaaS model.
|
| Yes, MCP is a way to streamline giving LLMs ability to run
| arbitrary code on your machine, however indirectly. It's
| meant to be used on "your side of the airlock", where you
|     trust the things that run. _Obviously_ it's too powerful for
| it to be used with third-party tools you neither trust nor
| control; it's not that different than downloading random
| binaries from the Internet.
|
| I suppose it's good to spell out the risks, but it doesn't
| make sense blaming MCP itself, because those risks are
| fundamental aspects of the features it provides.
| kmacdough wrote:
| It's not blame, but it's a striking reality that needs to
| be kept at the forefront.
|
| It introduces a substantial set of novel failure modes,
| like cross-tool shadowing, which aren't obvious to most
| folks. Making use of any externally developed tooling --
| even open source tools on internal architecture -- requires
| more careful consideration and analysis than most would
| expect. Despite the warnings, there will certainly be major
| breaches on these lines.
| oulipo wrote:
| It's just a way to provide a "library of methods" / API that
| the LLM models can "call", so basically giving them method
| names, their parameters, the type of the output, and what they
| are for,
|
| and then the LLM model will ask the MCP server to call the
| functions, check the result, call the next function if needed,
| etc
|
| Right now if you go to ChatGPT you can't really tell it "open
| Google maps with my account, search for bike shops near NYC,
|   and grab their phone numbers", because all it can do is reply
| in text or make images
|
| with a "browser MCP" it is now possible: ChatGPT has a way to
| tell your browser "open Google maps", "show me a screenshot",
| "click at that position", etc
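
A minimal sketch of that "library of methods" idea, in Python. It is not the MCP SDK: the `TOOLS` registry, the `open_url` tool, and the `call_tool` dispatcher are all hypothetical names, and a real server would exchange JSON-RPC messages with the client instead of calling functions directly.

```python
from typing import Any, Callable

# Hypothetical registry: each tool has a description the model can read
# and a Python function the server runs when the model asks for it.
TOOLS: dict[str, dict[str, Any]] = {}


def tool(name: str, description: str, params: dict[str, str]) -> Callable:
    """Register a function as a callable tool with a name and parameter types."""
    def register(func: Callable) -> Callable:
        TOOLS[name] = {"description": description, "params": params, "func": func}
        return func
    return register


@tool("open_url", "Navigate the browser to a URL", {"url": "string"})
def open_url(url: str) -> str:
    return f"navigated to {url}"  # stand-in for real browser control


def list_tools() -> list[dict[str, Any]]:
    """What the model sees: names, descriptions, parameters, no code."""
    return [
        {"name": n, "description": t["description"], "params": t["params"]}
        for n, t in TOOLS.items()
    ]


def call_tool(name: str, arguments: dict[str, Any]) -> Any:
    """Run the tool the model asked for, rejecting unknown names or arguments."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    spec = TOOLS[name]
    unexpected = set(arguments) - set(spec["params"])
    if unexpected:
        raise ValueError(f"unexpected arguments: {sorted(unexpected)}")
    return spec["func"](**arguments)
```

The model only ever picks a name and arguments from `list_tools()`; the host application decides whether and how `call_tool` actually runs.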
| throwaway314155 wrote:
| > with a "browser MCP" it is now possible: ChatGPT has a way
| to tell your browser "open Google maps", "show me a
| screenshot", "click at that position", etc
|
| It seems strange to me to focus on this sort of standard well
| in advance of models being reliable enough to, ya know,
|     actually be able to perform these operations on behalf of the
| user with any sort of strong reliability that you would need
| for widespread adoption to be successful.
|
| Cryptocurrency "if you build it they'll come" vibes.
| acedTrex wrote:
| The speed that every major LLM foundational model provider
| has jumped on this bandwagon feels VERY artificial and
| astro turfy...
| XCSme wrote:
| Maybe because the LLM improvements haven't been that good
| in the last year, they needed some new thing to hype
| it/market it.
|
| EDIT: Don't get me wrong, the benchmark scores are indeed
| higher, but in my personal experience, LLMs make as many
| mistakes as they did before, still too unreliable to use
| for cases where you actually need a factually correct
| answer.
| acedTrex wrote:
| This is in my opinion exactly what it is. A bunch of
| people throwing stuff at the wall trying to show
| "impact."
| taberiand wrote:
| I think MCPs compensate for the unreliability issue by
| providing a minimal and well defined interface to a
| controlled set of actions. That way, the llm doesn't have
| to be as reliable thinking what it needs to do and in
| acting, just in choosing what to do from a short list.
| throwaway314155 wrote:
| You can provide an MCP for Pokemon Red, but Claude will
| still flounder for weeks, making absurd mistakes on a
| game literally designed for children.
|
| Believe me. It's not there yet.
| taberiand wrote:
| Is there an MCP for pokemon red?
| throwaway314155 wrote:
|         Not that I'm aware of, but that actually would be an
| interesting project.
|
| I was referring more broadly to ClaudePlaysPokemon, a
| twitch stream where claude is given tool calling into a
| Gameboy Color emulator in order to try to play Pokemon.
|         It has slowly made progress and I recommend looking at
|         the stream to see just how flawed LLMs are currently for
| even the shortest of timelines w.r.t. planning.
|
|         I compared the two because the tool calling API here is
| similar enough to an MCP configuration with the same
| hooks/tools (happy to be corrected on that though)
| dimitri-vs wrote:
|     You actually can, it's called Operator and it's a complete
| waste of time, just like 99% of agents/MCPs.
| oulipo wrote:
| Operator is basically MCP...
| mattfrommars wrote:
|     Isn't the idea of AI agents talking to each other that you
|     tell the LLM to reply in, say, JSON, with parameter values
|     that map to, say, a function in Python code? Then, given the
|     context {prompt}, the LLM will be able to call said function?
|
| Is this what 'calling' is?
| oulipo wrote:
|       Yes exactly. MCP just formalizes this a bit better
| hedgehog-ai wrote:
| I know what you mean, I think MCP is being widely adopted but
|   it's not grassroots.. it's a quick entry to this market by an
|   established AI company trying to dominate the mind/market share
|   of developers before consensus can be reached by developers.
| orbital-decay wrote:
| MCP is a standard to plug useful tools into AI models so they
| can use them. The concept looks confusingly reversed and non-
| obvious to a normal person, although devs don't see this
| because it looks like their tooling.
| whalesalad wrote:
| It's RPC specifically for an LLM. But yes it's the new soup de
| jour trend sweeping the globe.
| sdotdev wrote:
| Still slightly confused on what MCPs are but looking at this it
| does look useful
| esafak wrote:
| A protocol (the P in MCP) for LLMs to use tools.
| aryehof wrote:
| A plugin protocol that allows "applications" to interact with
| LLMs.
| darepublic wrote:
| wouldn't it be for LLMs to interact with applications?
| lxe wrote:
| This one also uses aria snapshots formatted as yaml. This will
| quickly exceed context limits.
| revskill wrote:
| Can u expose the sdk as a react component to be used inside an
| app ?
| mvdtnz wrote:
| Is anyone successfully running MCPs / Claude Desktop on Linux?
| iDon wrote:
| I am running this OK in Ubuntu 2404 :
| https://github.com/aaddrick/claude-desktop-debian Claude
| Desktop for Debian-based Linux distributions
|
| From Claude I have connected to these MCP servers OK :
| @modelcontextprotocol/server-filesystem,
| @executeautomation/playwright-mcp-server.
|
| I have connected to OP's extension (browsermcp.io) from vsCode
| (and clicked 1 tab button OK), but not from Claude desktop so
| far (I get Cannot find module 'node:path'; which is require-d
| in npm/lib/cli.js; tried node 18,20,22; some suggestions here :
| https://medium.com/@aleksej.gudkov/error-cannot-find-module-...
| ).
| pavelfeldman wrote:
| I mean no disrespect, but this looks like an outdated clone of
| https://github.com/microsoft/playwright-mcp
|
| https://github.com/microsoft/playwright-mcp/blob/main/src/to...
| https://github.com/BrowserMCP/mcp/blob/main/src/tools/tool.t...
| marifjeren wrote:
| From the Browser MCP README.md:
|
| > Credits: Browser MCP was adapted from the Playwright MCP
| server
| namukang wrote:
| Hey Pavel, this is Namu, the creator of Browser MCP.
|
| You're right, this is an adaptation of Playwright MCP to
| automate the user's local browser as mentioned in the GitHub
| README and here:
|
| -
| https://github.com/BrowserMCP/mcp/blob/3e6824de6f36eba7d2d3b...
|
| - https://news.ycombinator.com/item?id=43613905
|
|   Thanks for all your work on Playwright and Playwright MCP. I'm
| a big fan!
|
| (For those not familiar, Pavel is the largest contributor to
| both Playwright and Playwright MCP:
| https://github.com/microsoft/playwright/graphs/contributors,
| https://github.com/microsoft/playwright-
| mcp/graphs/contribut...)
| pavelfeldman wrote:
| Hi Namu, all good! Feel free to send us the patches and work
| upstream, would be happy to see you on board!
| doug_life wrote:
| This may be obvious to most here, but you need Node.js installed
| for the MCP server to run. This critical detail is not in the set
| up instructions.
| namukang wrote:
| Added!
|
| https://docs.browsermcp.io/setup-server#node-js
| hliyan wrote:
| Ideally, shouldn't this be the native experience of most "sites"
| on the internet? We've built an entire user experience around
| serving users rich, two dimensional visual content that is not
| machine-readable and are now building a natural language command
| line layer on top of it. Why not get rid of the middleware and
| present users a direct natural language interface to the
| application layer?
| tuananh wrote:
| i want to add this for my project (which uses wasm) but
| rustlang/socket2 WASI support is not merged yet. after that rust
| CDP will work.
| toutiao6 wrote:
| Interesting -- I've been experimenting with MoonBit for Wasm
| builds, and the lack of mature WASI networking is a recurring
| blocker there too. The moment tools like socket2 or HTTP
| clients land with Preview2, we might see real "Wasm-native"
| browser automation.
|
| It's wild to think we could one day write browser automation in
| a GC-backed language, compile to Wasm, and ship it without Node
| or Bash at all.
| tuananh wrote:
| > The moment tools like socket2 or HTTP clients land with
| Preview2
|
| i'm waiting for that as well. my other options are
|
| - either bind a host function to manage wss connection to
| wasm. fork a CDP lib to use that.
|
| - create a proxy between http/wss maybe. And then fork a CDP
| lib to use http proxy i think.
| webprofusion wrote:
| Or just use Playwright MCP:
| https://github.com/microsoft/playwright-mcp
| knes wrote:
| This is great, especially for debugging frontend issues on
| localhost or staging.
|
| Also works flawlessly with augmentcode.com too!
| makingstuffs wrote:
| I don't see how an MCP can be useful for browsing the net and
| doing things like shopping as has been suggested. Large companies
| such as CloudFlare have spent millions on, and made a business
| from, bot detection and blocking.
|
| Do we suppose they will just create a backdoor to allow _some_
| bots in? If they do that how long will it be before other bots
| impersonate them? It seems like a bit of a fad from my small
| mind.
|
| Suppose it does become a thing, what then? We end up with an
| internet which is heavily optimised for bots (arguably it already
| is to an extent) and unusable for humans?
|
| Wild.
| kraftman wrote:
| There are already plenty of services that provide residential
| proxies and captcha bypass pretty cheaply.
|
| https://brightdata.com/pricing/web-unlocker
| https://2captcha.com/pricing
| TeMPOraL wrote:
| > _Suppose it does become a thing, what then? We end up with an
| internet which is heavily optimised for bots (arguably it
| already is to an extent) and unusable for humans?_
|
| As opposed to the Web we now have, which is heavily optimized
| for... _wasting human life_.
|
| What you're asking for, what "large companies such as
| CloudFlare have spent millions on", is verifying that on the
| other end of the connection is a web browser, and behind that
| web browser there is a human being that's being made to
| needlessly suffer and waste their limited lifespans, as they
| tediously work their way through the UI maze like a good little
| lab rat, watching ads at every turn of the corridor, while
| being constantly surveilled.
|
| Or do you believe there is some _other_ reason why you should
|   care about whether you're interacting with a "human" (really:
| an _user agent_ called "web browser") vs. "not human" (really:
| any other user agent)?
|
| The relationship between the commercial web and its users is
| antagonistic - businesses make money through friction, by
| making it more difficult for users to accomplish their goals.
| That's why we never got the era of APIs and web automation _for
|   users_. That's why we're dealing with tons of bespoke shitty
| SPAs instead of consistent interfaces - because no store wants
| to make it easy for you to comparison-shop, or skip their
| upsells, or efficiently search through the stock; no news
| service wants you to skip ads or make focused searches, etc.
|
| As users, we've lost the battle for APIs and continue to be
| forced to use the "manual web" (with active cooperation of the
| browser vendors, too). MCP feels promising because we're in a
| moment in time, however brief, where LLMs can _navigate the
| "manual web" for us_, shielding us from all the malicious
| bullshit (ads, marketing copy, funneling, call to actions,
| confusing design, dark patterns, less dark patterns, the fact
| that your store is a bloated SPA instead of an endpoint for a
| generic database querying frontend, and so on) while remaining
| mostly impervious to it. This will not last long - the vendors
| de-facto ruling the web have every reason to shut it down (or
|   turn it around and use LLMs _against us_). But for now, it
| works.
|
| _Adversarial interoperability_ is the name of the game. LLMs,
| especially combined with tool use (and right tools), make it
| much easier and much more accessible than ever before. For
| however brief a moment.
| makingstuffs wrote:
| Sorry it wasn't entirely clear that I was by no means saying
| the web in its current form is anything close to what it
| could/should be. My main point was that, by making backdoors
| for MCPs there will be a new possible entry point for bad
| actors by exploiting said backdoor.
|
| As for the optimisation to _waste human life_ I do agree but
| the reality is that the sites which waste the majority of
| human life/time are the ones which would not be automated by
| the MCP and would, ultimately, see more 'real' usage by
| virtue of the fact that your average human will have more
| time to mindlessly scroll their favourite echo-chamber.
|
| Then we have the whole other debate of whether we really
|     believe that the VC funders who are largely responsible for
| the current state of the web will continue pumping money into
| something which would hurt their bottom line from another
| angle?
| m11a wrote:
| > Do we suppose they will just create a backdoor to allow
| _some_ bots in?
|
| That, and maybe they will as CF seem quite big on MCP.[0] Or
| people just bypass the bot detection. It's already not terribly
| difficult to do; people in the sneaker bot and ticket scalping
| communities have long had bypasses for all the major companies.
|
| I mean, we can all imagine bad use-cases of bots, but there's
| also the pros: the internet wastes loads of human time. I still
| remember needing to browse marketplaces real estate listings
| with terrible search and notification functionality to find a
| flat... _shudders_. Unbelievable amount of hours wasted.
|
| If fewer people are able to build bots that can index a larger
| number of sites and give better searching capabilities, for
| instance, where sites are unable to provide this, I'm
| personally all for it. For many sites, it's that they lack the
| in-house development expertise and probably they wouldn't even
| mind.
|
| [0]: https://developers.cloudflare.com/agents/model-context-
| proto... etc
| jedimastert wrote:
|   Most things that do this kind of fingerprinting bot detection
| aren't looking for a browser that's pretending to be a human,
| they're looking for other programs that are pretending to be a
| browser.
| rmac wrote:
| [!warning!]
|
| 1) this project's chrome extension sends detailed telemetry to
| posthog and amplitude:
|
| - https://storage.googleapis.com/cobrowser-images/telemetry.pn...
|
| - https://storage.googleapis.com/cobrowser-images/pings.png
|
| 2) this project includes source for the local mcp server, but not
| for its chrome extension, which is likely bundling
| https://github.com/ruifigueira/playwright-crx without attribution
|
| super suss
| bn-l wrote:
| The only chrome extensions you should install are ones you can
| build yourself from source.
| neycoda wrote:
| ... And have reviewed and understand completely
| EGreg wrote:
| So ... pretty much none
|
| Keep in mind, extensions can update themselves at any time,
| including when they're bought out by someone else. In fact,
| I bet that's a huge draw... imagine buying an extension
| that "can read and modify data on all your websites" and
| then pushing an update that, oh I dunno, exfiltrates
| everyone's passwords from their gmail. How would most
| people even catch that?
|
| DO NOT have any extensions running by default except "on
| click".
|
| There should be at least some kind of static checker of
| extensions for their calls to fetch or other network APIs.
| The Web is just too permissive with updating code, you've
| got eval and much more. It would be great if browsers had
| only a narrow bottleneck through which code could be
| updated, and would ask the user first.
|
| (That wouldn't really solve everything since there can be
| sleeper code that is "switched on" with certain data coming
| over the wire, but better than what we have now.)
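
A very rough idea of what such a checker could look like, in Python. It only scans extension source text for a few network-capable or dynamic-code APIs; the pattern list is an assumption made for this sketch, it is easy to evade (which is the sleeper-code caveat in the comment), and it is no substitute for real static analysis.

```python
import re

# Hypothetical pattern list: APIs that can move data off the machine or
# run code that was not in the reviewed bundle.
SUSPECT_PATTERNS = {
    "fetch": re.compile(r"\bfetch\s*\("),
    "XMLHttpRequest": re.compile(r"\bnew\s+XMLHttpRequest\b"),
    "WebSocket": re.compile(r"\bnew\s+WebSocket\b"),
    "sendBeacon": re.compile(r"\bnavigator\.sendBeacon\s*\("),
    "eval": re.compile(r"\beval\s*\("),
}


def scan_source(source: str) -> dict[str, list[int]]:
    """Map each suspect API name to the 1-based line numbers where it appears."""
    hits: dict[str, list[int]] = {}
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name, pattern in SUSPECT_PATTERNS.items():
            if pattern.search(line):
                hits.setdefault(name, []).append(lineno)
    return hits


# Made-up extension snippet used only to exercise the scanner.
sample = """\
const id = getDeviceId();
fetch("https://example.invalid/collect", { method: "POST", body: id });
const result = eval(payload);
"""
report = scan_source(sample)
```

A report like this is a prompt for human review, not a verdict: a match says where to look, and the absence of matches proves little.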
| metadat wrote:
| It would be interesting if you could easily install
| browser extensions via a source repository URL (e.g.
| GitHub, or any git URL), then at least there would be
| more transparency about who/what you are trusting by
| installing it. Blindly trusting a mostly anonymous chrome
| store "install" button seems insane, since they don't do
| any significant policing. Wasn't the promise of safety
| one of the primary reasons Google started the chrome
| store?
| econ wrote:
|         Like user.script/grease monkey. It used to be that you
| could publish a reasonably large script and someone would
| review it. Even better was to start out simple then
| gradually update it so that existing users can continue
| reviewing by looking at the changes.
|
| I think the permission system should be much more
| complicated so that the user gets a prompt that explains
| what is needed and why.
|
| Furthermore there should be [paid] independent reviewers
| to sign off on extensions. This adds a lot of
|         credibility, especially to a first time publication
| without users. That would also give app stores someone to
| talk to before deleting something. Nefarious actors
| working for app stores can have their credibility
| questioned.
| bn-l wrote:
| > So ... pretty much none
|
| You'd be surprised. It describes all the extensions I
| use.
| rahimnathwani wrote:
|         > Keep in mind, extensions can update themselves at any
|         time
|
| GP suggested only installing extensions you can build
| yourself from source. Most extensions that auto update do
| so via the Chrome store. If you install an extension from
| source, that won't happen.
| nlarew wrote:
| "detailed" is an anonymized deviceId and a counter of tool
| calls? Heaven forbid an app want to get some basic insights
| into how people use it.
| observationist wrote:
| This automatic sense of entitlement to surveil users is the
| absolute embodiment of the banality of evil.
|
| It's 2025 - we want informed consent and voluntary
| participation with the default assumption that no, we do not
| want you watching over our shoulders, and no, you are not
| entitled to covertly harvest all the data you want and
| monetize that without notifying users or asking permissions.
| The whole ToS gotcha game is bullshit, and it's way past time
| for this behavior to stop.
|
| Ignorance and inertia bolstering the status quo doesn't make
| it any less wrong to pile more bullshit like this onto the
| existing massive pile of bullshit we put up with. It's still
| bullshit.
| nlarew wrote:
| You're making a huge jump from "gathering anonymous
| counters to understand how many people use the thing" to
| "harvest all the data you want and monetize it".
|
| If they were tracking my identity across sites and actually
| selling it to the highest bidder that's one thing that
| we'll definitely agree on. This is so so far from that.
|
| You're welcome to build and use your own MCP browser
| automation if you're so hostile to the developer that built
| something cool and free for you to use.
| tomrod wrote:
| Correct. Telemetry should _always_ be opt-in, with an explicit
| and easy choice not to engage.
|
| Any other mode of operation is morally bankrupt.
| nlarew wrote:
| Really? The hyperbole does not help anyone here.
|
| I don't sign a term sheet when I order at McDonald's but you
| can be damn sure they count how many Big Macs I order. Does
| that make them morally bankrupt? Or is it just a normal
| business operation that is actually totally reasonable?
| namukang wrote:
| Hey, creator of Browser MCP here.
|
| 1. Yes, the extension uses an anonymous device ID and sends an
| analytics event each time a tool is called. You can inspect the
| network traffic to verify that zero personalized or identifying
| information is sent.
|
| I collect anonymized usage data to get an idea of how often
| people are using the extension in the same way that websites
| count visitors. I split my time between many projects and
| having a sense of how many active users there are is helpful
| for deciding which ones to focus on.
|
| 2. The extension is completely written by me, and I wrote in
| this GitHub issue why the repo currently only contains the MCP
| server (in short, I use a monorepo that contains code used by
| all my extensions and extracting this extension and maintaining
| multiple monorepos while keeping them in sync would require
| quite a bit of work):
| https://github.com/BrowserMCP/mcp/issues/1#issuecomment-2784...
|
| I understand that you're frustrated with the way I've built
| this project, but there's really nothing nefarious going on
| here. Cheers!
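The anonymized usage counting described in point 1 above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Browser MCP's actual code: the class `AnonymousTelemetry`, the method `record_tool_call`, and the payload fields `device_id`, `event`, and `tool` are all hypothetical names. The idea is that the only identifier is a random UUID generated on the device, and each event carries nothing beyond that ID and the tool name.

```python
import uuid
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class AnonymousTelemetry:
    """Hypothetical client: a random device ID plus per-tool call counts."""

    # Random UUID generated locally; it is not derived from any user data.
    device_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    # Local tally of how many times each tool has been called.
    counts: Counter = field(default_factory=Counter)

    def record_tool_call(self, tool_name: str) -> dict:
        """Count one tool call and return the event that would be sent."""
        self.counts[tool_name] += 1
        # Assumed payload shape: no URLs, page content, or account details.
        return {
            "device_id": self.device_id,
            "event": "tool_call",
            "tool": tool_name,
        }


telemetry = AnonymousTelemetry()
event = telemetry.record_tool_call("navigate")
print(sorted(event.keys()))
```

Inspecting the returned dictionary is the in-code analogue of checking the network traffic: it should contain those three keys and nothing else.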
| Trias11 wrote:
| When people see "I collect" they won't even bother reading
| further.
|
| This is a showstopper.
|
| Noble reasons won't matter.
|
| Spyware perception.
| wyldberry wrote:
| This seems to be the opposite of what happens in reality.
| asaddhamani wrote:
| Hey, as a maker, I get it. You spent time building something,
| and you want to understand how it gets used. If you're not
| collecting personal info, there is nothing wrong with this.
|
| Knee-jerk reactions aren't helpful. Yes, too much tracking is
| not good, but some tracking is definitely important to
| improving a product over time and focusing your efforts.
| mrwww wrote:
| How does it compare to Playwright MCP?
| metadat wrote:
| Bot Detection Evasion is becoming an increasingly relevant topic.
| Even for non-abusive automation, it's now a necessary
| consideration.
|
| Interesting research and reading via the HN search portal:
| https://hn.algolia.com/?q=bot+detection
| josefrichter wrote:
| What I used this for:
|
| "Go to https://news.ycombinator.com/upvoted?id=josefrichter,
| summarize what topics I am interested in, and then from the
| homepage pick articles I might be interested in."
|
| Works like a charm.
___________________________________________________________________
(page generated 2025-04-08 23:01 UTC)