[HN Gopher] Detecting AI agent use and abuse
       ___________________________________________________________________
        
       Detecting AI agent use and abuse
        
       Author : mattmarcus
       Score  : 102 points
       Date   : 2025-02-14 16:18 UTC (6 hours ago)
        
 (HTM) web link (stytch.com)
 (TXT) w3m dump (stytch.com)
        
       | ceejayoz wrote:
       | I fully expect captchas to incorporate "type the racial slur /
       | death threat into the box" soon, as the widely available models
       | will balk at it.
        
         | dexwiz wrote:
         | Anyone who cares about breaking captchas would just run their
         | own model.
        
          | Der_Einzige wrote:
          | Gemini unironically can have all of its safety stuff turned
          | off, and open-access models like DeepSeek can be trivially
          | uncensored (if they aren't already uncensored by default,
          | like Mistral).
          | 
          | That's not good enough, but it is funny to imagine.
        
         | deadbabe wrote:
         | It's ironic that some of the first intelligent chatbots very
         | quickly became Nazis and racists, and now we've swung the other
         | way.
        
           | jowea wrote:
           | I am quite sure the people developing the current chatbots
           | were well aware of what happened with Tay etc. I'd bet it's
           | part of the reason for the safety stuff.
        
         | riskable wrote:
         | "What major event happened in 1989 at Tienanmen Square,
         | Beijing, China?"
        
         | reverendsteveii wrote:
         | the LLMs are trained on data stolen from the internet. There's
         | no racial slur they don't know, there's no death threat they
         | can't deliver. Currently our best LLMs are generating new
         | racial slurs to deploy in our eternal quest to make the
         | internet worse. You may have never heard the term "Chapingle"
          | before, but don't use it in front of a Lithuanian person after
          | the year 2028 unless you want to get punched in the mouth.
        
       | diggan wrote:
       | > it could present unacceptable risks for application developers
       | or be used as a method for malicious attacks (e.g. credential
       | stuffing or fake account creation).
       | 
       | The article seems to want to distinguish between "bad" and "good"
       | bots, yet beyond the introduction, seems to treat them exactly
       | the same.
       | 
       | Why are website authors so adamant I need to use whatever client
       | they want to consume their content? If you put up a blog online,
       | available publicly, do you really care if I read it in my
       | terminal or via Firefox with uBlock? Or via an AI agent that
       | fetches the article for me and tags it for me for further
       | categorization?
       | 
        | It seems like suddenly half the internet forgot about the term
        | "user-agent". Until recently that was almost always a browser,
        | and sometimes a feed reader, which seemed acceptable. But now we
        | have a new user-agent available, "AI Agents", that somehow is
        | unacceptable and should be blocked?
       | 
        | I'm not sure I agree with the premise that certain user-agents
        | should be blocked, and I'll probably continue to let everyone
        | choose their own user-agent when using my websites. It's
        | literally one of the reasons I use the web and internet in the
        | first place.
        
         | hansonkd wrote:
         | > It seems like suddenly half the internet forgot about the
         | term "user-agent", which up until recently was almost always
         | our browsers, but sometimes feed readers, which was acceptable
         | it seems.
         | 
          | Was it really "suddenly"? It seems like for the past decade
          | there has been an ongoing push to make everyone use Chromium-
          | based browsers. I remember 10-15 years ago you would get
          | blocked for not using IE or whatever, even though the site
          | worked fine and there was no technical reason for the block.
          | 
          | It was over 12 years ago when Google effectively killed RSS to
          | prevent alternative methods of access.
        
           | diggan wrote:
           | > I remember 10-15 years ago you would get blocked for not
           | using IE or whatever, even though the site worked fine and
           | there was no technical reason for the block
           | 
           | Reminds me of when I discovered that Google Inbox worked in
           | Firefox, even though Google decided to only allow Chrome to
           | access it:
           | 
           | https://news.ycombinator.com/item?id=8606879 - "Why Is Google
           | Blocking Inbox on Firefox?" - 213 points | Nov 14, 2014 | 208
           | comments
           | 
           | (correct link to the gist is
           | https://gist.github.com/victorb/1d0f4ee6dc5ec0d6646e today)
           | 
            | I think that "ongoing" push you're talking about was/is
            | accidental, because a lot of people use Chrome. What I'm
            | seeing now seems to be intentional, because people disagree
            | with the ethics/morals surrounding AI, or are seeing a large
            | impact on their servers because of resource consumption --
            | so it's more philosophical and/or practical, rather than
            | accidental.
           | 
           | But who knows, I won't claim to have exact insights into
           | exactly what caused "Chrome is the new IE", could be it was
           | very intentional and they never stopped.
        
             | SoftTalker wrote:
              | I don't remember Google (search, at least) ever not working
              | in any browser I tried, and I used some oddball browsers
              | over the years. Maybe apps like Gmail and Docs simply would
              | not work in other browsers. Remember, in its early years
              | Chrome was a darling because it was supporting the "modern"
              | web. That was the whole stated reason Google developed
              | Chrome: to support modern, rich web applications, and force
              | other browsers to do the same if they wanted to stay
              | relevant. Nobody guessed that Chrome would eventually be
              | the new IE.
        
           | Klonoar wrote:
           | ...10-15 years ago?
           | 
           | Try like 20(~+)
        
         | zb3 wrote:
          | They only care about ad revenue. I guess if bots were paying
          | them, they wouldn't need to detect humans anymore...
        
         | Etheryte wrote:
         | The problem is resource consumption. On some of my servers,
         | scrapers, bots etc make up a vast majority of both the
         | bandwidth and CPU usage when left unchecked. If I didn't block
         | them, I would need to pay for a beefier server. All the while
         | this doesn't give me or my regular visitors any benefit, it's
         | just large corporations driving up my hosting costs.
        
           | diggan wrote:
           | > The problem is resource consumption. On some of my servers,
           | scrapers, bots etc make up a vast majority of both the
           | bandwidth and CPU usage when left unchecked
           | 
           | What are they downloading, like heavy videos and stuff?
           | Initiating heavy processes or similar?
        
             | johnmaguire wrote:
             | It takes 125,000 4MB requests to use up 0.5 TB bandwidth,
             | which is the lowest offered by Vultr. I could see this
             | being an issue for personal sites that include photos.
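The arithmetic in the comment above checks out; a quick sketch, using decimal (SI) units as hosting providers typically do:

```python
# Bandwidth math: how many 4 MB requests fit in a 0.5 TB allowance?
ALLOWANCE_TB = 0.5
REQUEST_MB = 4

allowance_mb = ALLOWANCE_TB * 1_000_000  # 1 TB = 1,000,000 MB (decimal)
max_requests = int(allowance_mb / REQUEST_MB)
print(max_requests)  # 125000
```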
        
         | mcstempel wrote:
         | Hey there, I'm the author of the post. I'm actually pretty
         | sympathetic to your viewpoint, and I wanted to clarify my
         | stance.
         | 
          | I actually spent years working at a "good bot" company (Plaid),
          | which focused on making users' financial data portable. The
          | main reason Plaid existed was that banks made it hard for users
          | to permission their data to other apps -- typically not solely
          | out of security concerns, but also to actively limit
          | competition. So, I know how the "bot detection" argument can be
          | weaponized in unideal ways.
         | 
          | That said, I think it's reasonable for app developers to decide
          | how their services are consumed (there are real cost drivers
          | many have to think about) -- which includes the ability to have
          | monitoring & guardrails in place for riskier traffic. If an app
          | can't detect good bots, it also can't do things like 1) support
          | necessary revocation mechanisms for end users who want to claw
          | back agent permissions or 2) require human-in-the-loop
          | authorization for sensitive actions. The main thing I care
          | about is that AI agent use remains safe and aligned with user
          | intent. For your example of an anonymous read-only site (e.g. a
          | blog), I'm less worried about that than about an AI agent with
          | read-write access on behalf of a real human's account.
         | 
         | My idealistic long-term view though is that supporting AI agent
         | use cases will eventually become table stakes. Users will
         | gravitate toward services that let them automate tedious tasks
         | and integrate AI assistants into their workflows. Companies
         | that resist this trend may find themselves at a competitive
         | disadvantage. Ultimately, this has started to happen with
         | banking & OAuth, though pretty slowly.
        
           | do_not_redeem wrote:
           | It seems like cases (1) and (2) would both be better handled
           | by letting the user give their user agent a separate security
           | context if they choose, instead of trying to detect/guess
           | what kind of browser made that http request. I'm thinking
           | about things like oauth permissions, GitHub's sudo mode, etc.
           | Otherwise your magic detection code will inevitably end up
           | telling an ELinks user "sorry, you need to download chrome to
           | view your payment info".
        
             | mcstempel wrote:
              | You read our mind!
              | https://stytch.com/blog/the-age-of-agent-experience/
             | 
              | Very much agreed that's the long-term goal, but I think
              | we'll live in a world where most apps don't support OAuth
              | for a while longer (though I'd love for all of them to --
              | we're actually announcing something next week that makes
              | this easy for any app to do).
              | 
              | But we're also envisioning an interim period where users
              | are delegating to unsanctioned external agents (e.g. OpenAI
              | Operator, Anthropic Computer Use API, etc.) prior to apps
              | catching up and offering proper OAuth.
        
           | soulofmischief wrote:
           | Plaid is not a "good bot" company. Despite posturing from
           | leadership, it is fundamentally unethical to build a
           | pervasive banking middle-man service which requires users to
           | surrender their private account credentials in order to
           | operate. What if every business operated this way? It's
           | disgusting that companies like Plaid have considerably set
           | back public discourse on acceptable privacy tradeoffs.
        
             | clint wrote:
              | You write as if someone held a gun to your head and forced
              | you to sign up for Plaid. Plaid doesn't require anyone to
              | use it.
              | 
              | Your bank is the entity you're ultimately upset with; don't
              | malign a company that generated a _very good solution_ to a
              | _huge problem_ and THEN worked with their industry peers to
              | cajole these huge banks to let you have access to your data
              | how you want to use it. Before Yodlee and Plaid came around
              | there was a snowball's chance in hell I could ever hope to
              | get at my banking transactions in an API, and now I can,
              | and in many cases I never have to supply my banking
              | credentials to anyone but my bank.
        
               | soulofmischief wrote:
               | > You write as if someone held a gun to your head and
               | force you to sign up for Plaid. Plaid doesn't require
               | anyone to use it.
               | 
                | There is not a physical gun pointed at my head, but an
                | increasing amount of digital online interactions are
                | solely gated by Plaid. I've run into plenty of cases
                | where I simply had no choice, for example dealing with
                | landlords.
               | 
               | And you already know how long it takes for financial
               | systems to evolve once in place, as evidenced by your own
               | frustration for them not embracing APIs and digital
               | sovereignty. So once a solution like Plaid is in place,
               | we're normalizing this kind of man-in-the-middle security
               | nightmare for generations to come. Even if Plaid's
               | founders did not have malicious intent, the company will
               | eventually change hands to someone less ethical, and the
               | door is open for other companies to seek the same kind of
               | relationships with end users. If not malicious, Plaid is
               | brazenly reckless and short-sighted.
               | 
               | And regardless... I as a consumer do not want to hand
               | over my passwords to a man in the middle, I'm already
               | angry enough at the security and password restrictions I
               | encounter now with financial institutions. If I am in a
               | position where I cannot rent a home or make an important
               | purchase without interacting with a company like Plaid,
               | where is my digital sovereignty?
        
               | soco wrote:
               | I think this anger with Plaid is unwarranted. Without
               | them, or before them, you had zero API access because the
                | banks (including yours) don't give a rat's ass about your
                | fancy access needs. Now Plaid managed to gather together
                | some kind of access. Are they to blame because they
               | managed that? Do you still have any alternative with the
               | bank? I think no, and no. You can get back to the
               | "standard" situation of no API, no guns involved, or you
               | can use them as middlemen. Or you can create your own
               | middleman service if you like and everybody will
               | appreciate your Plaid alternative (except Plaid, I
               | suppose).
        
             | randunel wrote:
             | I'd assume they had to work with what was offered. As long
             | as banks required usernames and passwords with no oauth
             | possible, what's plaid to do? Their users wanted their
             | service, but the banks used username password credentials.
             | 
             | In any case, "good bot" doesn't refer to best practices
             | such as rejecting suppliers with antiquated auth and
             | guiding users to others, it refers to not being
             | intentionally malicious and acting as users' agents
             | instead.
        
         | cess11 wrote:
          | As an almost fanatic VPN and Tor user, I'm already used to
          | being blocked and circumventing it, usually by exiting through
          | a data center IPv4 address. But I sometimes check whether they
          | let SeleniumBase access with a chromedriver, and surprisingly
          | often they do.
         | 
         | The Internet is a rather hostile place, I don't think that'll
         | change anytime soon.
        
         | reverendsteveii wrote:
         | I came here to say exactly this. Of course bots can do things
         | that are shitty but we need to detect and ban the behavior and
         | resist the urge to also try to detect and ban the type of user
         | that tends to exhibit that behavior. If a bot comes onto my
         | page, hands off the captcha to a human, then does human things
         | at human speed and human scale I shouldn't care that they're a
         | bot and I should allow them to do what they're doing. If a
         | human comes onto my page and starts trying to brute force
         | directory structures, mass-downloading tons of huge files and
         | otherwise causing a problem I shouldn't care that they're a
         | human and I should block them. This whole idea of bot detection
         | and blocking seems to be an inversion of what I think is the
         | best design principle we've discovered in the history of
         | software development: build things that do simple, useful
         | things without regard to who is using them or for what, then
         | let consumers surprise you with what they do with it. Banning
         | non-abusive agents is just locking out potential upsides for
         | your app.
         | 
         | OTOH if you make your living serving ads a bot bypassing your
         | monetization is a problem for you. Either you detect and block
         | them or eventually the value of an ad impression in your app
         | will approach zero. So in some cases I guess merely not being a
         | human _is_ the abusive behavior.
        
         | golergka wrote:
          | I'm building a service which needs to extract RSS feeds from
          | pages (hntorss.com if you're interested). Nothing else. From
          | any rational point of view, a website owner would actively want
          | this parser to work as easily as possible -- the whole point is
          | for users to see the content you publish!
          | 
          | Alas, I still get rate-limited, 400-ed, and otherwise rejected
          | because of user-agent sniffing and other bot-detection
          | mechanisms.
        
         | digitaltrees wrote:
         | thank you!!!! This is the right answer.
        
       | JohnMakin wrote:
       | I've been flagged as a bot on pretty much every major platform.
       | Most ridiculously lately, linkedin - I have to prove my identity
       | using 2 different forms of ID, which they still won't accept, OR
       | find a notary and somehow prove I own the account I no longer
       | have access to. Maybe try refining this tech a little better
       | before you start blasting legitimate users with it - I am
       | extremely skeptical of the catch rate given what I see
       | anecdotally online, and my own experience getting flagged for
       | stuff as benign as being a quick typist.
        
         | deadbabe wrote:
          | I asked a manager about this; the policy is that we do not need
          | to differentiate between bots and people who sound similar to
          | bots: both are considered low-quality content/engagement.
          | Delete them.
          | 
          | Seems like wherever they delete bots, they will, in the end,
          | delete human beings.
        
           | emgeee wrote:
           | I never really thought about this perspective but in some
           | ways it makes sense. I think the ironic part is that LinkedIn
           | now provides built-in AI tools that make you sound more like
           | a bot.
           | 
            | Maybe they could fingerprint slop generated with their tools
            | and allow it through, to incentivize upgrading.
        
             | soco wrote:
             | But "our" bots are always the good ones. Why does this
             | sound like literature...
        
           | wat10000 wrote:
           | That's what happens when a business is built on getting a
           | tiny amount of value per user from a vast number of users.
           | There's essentially no incentive to treat any individual user
           | well, and no resources to make it happen even if they wanted
           | to. This becomes more and more problematic as our lives
           | revolve more and more around such businesses.
        
             | deadbabe wrote:
             | Silly commenters, mass audiences are for influencers, but
             | go ahead and write your little bandwagoned take so you can
             | feel heard.
        
           | JohnMakin wrote:
           | My problem with this approach is what metrics are you using
           | to determine whatever I am doing is "low quality?" On
           | LinkedIn specifically, I barely ever post "content" publicly
           | - I use it to network with recruiters and read technical
           | articles mostly. It's completely opaque and will catch users
           | doing absolutely nothing wrong or "low content," maybe they
           | are on the spectrum or disabled in a way that makes their
           | user clicks look weird. No managers ever consider these
           | things, it's always like "oh well, fuck em"
        
           | oceanplexian wrote:
           | Actually, they will only delete humans, because the bots can
           | already far outpace low quality content posted by humans.
        
         | mcstempel wrote:
         | LinkedIn always hits me with those frustrating custom CAPTCHAs
         | where you have to rotate the shape 65 degrees -- they've taken
         | a pretty blunt, high-friction approach to bot detection
         | 
         | I think most apps should primarily start with just monitoring
         | for agentic traffic so they can start to better understand the
         | emergent behaviors they're performing (it might tell folks
         | where they actually need real APIs for example), and then go
         | from there
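The "monitor first, block later" approach described above could be sketched as below. The marker strings and function names are illustrative assumptions, not any particular product's implementation; a real list would need regular updating:

```python
from collections import Counter

# Substrings that known AI agents advertise in their User-Agent header
# (illustrative sample only).
AGENT_MARKERS = ("GPTBot", "ClaudeBot", "PerplexityBot", "Operator")

agent_counts = Counter()

def observe(user_agent: str) -> bool:
    """Tag agentic traffic for monitoring; never blocks anything."""
    is_agent = any(m.lower() in user_agent.lower() for m in AGENT_MARKERS)
    if is_agent:
        agent_counts[user_agent] += 1
    return is_agent
```

Counting first, rather than rejecting, gives you the data to decide where real APIs (or rate limits) are actually needed.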
        
         | digitaltrees wrote:
          | Maybe if Sales Navigator was better there wouldn't be so many
          | third-party automation platforms. Or maybe if LinkedIn figured
          | out how to make money with an ecosystem rather than a monopoly,
          | they wouldn't need to be so aggressive.
         | 
         | I think companies that are hostile to AI Agents are going to
         | shrink. AI Agents are a new class of user, the platforms that
         | welcome them will grow and thrive, those that are hostile will
         | suffer.
        
         | AnotherGoodName wrote:
         | This is often due to network setup. If you're behind NAT where
         | there's many users behind a single IP address you'll be hit.
         | 
          | E.g. many cell phone providers are 100% behind NAT for IPv4
          | internet. Corporate networks are almost 100% certain to hit
          | this too. VPNs are straight up almost always flagged for
          | further authentication.
          | 
          | A 'fun' thing that often happens to me is purchasing online via
          | credit card at work and then going to use the card later that
          | day in stores, only to have it denied as likely fraud: by IP
          | location, I was in another location completely a few hours ago
          | (work routes everything via a datacenter on the other coast).
        
           | _moof wrote:
            | _> If you're behind NAT where there's many users behind a
            | single IP address you'll be hit._
           | 
           | Doesn't this describe the vast majority of networks in the
           | world?
        
             | mcmcmc wrote:
             | They likely mean CGNAT specifically
        
           | JohnMakin wrote:
           | For me specifically, I do believe this is a major part of it.
            | However, if my options are to use a VPN _or_ the service,
            | but not both, I'm more inclined to pick the VPN and say screw
            | the service; I'll just opt out of using it. There's no real
           | reason that a sufficiently sophisticated network/security
           | team at a large company can't differentiate between
           | commercial VPN users and "bot" traffic. It's just
           | laziness/incompetence. Sufficiently advanced bots use
           | residential proxies anyway and it really isn't difficult to
           | go down that road.
        
           | petee wrote:
           | Cell tower/provider is a big part I think from my own
           | experience; I'd get constant captchas and rejects only when
           | near one specific tower at work, which happened to be right
           | over a fedex ground building, take that how you will...
        
         | yard2010 wrote:
         | What you're describing is the end of the internet for some
         | people. Good bots will evade everything (or at least try until
         | they do), and some people like you (and me, this shit always
         | happens to me) just stare at the screen, wondering what Kafka
         | would say about this.
        
       | bsnnkv wrote:
       | I have personally opted out of the arms race for at least one
       | service that I operate.[1]
       | 
       | If AI agents figure out how to buy a subscription and transfer
       | money from their operators to me, they are more than welcome to
       | scrape away.
       | 
       | [1]: https://lgug2z.com/articles/in-the-age-of-ai-crawlers-i-
       | have...
        
       | digitaltrees wrote:
        | I think companies need to rethink this and go the opposite
        | direction: rather than being hostile and blocking AI Agents --
        | and losing millions or billions in revenue when people sending
        | AI Agents to do tasks on their behalf can't get the task done --
        | they should redesign their software for Agent use.
       | 
       | https://www.loop11.com/introducing-ai-browser-agents-a-new-w...
        
       | ATechGuy wrote:
        | Looks like telling real humans apart from agents is going to be
        | an arms race if the detection is based on browser/device
        | fingerprinting or visual/audio captchas; AI will only get better.
       | 
       | What are captcha alternatives that can block resource consumption
       | by bots?
        
         | mcstempel wrote:
         | CAPTCHAs have been ineffective as a true "bot detection"
         | technique for a while as tools like anti-captcha.com allow for
         | outsourcing it to real humans. BUT they have been successful at
         | the economic side of raising the cost of programmatic traffic
         | on your site (which is good enough for some use cases)
         | 
          | As the author of this agent detection post, we agree that
          | CAPTCHAs and vanilla browser/device fingerprinting are quickly
          | becoming less valuable in isolation, but we still see a lot of
          | value in advanced network/device/browser fingerprinting.
         | 
         | The main reason is that the underlying corpus & specificity of
         | browser/device/network data points you get from fingerprinting
         | makes it much easier to build more robust systems on top of it
          | than a binary CAPTCHA challenge. For us, it's been very useful
          | to keep all of the foundational fingerprinting data as a
          | primitive, because it lets us build a comprehensive historical
          | database of genuine browser signatures to train our ML models
          | to detect subtle emulations and reliably distinguish between
          | authentic browsers and agent-driven imitations.
         | 
         | That works really well for the OpenAI/BrowserBase models. Where
         | that gets tricky is the computer-use agents where it's actually
         | putting its hands on your keyboard and driving your real
         | browser. Still though, it's valuable to have the underlying
         | fingerprinting data points because you can still create
         | intelligent rate limits on particular device characteristics
         | and increase the cost of an attack by forcing the actor to buy
         | additional hardware to run it
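A toy version of the signature-matching idea described above: compare a client's claimed browser against a table of known-genuine traits and flag any divergence. The trait names and table entries here are invented for illustration; the real ML models described in the comment use far richer data:

```python
# Map (browser, major_version) -> traits a genuine client exhibits.
# Entries are made up for this sketch.
KNOWN_SIGNATURES = {
    ("Chrome", 121): {"has_chrome_object": True, "navigator_webdriver": False},
    ("Firefox", 122): {"has_chrome_object": False, "navigator_webdriver": False},
}

def looks_emulated(claimed: tuple, observed: dict) -> bool:
    """Return True if observed traits diverge from the genuine signature."""
    expected = KNOWN_SIGNATURES.get(claimed)
    if expected is None:
        return True  # unrecognized browser/version claim: suspicious
    # Any trait that diverges from the known-genuine signature is a hint
    # of emulation or spoofing.
    return any(observed.get(k) != v for k, v in expected.items())
```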
        
           | ATechGuy wrote:
           | I don't think tracking everything is the way to go; info
           | would get outdated very soon and tracking compromises user
           | privacy. A simple solution could be to throw a challenge that
           | humans can easily solve, but agents absolutely cannot now or
           | in the future (think non-audio/visual/text).
        
         | cle wrote:
         | Web Environment Integrity. Eventually your hardware will rat
         | you out via attestation.
        
           | ATechGuy wrote:
            | And you think nobody (professional hackers?) can put together
            | a "virtual TPM" that falsifies real hardware info? I think
            | there are much simpler solutions, but big tech wants to
            | retain control.
        
       | tcdent wrote:
        | adopting the mentality that AI agents are akin to Russian spam
        | bots is regressive.
       | 
       | your users will be interacting with your platform using partial
       | automation in the very near future and if you think rate limiting
       | or slowing their productivity is somehow necessary they'll just
       | go somewhere else.
       | 
       | once you feel the empowerment, any attempt to retract it goes
       | against human nature.
        
       | mtrovo wrote:
        | It looks like it's just a matter of time before "Computer
        | Use"-style tools become commoditised and widely available. I'm
        | worried that this could upend our usual ways of filtering out
        | bot activity, with no simple way to go back. Sites that already
        | have bot problems -- social platforms with sock puppet profiles,
        | ticketing services with scalpers -- might become even harder to
        | deal with.
       | 
        | Sometimes I think the dead internet theory might not have been so
        | far off, just a bit early in its timing. It really feels like
        | we're about to cross a line where real humans' and AI agents'
        | online activities blend in ways we can't reliably untangle.
        
       | bbor wrote:
       | Great article, but the actual technical details of their current
       | "browser fingerprinting" approach are linked at the bottom:
       | https://stytch.com/docs/fraud/guides/device-fingerprinting/o...
       | 
       | This seems semi-effective for professional actors working at
       | scale, and pretty much useless for more careful, individual
       | actors -- especially those running an actual browser window!
       | 
       | I agree that the paywalls around LinkedIn and Twitter are in
       | serious trouble, but a more financially pressing concern IMO is
       | bad faith Display Ads publishers and middlemen. Idk exactly how
       | the detectors work, but it seems pretty impossible to spot an
       | unusually-successful blog that's faking its own clicks...
       | 
        | IMHO, this is great news! I believe society could do without both
        | paywalls and the entire display-ads industry.
        
         | mcstempel wrote:
         | Ah, this is great feedback -- I don't think we do enough to
         | articulate how much we're doing beyond that simplified
         | explanation of device fingerprinting on those docs. I'll get
         | that page updated, but 2 main things worth mentioning:
         | 
         | 1. We have a few proprietary fingerprinting methods that we
         | don't publicly list (but do share with our customers under
         | NDA). These feed into our ML-based browser detection, which
         | assesses the fingerprint data points against a historical
         | archive of every browser version ever released, allowing us
         | to discern subtle deception indicators. Even sophisticated
         | attackers find it difficult to figure out what we're
         | fingerprinting on, which is one reason we don't publicly
         | document it.
         | 
         | 2. For a manual attacker running attacks within a legitimate
         | browser, our Intelligent Rate Limiting (IntRL) tracks and rate-
         | limits at the device level, making it effective against
         | attackers using a real browser on their own machine. Unlike
         | traditional rate limiting that keys on coarse traits like IP
         | address, IntRL uses a combination of browser, hardware, and
         | network
         | fingerprints to detect repeat offenders--even if they clear
         | cookies or switch networks. This ensures that even human-
         | operated, low-frequency attacks get flagged over time, without
         | blocking legitimate users on shared networks.
        
         | randunel wrote:
         | I'm in a business tangential to the one the author is in and
         | I've mostly encountered annoyances automating websites which
         | perform browser fingerprinting including TLS fingerprinting,
         | but outright blocks not really, not unless you also block real
         | users like cloudflare and datadome frequently do (in their
         | cases, automations have a marginally lower bypass rate than
         | real users do).
         | 
         | In my experience, the level of sophistication to automate
         | bypassing WAFs which do fingerprinting is much too high for
         | those skills to be used to click ads. Seriously, it's not just
         | about the compute cost of running real browsers and residential
         | proxies, it's also the dev time invested, nobody clicks google
         | ads when they can do much, much more with that knowledge.
        
       | programd wrote:
       | We're already at a point where AI can perfectly imitate a human,
       | so I don't expect behavioral AI bot detection to work in the long
       | term. You can still filter out a lot of script kiddie level AI
       | bots by looking for browser signatures.
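
The browser-signature filtering mentioned above amounts to something like the sketch below: flag requests whose headers advertise common automation frameworks. The marker list and heuristics are illustrative, and trivially spoofable, which is exactly why it only catches script-kiddie-level bots.

```python
# Illustrative sketch of "browser signature" filtering. Markers are
# examples, not an exhaustive list, and all are easily spoofed.
AUTOMATION_MARKERS = (
    "headlesschrome",   # default UA substring of headless Chrome
    "phantomjs",
    "python-requests",
    "curl/",
    "selenium",
)

def looks_automated(headers: dict[str, str]) -> bool:
    ua = headers.get("User-Agent", "").lower()
    if any(marker in ua for marker in AUTOMATION_MARKERS):
        return True
    # Real browsers typically send Accept-Language; many scripts don't.
    if "Accept-Language" not in headers:
        return True
    return False
```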
       | 
       | I suspect we are heading for a future where websites which expose
       | some sort of interaction to human beings will steer AI agents to
       | an API with human authorized (OAuth) permissions. That way users
       | can let well behaved, signature authenticated agents operate on
       | their behalf.
       | 
       | I think we need an "AI_API.yaml", kind of like robots.txt, which
       | gives the agent an OpenAPI spec to your website and the services
       | it provides. Much more efficient and secure for the website than
       | dealing with all the SSRF, XSS, SQLi, CSRF alphabet soup of
       | vulnerabilities in Javascript spaghetti code on a typical
       | interactive site. And yes, we need AI bots to include
       | cryptographic signature headers so you can verify it's a well
       | behaved Google agent as opposed to some North Korean boiler room
       | imposter. No pubkey signature no access and fail2ban for bad
       | behavior.
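
The "no pubkey signature, no access" idea could be gatekept roughly as below. The comment envisions asymmetric signatures; Python's standard library has no asymmetric crypto, so this sketch substitutes HMAC with a per-agent shared secret, and the agent registry is hypothetical. A real deployment would use Ed25519 keys or HTTP Message Signatures (RFC 9421) instead.

```python
import hashlib
import hmac

# Hypothetical registry of known agents. The comment proposes public
# keys; this sketch uses shared secrets as a stdlib-only stand-in.
KNOWN_AGENTS = {
    "google-agent-v1": b"secret-shared-out-of-band",
}

def sign_request(agent_id: str, body: bytes) -> str:
    secret = KNOWN_AGENTS[agent_id]
    return hmac.new(secret, body, hashlib.sha256).hexdigest()

def verify_request(agent_id: str, body: bytes, signature: str) -> bool:
    secret = KNOWN_AGENTS.get(agent_id)
    if secret is None:
        return False  # unknown agent: deny, optionally feed fail2ban
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids timing side channels.
    return hmac.compare_digest(expected, signature)
```

An unregistered agent, or a registered one presenting a tampered request, simply fails verification and gets no access.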
       | 
       | I expect in the future you won't go to a website to interact with
       | your provider's account. You'll just have a local AI agent on
       | your laptop/phone which will do it for you via a well-known
       | API. The website will revert to just being informational.
       | Frankly, that would fix a lot of security and usability
       | problems: more efficient and secure for the service provider,
       | and better for the consumer, who no longer has to navigate
       | stupid custom form workflows (e.g. every job application site
       | ever) and can just talk to their own AI in a normal tone of
       | voice, without swear words.
       | 
       | Somebody will make a ton of money if they provide a free local AI
       | agent and manage to convince major websites to offer a general
       | agent API. Kind of like Zapier but with a plain language
       | interface. I'm betting that's where the FAANGs are ultimately
       | heading.
       | 
       | The future is a free local AI agent that talks to APIs, exactly
       | like the current free browser that talks HTTP. Maybe they are one
       | and the same.
        
       | jerpint wrote:
       | The other day I tried an open source deep research
       | implementation, and a ton of links returned 403s because I was
       | using an agent -- even though my use was legitimate. I think we
       | need better ways of distinguishing legitimate agents working on
       | my behalf from spam bots.
        
         | xena wrote:
         | TBH if we want that to be a thing, we're gonna need to figure
         | out how to pay server operators to cope with the additional
         | load that AI agents can and will put on servers.
        
       | xyst wrote:
       | It's a bit disgusting that multi-billion dollar corporations are
       | not properly compensating the individuals and groups that their
       | "artificial intelligence" models rely on.
       | 
       | Meta/FB/Zuckerfuck was caught with their pants down when they
       | were _torrenting_ a shit ton of books. It wasn't a rogue
       | engineer or group; it came from the top and was signed off by
       | legal.
       | 
       | Companies, C-level executives, and boards of these companies need
       | to be held accountable for their actions.
       | 
       | No, a class action lawsuit is not sufficient. _People_ need to
       | start going to jail. Otherwise it will continue.
        
       ___________________________________________________________________
       (page generated 2025-02-14 23:00 UTC)