hngopher.com

       [HN Gopher] We Do Not Support Opt-Out Forms (2025)
       ___________________________________________________________________
        
       Author : mefengl
       Score  : 79 points
       Date   : 2026-01-27 09:35 UTC (21 hours ago)
        
 (HTM) web link (consciousdigital.org)
 (TXT) w3m dump (consciousdigital.org)
        
       | drcongo wrote:
       | That site doesn't seem to support pages loading either.
       | 
       | edit: I feel their pain - I've spent the past week fighting AI
       | scrapers on multiple sites hitting routes that somehow bypass
       | Cloudflare's cache. Thousands of requests per minute, often to
       | URLs that have _never_ even existed. Baidu and OpenAI, I'm
       | looking at you.
        
         | jen729w wrote:
         | > often to URLs that have never even existed
         | 
         | Oh you're _so_ deterministic.
        
         | trollbridge wrote:
         | There is currently some AI scraper that uses residential IP
         | addresses and a variety of techniques to conceal itself that
         | likes downloading Swagger generated docs over... and over...
         | and over.
         | 
         | Plus hitting the endpoints for authentication that return 403
         | over and over.
        
         | comrade1234 wrote:
         | Are they hitting non-existent pages? I had IP addresses
         | scanning my personal server, including hitting pages that don't
         | exist. I had fail2ban running already, so I just turned on the
         | nginx filters (and had to modify the regexes a bit to get them
         | working). I turned on the recidive jail too. It's been working
         | great.
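
A rough sketch of the idea behind that kind of filter, not fail2ban's actual implementation: count 404 responses per client IP in an access log and flag the IPs that exceed a threshold. The log layout (nginx "combined" style), the `ips_to_ban` name, and the `max_404s` threshold are assumptions for illustration; fail2ban itself also applies a time window and a ban duration, which this omits.

```python
import re
from collections import Counter

# Assumed nginx "combined" layout: client IP first, status code right after the quoted request.
LOG_LINE = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" (?P<status>\d{3}) ')

def ips_to_ban(lines, max_404s=5):
    """Return the set of client IPs with more than max_404s 404 responses (hypothetical helper)."""
    misses = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group("status") == "404":
            misses[m.group("ip")] += 1
    return {ip for ip, count in misses.items() if count > max_404s}
```

In practice the resulting set would be handed to a firewall rather than returned to a caller.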
        
         | tommek4077 wrote:
         | Why are "thousands" of requests noticeable in any way?
         | Webservers are so powerful nowadays.
        
           | drcongo wrote:
           | It's not just one scraper.
        
           | SoftTalker wrote:
           | Small, cheap VPSs that are ideal for running a small niche-
           | interest blog or forum will easily fall over if they suddenly
           | get thousands of requests in a short time.
           | 
           | Look at how many sites still get "HN hugged" (formerly known
           | as "slashdotted").
        
             | ronsor wrote:
             | I remember my first project posted to HN was hosted on a
             | router with 32MB of RAM and a puny MIPS CPU; despite
             | hitting the front page, it did not crash.
             | 
             | At this point, I have to assume that most software is too
             | inefficient to be exposed to the Internet, and that becomes
             | obvious with any real load.
        
               | SoftTalker wrote:
               | While true, it's also true that it was (presumably) able
               | to run and serve its intended audience until the scrapers
               | came along.
        
         | ndriscoll wrote:
         | My N100 mini PC can serve over 20k requests per second with
         | nginx (well, it could, if not for the gigabit NIC limiting it).
         | Actually IIRC it can (again, modulo uplink) do more like 40k
         | rps for 404 or 304s.
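
Back-of-the-envelope arithmetic for the NIC limit mentioned above, assuming a nominal 1 Gbit/s link and ignoring TCP/TLS/HTTP overhead (so real budgets are smaller). The function name is made up for illustration.

```python
LINK_BITS_PER_SECOND = 1_000_000_000  # assumed nominal gigabit link, no protocol overhead

def max_response_bytes(requests_per_second, link_bps=LINK_BITS_PER_SECOND):
    """Largest average response size (in bytes) the link can carry at this request rate."""
    return link_bps / 8 / requests_per_second
```

Under these assumptions, 20k requests per second leaves about 6 KB per response before the link saturates, and 40k leaves about 3 KB, which fits the observation that only tiny responses such as 404s and 304s reach the higher rate.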
        
         | mystraline wrote:
         | IP blocking Asia took my abusive scans down 95%.
         | 
         | I also do not have a robots.txt so Google doesn't index.
         | 
         | Got some scanners who left a message on how to index or
         | deindex, but it was like 3 lines total in my log (that's not
         | abusive).
         | 
         | But yeah, blocking the whole of Asia stopped soooo much of the
         | net-shit.
        
           | blenderob wrote:
           | > I also do not have a robots.txt so Google doesn't index.
           | 
           | That doesn't sound right. I don't have a robots.txt either, but
           | Google indexes everything for me.
        
             | mystraline wrote:
             | https://news.ycombinator.com/item?id=46681454
             | 
             | I think this is a recent change.
        
               | daveoc64 wrote:
               | All the comments there seem to suggest that there has
               | been no change and that robots.txt isn't required.
        
           | Citizen_Lame wrote:
           | How did you block Asia? Cloudflare or something else?
        
             | mystraline wrote:
             | You can download weekly IP blocks of regions.
             | 
             | I import them into iptables and wholesale block them all.
             | 
             | I don't deal with eastdakota's pile of shit.
        
             | kjs3 wrote:
             | You can block at your gateway/router. Lots of places have
             | country IP ranges[1], and there are even more or less
             | frequently updated lists of 'malicious' IP ranges[2]. Some
             | gateway providers include 'block by country' and/or
             | 'download blocklists automatically' as a feature.
             | 
             | [1] e.g. https://github.com/ipverse/geo-ip-blocks
             | 
             | [2] e.g. https://github.com/bitwire-it/ipblocklist
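
A minimal sketch of what a range-based block does once a list is downloaded: parse CIDR ranges and test whether a client address falls inside any of them. It assumes one CIDR per line with optional `#` comments, which may not match the exact format of the lists linked above; a firewall would do this matching in the kernel rather than in Python.

```python
import ipaddress

def load_networks(cidr_lines):
    """Parse one CIDR range per line, skipping blank lines and '#' comments (assumed format)."""
    networks = []
    for raw in cidr_lines:
        entry = raw.split("#", 1)[0].strip()
        if entry:
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks

def is_blocked(client_ip, networks):
    """True if client_ip falls inside any of the loaded ranges."""
    addr = ipaddress.ip_address(client_ip)
    # Compare versions explicitly so IPv4 addresses are never tested against IPv6 ranges.
    return any(net.version == addr.version and addr in net for net in networks)
```

The addresses used to try this out should come from the documentation ranges (for example 203.0.113.0/24), not from a real country list.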
        
         | storystarling wrote:
         | Might be worth checking if they are appending random query
         | strings to force cache misses. Usually you can normalize the
         | request at the edge to strip those out and protect the origin.
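
One way to picture that normalization, as a sketch rather than any CDN's real configuration: keep only an allow-list of query parameters and sort them, so that URLs differing only in junk parameters map to the same cache key. The `ALLOWED_PARAMS` set and the function name are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

ALLOWED_PARAMS = {"page", "q"}  # hypothetical allow-list; depends entirely on the site

def normalize_url(url, allowed=ALLOWED_PARAMS):
    """Drop query parameters outside the allow-list and sort the rest for a stable cache key."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in allowed
    )
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

An allow-list is used here instead of a deny-list because random cache-busting parameter names cannot be enumerated in advance.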
        
       | lambdaone wrote:
       | Archive link:
       | 
       | https://web.archive.org/web/20251009081648/https://conscious...
        
         | dcminter wrote:
         | That wasn't working for me, but this one was:
         | https://archive.ph/QCMjJ
        
       | rubinlinux wrote:
       | > Since emails are sent from the individual's email account, they
       | are already verified.
       | 
       | This is not how email works, though.
        
         | blenderob wrote:
         | This.
         | 
         | I wonder if it is a generation gap thing. The young folks these
         | days have probably used only Gmail, Proton or one of these big
         | email services that abstract away all the technical details of
         | sending and receiving emails. Without some visibility into the
         | technical details of how emails are composed and sent they
         | might not have ever known that the email headers are not some
         | definite source of truth but totally user defined and can be
         | set to anything.
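
To illustrate the point about headers being user defined, here is a small sketch using Python's standard `email` package. It only builds a message object and sends nothing; the addresses and the helper name are placeholders.

```python
from email.message import EmailMessage

def build_message(claimed_sender, recipient, body):
    """Build a message whose From: header is whatever string the caller supplies."""
    msg = EmailMessage()
    msg["From"] = claimed_sender  # nothing here checks that the caller owns this address
    msg["To"] = recipient
    msg["Subject"] = "Opt-out request"
    msg.set_content(body)
    return msg
```

Whether a receiving server accepts such a message depends on checks made on its side, which is what the replies below discuss.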
        
           | pif wrote:
           | Eh, nice times, when you could type an email just by
           | telnetting to port 25...
        
             | bradleyy wrote:
             | I've certainly sent thousands of emails this way. It was a
             | simpler time.
        
           | SoftTalker wrote:
           | 98% of email users of any generation don't have the first
           | clue how the protocol works.
        
         | kro wrote:
         | +1. Even if they validate DKIM/SPF plus alignment (aka DMARC),
         | that would only verify the domain. The receiver has no way to
         | verify the local part; the sending server has to be trusted to
         | authenticate its users properly.
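
A simplified sketch of why alignment says nothing about the local part: the comparison is between domains only. This models strict alignment as an exact domain match and assumes the authenticated domain (the DKIM `d=` value or the SPF-checked domain) has already been validated elsewhere; relaxed alignment, which compares organizational domains, is not covered. Function names are illustrative.

```python
from email.utils import parseaddr

def from_domain(from_header):
    """Extract the lower-cased domain from a From: header value."""
    _, address = parseaddr(from_header)
    return address.rpartition("@")[2].lower()

def strictly_aligned(from_header, authenticated_domain):
    """True if the From: domain exactly matches the already-authenticated domain."""
    return from_domain(from_header) == authenticated_domain.lower()
```

Note that `alice@example.org` and `bob@example.org` both pass against `example.org`: the part before the `@` never enters the check.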
        
       | veverkap wrote:
       | https://archive.ph/QCMjJ if it helps
        
       | augusteo wrote:
       | The irony of a site about AI opt-outs getting hammered by AI
       | scrapers is almost too on the nose.
       | 
       | trollbridge's point about scrapers using residential IPs and
       | targeting authentication endpoints matches what we've seen. The
       | scrapers have gotten sophisticated. They're not just crawling,
       | they're probing.
       | 
       | The economics are broken. Running a small site used to cost
       | almost nothing. Now you need to either pay for CDN/protection or
       | spend time playing whack-a-mole with bad actors.
       | 
       | ronsor hosting a front-page HN project on 32MB RAM is impressive
       | and also highlights how much bloat we've normalized. The scraper
       | problem is real, but so is the software efficiency problem.
        
       | wincy wrote:
       | It's wild when I read a professional looking website like this
       | and Conscious Digital misspells their own org name as "Consious
       | Digital" in the first paragraph. I'm glad they're fighting
       | against email spam but it just raises all sorts of red flags in
       | my mind, or at least it used to.
       | 
       | Funnily enough, these days it indicates the article was written by
       | a human. I had a dev join my team who made a few typos, and it
       | gave me a chuckle, as it's a whole class of mistake I hadn't seen
       | in a while.
        
       | nabbed wrote:
       | The "required login" pattern is particularly a problem. I seem to
       | have namesakes around the US and UK that use my email address as
       | their own when signing up for various services (mobile phone
       | services, Shopify, Uber, various banks and investment firms,
       | landscaper services, real estate services, home and car
       | insurance, car repair shops, even _Silver Daddies_!!).
       | 
       | I can't open an issue (to ask the service to remove my email)
       | without logging in to an account I don't have control over.
       | 
       | I don't want to use "forgot my password", because I don't want my
       | IP address to be associated with a login to the account, because
       | in some cases (particularly Shopify), the services were obviously
       | used for fraud.
        
         | Mordisquitos wrote:
         | > _I can't open an issue (to ask the service to remove my
         | email) without logging in to an account I don't have control
         | over._
         | 
         | > _I don't want to use "forgot my password", because I don't
         | want my IP address to be associated with a login to the
         | account_
         | 
         | As a fellow victim of worldwide technically-illiterate
         | namesakes, I used to do this using the Tor Browser until I got
         | a paid VPN service, which is what I use now. Out of sheer
         | paranoia, I always use a secondary browser profile while using
         | a false userAgent extension.
        
         | hilsdev wrote:
         | I was pretty early to Gmail, I paid $5 for an invite to the
         | beta, and secured my first(.)last@gmail.com. But now I pay for
         | my own domain and my own hosted email just to avoid any
         | collisions.
        
       | burnte wrote:
       | So, they're trying to be an online privacy service for users, but
       | they require companies to work the way THEY want them to
       | operate. This is not a serious organization I need to care about
       | as a user or a service provider. They're just setting themselves
       | up for failure by requiring the world around them to change.
        
         | aklemm wrote:
         | Their detailed explanation of compliance issues in the space is
         | interesting and enlightening.
        
       ___________________________________________________________________
       (page generated 2026-01-28 07:01 UTC)