[HN Gopher] Botspam apocalypse
       ___________________________________________________________________
        
       Botspam apocalypse
        
       Author : panic
       Score  : 379 points
       Date   : 2022-08-04 04:17 UTC (18 hours ago)
        
 (HTM) web link (memex.marginalia.nu)
 (TXT) w3m dump (memex.marginalia.nu)
        
       | SavageBeast wrote:
       | I get paid over 92 Dollars per hour working from home with 2 kids
       | at home. i never thought i'd be able to do it but my best friend
       | earns over 15k a month doing this and she convinced me to try.
       | the potential with this is endless... Simply go to the BELOW LINK
       | and start your work..
       | 
       | EDIT: bad joke but maybe someone will get a chuckle.
        
       | lloydatkinson wrote:
       | I found that my netlify site attracts a lot of spam specifically
       | from the same spammer/group. The messages always start with some
       | variation of "Hi my name is Eric".
       | 
        | Netlify didn't seem to really care after I reported it on their
        | support forum. The spammer disables JS, so no client-side
        | protection works. I've recently decided to (unfortunately) break
        | the ability of JS-disabled browsers to submit the contact form.
        | The form elements' attributes are deliberately wrong, meaning
        | the form won't submit correctly. Instead, some JS on page load
        | sets the attributes to the correct values. I will wait a while
        | and see if this solves it.
        | 
        | While Netlify does correctly mark all of this as spam, the
        | filter sometimes catches legitimate messages as false
        | positives, so I have to check the vast amount of spam often.
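        | 
        | Roughly what I mean (an illustrative TypeScript sketch; the
        | selector and attribute values here are made up):
        | 
        |   // The HTML ships with broken action/method attributes, so
        |   // a JS-disabled bot submits into the void. On load, real
        |   // browsers repair the form:
        |   window.addEventListener("DOMContentLoaded", () => {
        |     const form = document.querySelector("form#contact")!;
        |     form.setAttribute("method", "POST");
        |     form.setAttribute("action", "/contact");
        |   });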
        
       | 1vuio0pswjnm7 wrote:
       | "The rest are forced to build web services with no interactivity,
       | or seek shelter behind something like Cloudflare, which
       | discriminates against specific browser configurations and uses IP
       | reputation to selectively filter traffic."
       | 
       | Interactivity is not a must-have. The world's first general
       | purpose computer, ENIAC, was not built for "interactivity". It
       | was built to calculate ballistic trajectories, which were
       | otherwise calculated manually. Computers exist to allow
       | automation, to reduce manual labour.^1 "Tech" companies need
       | interactivity to support collection of data about www users and
       | paid services related to programmatic online advertising.
       | Generally, users do not need interactivity. Generally, users do
       | not need to spend excessive quantities of time "interacting" with
       | networked computers.
       | 
        | As a user, I want __non-interactive__ www services, whether it is
        | data/information retrieval or e-commerce. I want to use more
       | automation, not less. Automation is not reserved for those
       | providing "services". It also should be available to those using
       | them.
       | 
       | Provide bulk data access. Let others mirror it. Take advantage of
       | "open datasets" hosting if necessary. For example, Common Crawl
        | is hosted for free with Amazon. Upload the data to the Internet
        | Archive.
       | 
       | "The API gateway is another stab at this, you get to choose from
       | either a public API with a common rate limit, or revealing your
       | identity with an API key (and sacrificing anonymity)."
       | 
       | Publish the rate limit for the public API. Do not make users
       | guess. Do not require "sign-in" to use an API to retrieve public
       | data.
       | 
       | 1. Some folks consider having to "interact" with a computer as
       | labour, not fun.
        
         | lifeisstillgood wrote:
         | >>> Automation is not reserved for those providing "services".
         | It also should be available to those using them.
         | 
         | Yes !
         | 
          | I call this software literacy. And yes - no matter how cool the
          | JS on a major site, the site's goals (to keep me there and
          | clicking) and my goals (to get what I want with minimal
          | action) are in conflict.
          | 
          | I would suggest that bots are actually not a problem. For most
          | things I would like a bot _acting for me_. Telling me as and
          | when that I need to visit the dentist, who has slots free next
          | Weds and Friday. Friday is best because I am also WFH that day.
          | 
          | The bot apocalypse is only an apocalypse because we are trying
          | to make a "web for humans" when actually a "web for bots, and
          | a bot for a human" is a much better idea :/)
        
           | denton-scratch wrote:
           | > Telling me as and when that I need to visit the dentist
           | 
           | Isn't that simply your calendar? Sure, you want it automated;
           | but it doesn't need internet access, it doesn't need to crawl
           | or search, I don't know why you refer to it as a 'bot'.
           | 
           | To my mind, the idea of personal 'bots' was that you could
           | give it some general instructions such as "Let me know when
           | the content at any of these URLs changes", and then leave it
           | running. Were they also called agents?
        
         | nicbou wrote:
          | Some part of it is lost in the process.
         | 
         | I run a website about immigration. I'd love to reinstate
         | comments and get valuable feedback from people who just tried
         | my advice. Bots just make it too time-consuming.
        
       | fabianhjr wrote:
       | It would be simpler to decentralize and implement webs of
       | trust[1] (that locality would also help community-building /
       | social cohesion).
       | 
        | Secure Scuttlebutt[1] doesn't have a moderation/spam problem,
        | and it is completely decentralized, without monetary fees or
        | proof-of-work. Why can't centralized services do better?
       | 
       | [1]: https://ssbc.github.io/scuttlebutt-protocol-guide/#follow-
       | gr...
        
       | jgalt212 wrote:
       | > I can't afford to operate a datacenter to cater to traffic that
       | isn't even human. This spam traffic is all from botnets with IPs
       | all over the world.
       | 
       | In our experience (we don't have a forum), almost all of our bot
       | traffic has been SEO spiders (or claiming to be so).
        
       | jart wrote:
       | This kind of botspam is usually pretty easy to address with
       | redbean using the finger https://redbean.dev/#finger and maxmind
        | https://redbean.dev/#maxmind modules. The approach I usually
        | recommend isn't so much IP reputation, which can be unfair, but
        | rather looking for evidence of clients lying to you. For
        | example, if the User-Agent says it's Windows
       | with a language preference of English, but the TCP SYN packet
       | says it's Linux and MaxMind says it's coming from China, then
       | that means the client is lying (or being MiTM'd) and you can
       | righteously hellban once it's fingered for the crime.
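        | 
        | To sketch the idea in TypeScript (illustrative only, not
        | redbean's actual API; the fingerprint fields are assumptions):
        | 
        |   // What the client claims vs. what the wire says.
        |   interface Fingerprint {
        |     uaOs: string;       // OS parsed from the User-Agent
        |     synOs: string;      // OS inferred from the TCP SYN packet
        |     geoCountry: string; // country from a MaxMind lookup
        |   }
        | 
        |   // A hard contradiction between the claimed OS and the TCP
        |   // stack's signature is strong evidence of a lying client.
        |   function isLying(fp: Fingerprint): boolean {
        |     return fp.uaOs !== fp.synOs;
        |   }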
        
         | [deleted]
        
         | unglaublich wrote:
         | What keeps bots from just fixing their acts and reporting
         | correct info instead?
        
           | krageon wrote:
           | Nothing. Once this sort of fingerprinting becomes common the
           | common bot frameworks will bypass it.
        
             | marginalia_nu wrote:
             | To be fair, bot countermeasures are and have always been an
             | arms race.
        
             | jart wrote:
             | But it's not common. So for the time being, redbean users
             | have the advantage.
        
               | krageon wrote:
               | If taking away normal users' agency is an advantage to
               | you, you go and use it.
        
           | jart wrote:
           | What's stopping them from clubbing you with a monkey wrench?
           | With bots, to answer your question, it'd probably take
           | another standard deviation in the IQ of the person using it.
           | So you've ruled out all the script kiddies in the world by
           | default. The purpose of this game isn't to have a perfect
           | defense, which is impossible, but rather to make the list of
           | people who can mess with you as short as possible.
        
         | BiteCode_dev wrote:
          | My laptop is lying all the time. I change my UA, preferred
          | language, my IP, MAC and so on, because of tracking, terrible
          | dev assumptions and personal preferences.
         | 
         | Yet, I'm a very good web citizen.
         | 
         | Because of this, I often have to solve the same captcha many
         | times before it thinks I'm human.
        
           | jart wrote:
           | I don't doubt it. Given how rare people like you are, I'm
           | sure a good citizen like you would also be perfectly fine
           | sending an email to the service asking to be whitelisted, or
           | having a second browser that isn't your daily driver for
           | situations like this that doesn't try to obfuscate its
           | identity by behaving like a bot.
        
             | krageon wrote:
             | I won't lie, if you make an asshole system that bans me for
             | doing perfectly normal things that make the internet work
             | I'm going to assume I don't want to interact with it
             | anyway.
        
               | jart wrote:
               | Then what would you propose that's better?
        
             | BiteCode_dev wrote:
             | The first solution isn't practical (so many services to
             | manually find a mail to send a message to, then interact
             | with a human that might not even exist), and if you do,
             | they don't whitelist you. I tried. Either they don't
             | answer, or have "no way to have a specific whitelist for a
             | single user in our system".
             | 
             | So the second browser is the solution. But then the site
             | will do all the bad things that I wanted it not to do in
             | the first place. Like serving terrible French results
             | instead of good English ones, or assuming Firefox doesn't
              | work based on UA while their site works fine with it. And of
             | course track me to death, sell my data, and so on.
             | 
              | The only solution that works is to choose services you pay
             | money for: they have your card, so they know you are not a
             | bot. For years now, I have been suspicious of anything
             | free. But it doesn't solve the tracking problem.
        
               | jart wrote:
               | Yes I understand the desire for capitalism rather than
               | surveillance capitalism, but that's a derailment. The OP
               | appears to be someone who just wants to build something
               | cool and share it with other human beings. In that case,
               | it's really helpful to be able to have a free practical
               | way to address abuse. Would you really tell someone like
               | the OP to stop expressing themself and shut down their
               | service and put a paid one in its place? How can you
               | charge for search when Google gives it away for free?
        
               | BiteCode_dev wrote:
               | I understand all causes and consequences of this problem,
                | and I'm not implying there is an easy solution, only
                | underlining that using "the user is lying" as a signal
                | will lead to frustrating false positives.
        
               | nottorp wrote:
               | > The OP appears to be someone who just wants to build
               | something cool and share it with other human beings.
               | 
               | Only thing is, you don't _know_ if that statement is
               | true. Or they could really be wanting to build something
               | cool but take advantage of all those  "free" services and
               | basically sell you to Google and Facebook.
        
         | viraptor wrote:
         | That's actually a terrible heuristic. My requests are often
         | from windows proxied by Linux, with language set to my
         | preferred one in a non-matching country. And that's before I
          | start travelling and using a hotspot with a faked TTL to
          | work around telco limitations. That's before you even get to
         | people completely unaware of interesting routing applied to
         | them (like corporate proxies, vpns) and people with incorrect
         | maxmind entries.
        
       | Test0129 wrote:
       | Not totally unrelated but I had to turn off email alerts and come
       | up with a way to summarize things because Fail2Ban and other
       | alert systems were hit quite literally every 15 seconds with port
       | scans/attempted entries on SSH and other ports. Reporting the
       | abuse to ARIN/ICANN didn't help because almost a full 95% of the
       | traffic originated from China, and 90% of the remaining 5% was
        | Russia. The remainder were zombies inside America, typically
        | on Digital Ocean, and I was able to get those handled
       | quickly and efficiently. When I had a simple (secure) login
       | system hosted on HTTPS it was getting hit hard enough my VPS ISP
       | was sending emails to figure out a way to stop it. There are
       | literally 3 people that even know of the existence of these
       | services.
       | 
       | It is actually nuts just how much bot spam there is.
        
       | elias94 wrote:
       | > has been upwards of 15 queries per second from bots
       | 
        | What type of queries are they generating? For what purpose are
        | they querying Marginalia? Scraping and filling internal search
        | engines?
        | 
        | > If anyone could go ahead and find a solution to this mess
        | 
        | I would maybe try to investigate why they are querying your
        | search engine. Is it for the search results? Maybe from there
        | you can create and sell an API service. Is it for the wiki? Is
        | it for research purposes?
       | 
       | I would love to see some data, raw or with some behavior derived
       | from it.
        
         | marginalia_nu wrote:
         | Most of the queries don't seem to be tailored toward my search
         | engine, they're ridiculously over-specified and typically don't
         | return any results at all.
         | 
         | As I've mentioned in another comment, my best guess is they're
         | betting it's backed by google, and are attempting to poison
         | their search term suggestions. The queries I've been getting
         | are fairly long and highly specific, often within e-pharma or
         | online casino or similarly sketchy areas.
         | 
         | Like
         | 
         | > cialis 50mg online pharmacy canada price
         | 
         | Either that, or nonsense like the below, where they appear to
         | be looking for CMSes to exploit (although I don't understand
         | the appendage at the end)
         | 
         | > "Please enter the email address associated with your User
         | account. Your username will be emailed to the email address on
         | file." Finestre Antirumore Torino
         | 
        | > affordable local seo services "Din epostadress delas eller
        | publiceras aldrig Obligatoriska fält är markerade med"
         | 
         | > "You are not logged in. (Login)" Country "City/Town" "Web
         | page" erst
         | 
         | Point is, none of these queries actually return anything at
         | all. I don't offer real full text search, for one. And the
         | queries are much too long.
        
       | shp0ngle wrote:
        | The actual blogpost aside: the marginalia search is the first of
        | these "alternative search engines" that I actually like.
        | 
        | Most of those have been either "worse google" or "utter trash"...
        | this one returns some interesting results for some queries I have
       | tried.
        
       | BiteCode_dev wrote:
       | It's not that bad.
       | 
        | First, of course, you have cloudflare and recaptcha, which are
        | free and very efficient, as the author says.
        | 
        | But even if you don't want to use them (some of my services
        | don't), most bots are very dumb:
        | 
        | - require JS, and you lose half of the bots
        | 
        | - silly tricks like hidden input fields in forms that worked in
        | 2000 still work in 2022. Use a bunch of them, and you can yet
        | again halve the bot traffic (see the sketch after this list).
        | 
        | - many URLs should have impossible-to-guess paths. E.g.: just by
        | changing the /admin/ URL to a UUID in Django, or /wp-admin/ in
        | WordPress, you avoid so many requests.
        | 
        | - bots are usually not tailored to your site, meaning if you
        | require JS, you can actually embed anti-bot measures in the
        | client code and they will work. E.g.: exponential backoff + some
        | heavy calculations after too many fast consecutive ajax requests.
       | 
       | - fail2ban + a few iptables rules (mitigate syn flood, etc) will
       | help
       | 
       | - varnish + redis gets you very far to shave excess dummy traffic
       | 
       | It's not great, but it's not an apocalypse.
       | 
       | Unless you are under targeted attack.
       | 
       | Then it sucks and you die.
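        | 
        | A minimal sketch of the hidden-field trick from the list above
        | (illustrative TypeScript; the field name is made up):
        | 
        |   // The form renders a field humans never see, e.g.
        |   //   <input name="website" style="display:none">
        |   // Real users leave it empty; naive bots fill everything in.
        |   function honeypotTripped(form: Record<string, string>) {
        |     return (form["website"] ?? "") !== "";
        |   }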
        
         | sparkling wrote:
         | Very nice list of countermeasures. I agree that doing these
         | small things like hidden input fields really go a long way.
         | 
         | I would add to that:
         | 
         | - block signups/comments from known throwaway email domains
         | 
         | - block known datacenter IP ranges, at least for POST requests.
         | Honestly on our sites 50% of spam was coming from AWS EC2 IPs
         | 
         | - use a proxy/vpn/bot detection service like https://focsec.com
        
           | Zak wrote:
           | Please don't make blocking VPNs plan A. Between snooping ISPs
           | and public wifi networks that have indiscriminate content
           | filters, I'm on a VPN about half the time. Many other
           | legitimate users are as well.
        
           | emptyparadise wrote:
           | But then you end up forcing people to use Gmail.
        
           | EdwardDiego wrote:
           | Yup, in adtech, "IP is an AWS block" was a bot 99.999% of the
           | time.
           | 
           | The 0.001% was that person using EC2 as a proxy or VPN
           | server.
        
             | mobilio wrote:
             | It's not only AWS. Also happens on Azure and GCP.
        
               | EdwardDiego wrote:
               | True, but at the time, 3 - 4 years ago, Azure and GCP IPs
               | were minimal.
               | 
               | Guess the fraudsters were vendor locked lol.
        
         | SyneRyder wrote:
         | > First, of course, you have cloudflare and recaptcha, which
         | are free and very efficient, as the author say.
         | 
         | Recaptcha has been almost useless, in my experience. If you
         | read the spam logs, you'll quickly learn about the spam
         | software they (claim to) use to bypass Recaptcha, because
          | that's what they end up promoting. I started tagging in logs
          | whether Recaptcha had validated on messages, and sure enough
          | these spam posts had all successfully passed it. Great
          | opportunity to rip
         | out more Google dependencies from my website.
         | 
         | I've found my own custom written filters to be vastly more
         | effective than Recaptcha.
         | 
         | Lots of the bots are running full Chrome with JS, lots of
         | HeadlessChrome being used lately. The fact that they're using
         | HeadlessChrome is something that makes them easy to detect,
         | ahem.
        
           | BiteCode_dev wrote:
            | Those are very specific bots; recaptcha will stop a lot of
            | casual ones. Most of them, in fact.
        
             | SyneRyder wrote:
             | That really hasn't been my experience, perhaps I'm just
             | getting hit more by the sophisticated bots than the naive
             | ones. I'm glad that it works for some people.
             | 
             | Recaptcha was also filtering out some legit humans (I
             | logged all posts regardless of captcha status to be
             | reviewed later), so it just wasn't worth reducing the user
             | experience when the captcha bot detection rate was so low.
        
         | BasiliusCarver wrote:
          | One of the things I've done before, among the other
          | suggestions, is to put a hidden link like
          | /followmeifyouscraping.html in the landing page to get a bit
          | of info about scraping volume. You can then use fail2ban
          | filters to block anyone who visits it, if you want (see the
          | sketch below).
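          | 
          | A minimal fail2ban sketch of that idea (the filter name and
          | paths are hypothetical; adjust the logpath to your setup):
          | 
          |   # /etc/fail2ban/filter.d/scraper-trap.conf
          |   [Definition]
          |   failregex = ^<HOST> .* "GET /followmeifyouscraping\.html
          | 
          |   # /etc/fail2ban/jail.local
          |   [scraper-trap]
          |   enabled  = true
          |   port     = http,https
          |   filter   = scraper-trap
          |   logpath  = /var/log/nginx/access.log
          |   maxretry = 1
          |   bantime  = 86400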
        
           | jiggawatts wrote:
           | I would add that link to robots.txt as an exclusion.
           | 
           | That way well-behaved search engines won't be affected, but
           | naive scrapers get auto-banned.
        
         | efitz wrote:
         | No, then you hide behind CloudFlare, because only the CSPs and
         | network operators have the infrastructure to deal with
         | volumetric attacks.
        
         | efitz wrote:
         | Also, attackers are rarely going to try to guess your URLs -
        | they're going to find them via Google or Shodan, or, if you're
        | a good REST citizen, via "/<yourapp>/"
        
           | bryanrasmussen wrote:
           | >Also, attackers are rarely going to try to guess your URLs -
           | 
           | because then the attack becomes DOS as they cycle through
           | dictionaries of words?
        
           | marginalia_nu wrote:
           | I get quite a lot of guesses in my logs, probing in /solr/
           | and so on (fairly pointlessly, I might add, as I run bespoke
           | software).
        
           | raverbashing wrote:
           | Any website gets probed for wp-admin.php etc, even if you
           | don't use WP
        
             | BiteCode_dev wrote:
             | In fact, if someone is probing for wp-admin, you should
             | insta ban them, no matter the site.
        
       | JimWestergren wrote:
       | I am running a website builder with > 20K sites. I use open
        | contact forms without captcha. What worked for me is a one-line
        | javascript that places the current timestamp in a hidden input
        | field that defaults to 0. Then I check on the backend: if the
        | value is either 0, or the time to fill out and send the form is
        | less than 4 seconds, I block it as spam. This stops more than
        | 99% of spam and also takes care of most human copy-paste spam.
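        | 
        | Roughly like this (an illustrative TypeScript sketch; element
        | and function names are made up):
        | 
        |   // Client, on page load, stamps the hidden field (default 0):
        |   //   document.querySelector<HTMLInputElement>("#ts")!
        |   //     .value = String(Date.now());
        | 
        |   // Server, on submit:
        |   function isSpam(tsField: string, now = Date.now()) {
        |     const ts = Number(tsField);
        |     return ts === 0 || now - ts < 4000; // sent in under 4s
        |   }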
        
         | naillo wrote:
          | I like this solution because spammers are unlikely to try to
          | get around it. A delay eats into their time budget, and they
          | can't introduce a human-like waiting time on every site they
          | try to spam; better to just move on and find cheaper targets.
        
           | walls wrote:
           | You could just decrease the timestamp instead of actually
           | waiting.
        
             | naillo wrote:
                | I meant for general spammers who go after tons of sites
             | mostly blind. I agree it would not help for a targeted
             | attack.
        
               | JZerf wrote:
               | I'm already using this timestamp technique on my website
               | and so far no bot operator has bothered trying to work
                | around this. However, even if some bot operator were to
               | specifically target a website using this technique and
               | try to decrease the timestamp, I believe you could still
               | force a bot to wait by just changing the website to use
               | something like a cryptographic nonce that includes a
               | timestamp instead of just a simple timestamp that can be
               | understood easily.
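                | 
                | A minimal sketch of that signed-timestamp idea
                | (illustrative TypeScript for Node; the secret and
                | token format are assumptions):
                | 
                |   import { createHmac } from "node:crypto";
                | 
                |   const SECRET = "server-side-secret";
                | 
                |   // Embed "ts.sig" in the form; bots can read the
                |   // timestamp but can't forge an older one.
                |   function mint(now = Date.now()): string {
                |     const sig = createHmac("sha256", SECRET)
                |       .update(String(now)).digest("hex");
                |     return `${now}.${sig}`;
                |   }
                | 
                |   function isValid(token: string, now = Date.now()) {
                |     const [ts = "", sig = ""] = token.split(".");
                |     const expect = createHmac("sha256", SECRET)
                |       .update(ts).digest("hex");
                |     return sig === expect && now - Number(ts) >= 4000;
                |   }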
        
         | robalni wrote:
         | If you don't want to require users to run javascript you should
         | be able to make the server generate the timestamp.
        
           | JimWestergren wrote:
            | I used to do it with PHP but the problem is that then you
            | can't cache the HTML (varnish or other solutions).
            | Javascript doesn't have that problem, and it has the added
            | benefit of stopping bots that don't run javascript.
            | 
            | In the error message I have a friendly text telling humans
            | to turn on javascript if it is off, and a <a
            | href="javascript:history.go(-1);">Go back and try again</a>
            | so they don't lose the text that they have typed.
        
           | mariusor wrote:
           | How do you do that, without bots being able to circumvent the
           | feature?
        
             | sschueller wrote:
              | You could generate a CSRF token or something similar in a
              | hidden field based on a JWT token (yes I know) on the
             | server side. The JWT token can either contain some
             | timestamp after which it's valid or the time it was
             | created.
        
             | robalni wrote:
             | People will be able to write programs that circumvent the
             | feature but that's also true for the javascript solution.
             | The point of it was that it gets rid of most spam because
             | most bots fill in the form faster than 4 seconds and are
             | not made to circumvent this feature.
        
             | Aachen wrote:
             | Bots can also circumvent this JS thing, so it's the same
              | either way.
              | 
              |   <?php echo '<input type=hidden name=starttime value='.time().'>';
              | 
              | On submit:
              | 
              |   <?php if (time() - $_POST['starttime'] < 4) die('2fast4me');
             | 
             | Revealing the error condition (that it was submitted too
             | fast) is nice for users and bots alike, of course. Up to
             | you.
             | 
             | I've had websites where I was too fast in submitting a form
             | before. Not any kind of anti-spam, just their server was so
             | fricking slow that I had input the date (iirc it was a
             | reservation system) and clicked next before the JS blobs
             | had finished triggering each other and fully loaded. It
             | would break the page somehow with no visual indication. I
             | found out by looking in the dev console and noticing stuff
             | was still loading in the background. How normal people are
             | able to use the Internet with how often I need the dev
             | console to do entirely ordinary things is a mystery to me.
        
         | JZerf wrote:
         | I also use essentially the same technique (although I have the
         | server generate the timestamp instead of using JavaScript) on
         | my website and concur that this is a highly effective technique
         | for blocking bot submissions.
        
       | js4ever wrote:
       | "There has been upwards of 15 queries per second from bots. There
       | is just no way to deal with that sort of traffic, barely even to
       | reject it."
       | 
       | What??? My phone can serve that easily, any modern server can
       | handle 50-250 rps
        
         | marginalia_nu wrote:
         | This is queries per second (as in I run a search engine), not
         | requests per second.
        
       | Joel_Mckay wrote:
       | For small sites, I would just use a simple firewall:
       | 
       | 1. whitelist the finite IP ranges for the regional ISPs/country
       | where you do business
       | 
       | 2. blacklist the proxy and tor exit nodes
       | 
        | 3. blacklist published lists of compromised servers (e.g. via
        | ipset; see the sketch below)
       | 
       | 4. add spamhaus blacklists
       | 
       | 5. add fail2ban rules to trip on common server security scans,
       | and unused common service ports
       | 
       | 6. publicly reply to those having access issues, and imply they
       | have bad neighbors.
       | 
       | This will often take care of 99% of the nuisance traffic, but I
       | still recommend live monitoring traffic regularly. ;)
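        | 
        | A minimal ipset sketch for the blacklist steps (the set name
        | and example range are hypothetical):
        | 
        |   # Create a set and feed it ranges from your blocklists:
        |   ipset create badnets hash:net
        |   ipset add badnets 203.0.113.0/24
        | 
        |   # A single iptables rule then covers the whole set:
        |   iptables -I INPUT -m set --match-set badnets src -j DROP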
        
         | philprx wrote:
         | Tor users are often legitimate good internet citizens.
         | 
          | A lot of us (the lucky ones) have the luxury of living in
          | real democracies.
         | 
         | Some others live in countries that use every single aspect of
         | their private lives (DPI, mass surveillance) to put pressure on
         | them and bend them to the regime's will.
         | 
         | In my opinion, Tor and anonymity should not be killed as a
         | result of silly bots.
        
           | Joel_Mckay wrote:
           | Your opinion is duly noted, and I agree most knowledge should
           | be equally accessible to give everyone a chance to grow.
           | 
           | That being said, a commercial site owes nothing to
           | financially irrelevant bandits, sociopaths, or shills.
           | 
           | Try it for a week, and then weigh the liability again. ;)
        
           | RL_Quine wrote:
           | > Tor users are often legitimate good internet citizens.
           | 
           | We have had exactly zero traffic from it at any point which
            | was legitimate. Any user who ever showed up with an exit IP
            | ended up being banned eventually, so we just proactively
           | fraud banned anybody who uses one, and anybody that was
           | related to them. There is zero value in allowing anonymizer
           | traffic on your service, and a whole lot to lose.
        
       | TekMol wrote:
       | Crypto currency mining could be the solution.
       | 
       | If one request to the site generates more revenue than it costs
       | in resources, the bot problem is solved.
       | 
        | The author says that he is getting 15 bot requests to his site
        | per second. That is about 39 million requests per month. How
        | much does it cost to serve those? $1000 would seem high.
        | 
        | $1000/39M = about $0.00003 per request.
       | 
       | How long would a crypto currency, that is suitable for mining in
       | the browser, need to be mined before $0.00003 is generated?
       | 
        | If it turns out it is a few seconds or so, the solution would be
        | nicely user friendly. A few seconds of CPU time for access to
        | the site. No ads needed to finance the site.
        | 
        | It is kind of telling that Bitcoin started as a spam blocker.
        | The original "hashcash" use case was to use proof of work to
        | prevent email spam.
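        | 
        | A minimal proof-of-work sketch in that hashcash spirit
        | (illustrative TypeScript; the difficulty is an example):
        | 
        |   import { createHash } from "node:crypto";
        | 
        |   // The client grinds nonces until the hash has enough
        |   // leading zeros; the server verifies with a single hash.
        |   function solve(challenge: string, difficulty = 5): number {
        |     for (let nonce = 0; ; nonce++) {
        |       const h = createHash("sha256")
        |         .update(challenge + nonce).digest("hex");
        |       if (h.startsWith("0".repeat(difficulty))) return nonce;
        |     }
        |   }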
        
         | GeckoEidechse wrote:
          | As much as I hate the whole cryptocurrency hype myself, I think
          | I agree that a proof-of-work requirement on spam detection that
          | pays in the host's favour could help solve spam to some degree.
        
           | endgame wrote:
           | Before bitcoin, there was hashcash, which aimed to do exactly
           | this: http://www.hashcash.org/ . The original bitcoin paper
           | cites it, in fact.
        
         | kragen wrote:
         | Satoshi Nakamoto almost certainly isn't Adam Back.
         | 
         | It might be enough for the request to require more resources
         | from the requestor than from the server, even if it doesn't
         | actually give the server any money. I mean the requestor
         | probably isn't going to be willing to dedicate more hardware
         | "horsepower" to taking your search engine down than you are to
         | keeping it up. That was the idea behind Hashcash.
         | 
         | As for coins, the current Bitcoin hashrate is about 200
         | exahashes per second, down from a high of over 250 a couple of
         | months ago, and the block reward is 6.25 BTC until probably
         | June 02024. At a price of US$24000/BTC that's US$150k per block
         | (plus a much smaller amount in transaction fees) or about
         | US$1.25e-18 per hash. So your suggestion of US$3e-5 would
         | require about 2e13 hashes. https://en.bitcoin.it/wiki/Non-
         | specialized_hardware_comparis... says an overclocked ATI Radeon
         | HD 6990 can do about 800 megahashes per second (8e8) so you're
         | looking at about 3e4 seconds of compute on that card, about 8
         | hours.
         | 
         | Maybe one of the altcoins that uses a hash function with a
         | smaller ASIC speedup would be a better fit, although I don't
         | know enough about mining to know if there are any where GPUs
         | are still competitive. Still, it seems like it might be more
         | than a few seconds?
        
           | TekMol wrote:
            | I have not yet seen any arguments why Satoshi is not Adam.
           | 
           | I did say "crypto currency, that is suitable for mining in
           | the browser" for exactly this reason: That Bitcoin is not
           | well suited for it.
           | 
           | One would have to look at what typical consumer hardware is
           | good at. Maybe an algorithm that saturates one CPU core with
           | serial calculations that need fast access to exactly 1GB of
           | RAM. I think consumer hardware is pretty good when it comes
           | to single core performance and RAM access.
        
             | swinglock wrote:
             | You'd only need lightning integration and paying a small
             | amount of sats, recycling the Bitcoin proof of work instead
             | of making more. You could even have the server transfer the
             | same sats back and forth as a token as long as the server
             | side is happy.
        
               | TekMol wrote:
               | "only"
               | 
                | You will not get your visitors to buy Bitcoin and set up
                | a lightning wallet to visit your website.
               | 
               | But having some JS on your site that crunches numbers for
               | 2 seconds before the user can progress would work.
        
               | swinglock wrote:
               | Well it would "just" have to be integrated in browsers.
               | :)
               | 
               | You don't have to buy it, you could crunch numbers for an
               | equivalent cost if that's preferable. The advantage is
               | that the effort can be stored and used later, so you need
               | not even add a 2 second latency. Similar to "Privacy
               | Pass".
        
       | jb1991 wrote:
       | > They're a major part in killing off web forums,
       | 
       | I've noticed that a lot of old popular forums disappeared in
       | recent years, but I didn't realize it was possibly due to bots.
       | Why is that? I assumed that the admins just got tired of running
       | them and moderating them.
        
         | marginalia_nu wrote:
          | It's more complicated than just bots; competition from Reddit
          | is another factor. But bot traffic was certainly a significant
          | part of the problem: the constant drive-by exploits and
          | ceaseless comment spam drove up the amount of work needed to
          | operate a forum as a hobby to basically a full time job. With
          | waning visitor numbers, it simply became untenable.
        
       | SyneRyder wrote:
       | Really glad to see someone finally talking about this.
       | 
       | Does anyone know what's going on with that "Duke de Montosier"
       | spam botnet? It accounts for more than half of the botspam
       | attacks on my sites, and I can't find anyone talking about it
       | online anywhere, except one tweet dating back to mid-2021. It's
       | identifiable by several short phrases that it posts:
       | 
       |  _Duke de Montosier_
       | 
       |  _for Countess Louise of Savoy_
       | 
       |  _Testaru. Best known_
       | 
       | And cryptic short posts that can assemble into creepy sequences:
       | 
       |  _Europe, and in Ancient Russia_
       | 
       |  _Century to a kind of destruction:_
       | 
       |  _Western Europe also formed_
       | 
       |  _and was erased, and on cleaned_
       | 
       |  _only a few survived_
       | 
       |  _number of surviving European_
       | 
       |  _55 thousand Greek, 30 thousand Armenian_
       | 
       | Many of the IPs involved seemed to be in Russia, China and Hong
       | Kong, though they're coming from all over (eg European & US VPNs,
       | Tor exit nodes). From tracking the IPs on AbuseIPDB, the weird
       | spam posts seem to be just one layer, while behind the scenes it
       | also attempts SMTP Auth and IMAP attacks on the server.
       | 
       | I'm eager to know more if anyone knows, and especially if anyone
       | is trying to shut this thing down. But I can't find anyone even
       | talking about it. (Maybe there's a reason for that?)
        
         | marginalia_nu wrote:
         | How very numbers station of them.
         | 
         | I've seen it suggested that botnets use comment fields for
         | command and control, maybe something like that?
        
           | SyneRyder wrote:
           | My theory for the phrases above is that they're a "unique
           | seed" used to identify sites that are easily compromised. Do
           | a web search, find a website filled with "Duke de Montosier"
           | comments - bingo, you've identified an easy website to target
           | with your backlink comment spam. Or, more maliciously, a
           | website that is easy to thoroughly compromise with
           | vulnerabilities. But that's just my current theory.
           | 
           | Here's the one tweet I found in Swedish about the comment
           | spam botnet, and it dates back to February 2021. She's the
           | only person I could find who has mentioned it in public. Or
           | maybe my search skills are failing me.
           | 
           | https://twitter.com/aureliagu/status/1357368329573400578
        
         | bombcar wrote:
         | I suspect some of these are "bots sold for hire" where they
         | make money selling the bot to people, many of whom don't know
         | how to use it and run it with the default config.
         | 
         | I've found spam email that certainly is the above, because it
         | has things like PUT_LINK_TO_STORE_HERE and other variables that
         | obviously weren't updated in the config file.
        
         | prepend wrote:
         | I assume it's time travelers trying to post enough so their
         | message persists.
        
       | Kiro wrote:
       | Coin Hive was an interesting solution before it became synonymous
       | with crypto jacking. In order to post a comment you had to lend
       | your CPU to mine for X seconds. The only true anonymous and
       | frictionless micropayment system I've seen.
        
       | Nextgrid wrote:
       | > The only ones that can survive the robot apocalypse is large
       | web services. Your reddits, and facebooks, and twitters, and
       | SaaS-comment fields, and discords. They have the economies of
       | scale to develop viable countermeasures, to hire teams of people
       | to work on the problem full time and maybe at least keep up with
       | the ever evolving bots.
       | 
       | I only agree when it comes to the system resources that can keep
       | up with bots. When it comes to fighting spam, these services
       | often do a terrible job because 1) their business model benefits
        | from higher user & engagement numbers and 2) their monopoly
        | status lets them retain users even if the experience is
        | degraded by the spam, something a small site often won't be
        | able to do.
        
       | Avamander wrote:
       | It's annoying for sure. I deal with abuse at a large scale.
       | 
       | I'd recommend:
       | 
       | - Rate-limit everything, absolutely everything. Set sane limits.
       | 
       | - Rate-limit POST requests harder. Preferably dynamically based
       | on geoip.
       | 
       | - Rate-limit login and comment POST requests even harder. Ban IPs
       | that exceed the amount.
       | 
       | - Require TLS. Drop TLSv1.0 and TLSv1.1. Bots certainly break.
       | 
        | - Require SNI. Do not reply without SNI (nginx has the 444
        | return code for that; see the sketch after this list). Ban IPs
        | on first hit that connect without it. There's no legitimate use
        | and you'll also disappear from places like Shodan.
       | 
       | - If you can, require HTTP/2.0. Bots break.
       | 
        | - Ban IPs listed on StopForumSpam, ban destination e-mail
        | addresses listed there. If possible also contribute back to SFS
       | and AbuseIPDB.
       | 
       | - Collect JA3 hashes, figure out malicious ones, ban IPs that use
       | those hashes. This blocks a lot of shit trivially because
       | targeting tools instead of behaviour is accurate.
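        | 
        | For the SNI and rate-limit points above, a minimal nginx
        | sketch (zone names and limits are examples, not
        | prescriptions):
        | 
        |   # Default server catches clients without a matching SNI:
        |   server {
        |       listen 443 ssl default_server;
        |       ssl_reject_handshake on;  # nginx >= 1.19.4
        |   }
        | 
        |   # Allow short bursts but throttle sustained request rates:
        |   limit_req_zone $binary_remote_addr zone=perip:10m rate=2r/s;
        |   server {
        |       location / { limit_req zone=perip burst=20; }
        |   }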
        
         | [deleted]
        
         | andai wrote:
         | >Bots break.
         | 
         | Wonder if you could respond in a way to get them to crash, or
         | even better, to hang indefinitely.
        
         | neurostimulant wrote:
          | I'm sure these would work, but I'd probably get banned too,
          | just because I often try to poke IP addresses directly. I also
          | often use a VPN, especially when outside, so I'd definitely
          | get banned.
        
           | Avamander wrote:
           | VPNs tend to be smaller offenders in terms of clients-per-IP
           | than say educational institutions or offices.
        
         | 1vuio0pswjnm7 wrote:
         | "This spam traffic is all from botnets with IPs all over the
         | world. Tens, maybe hundreds of thousands of IPs, each with a
         | relatively modest query rates, so rate limiting does all of
         | bupkis."
        
           | Avamander wrote:
           | Yep, there isn't a silver bullet that curtails all abuse.
        
             | shiftpgdn wrote:
             | Blocking the entire aws/gcp/azure/digital ocean/linode IP
             | ranges will stop 99.999% of malicious bot traffic full
             | stop.
        
               | CWuestefeld wrote:
               | Yes, it would.
               | 
               | It would also stop a not-insignificant number of my
               | customers.
        
               | Plasmoid wrote:
               | Why do your customers pay Amazon for egress to the
               | internet? Isn't that very expensive?
        
         | CodeSgt wrote:
         | > Rate-limit login and comment POST requests even harder. Ban
         | IPs that exceed the amount
         | 
         | Don't ban IPs. Or if you do, let the ban expire relatively
         | quickly (days/weeks, not months/years).
        
           | Avamander wrote:
           | Ideally you'd keep track of repeat offenders and decide the
           | length based on that.
        
           | LinuxBender wrote:
           | Or at least rate limit session cookies. If a person does not
            | have a session cookie, rate limit by IP. If they are
            | authenticated as a unique person, apply different rate limits
            | and different levels of authentication. HAProxy can do
           | different rate limits by ACL conditions.
           | 
           | Or instead of strictly rate limiting, ask them a question
           | that can't be "looked up" in a table and that requires human
           | thought, philosophy, emotion, ethics. Maybe GPT could
           | eventually adapt to this and in that case fall back to IP
           | rate limiting and grow the set of questions.
        
           | annoyingnoob wrote:
           | I ban IPs from small data centers all the time. For my
           | purposes there is no need to support traffic from small
           | hosting providers that are everywhere all over the world. I
           | do not tend to ban the IPs of commercial ISPs that provide
           | service to end users.
        
             | rndgermandude wrote:
             | You will probably ban a lot of VPN users as collateral
             | damage. VPN providers often use these small and relatively
             | cheap providers for their endpoints.
             | 
             | You may be fine with banning those VPN users, or even want
             | that - lots of bots will try to hide behind "legitimate"
             | VPNs - but one has to be aware of this consequence at
             | least, especially considering that more and more people
              | seem to use them - probably also thanks to the aggressive
              | "sponsoring" certain providers such as ExpressVPN do on
              | e.g. a wide variety of youtube videos.
        
               | annoyingnoob wrote:
               | It depends on what you are trying to protect I suppose.
               | Banning OVH IPs (and others) cleared up a lot of issues
               | for me. I don't miss them, but sure you might.
        
         | superkuh wrote:
         | > Require TLS. Drop TLSv1.0 and TLSv1.1. Bots certainly break.
         | 
         | So will people who run older computers with older software. But
         | I guess people who don't have money don't matter for commercial
         | websites so screw 'em.
        
           | Avamander wrote:
           | I don't think there's much web you can visit with those
           | browsers anyway. Windows XP with IE and no SNI support, maybe
           | sites from that era without JavaScript would work?
        
             | superkuh wrote:
              | I think you'd be surprised how recent a browser can be
              | and still lack a client/server cipher overlap once you
              | start whittling down what TLS versions you accept. Just at
             | the start of the pandemic many government sites had to re-
             | enable early TLS because so many people couldn't access
             | their recent TLS only sites.
             | 
             | But yeah, corporate employees aren't going to care about
             | those people. Governments have to. And human persons
             | building personal websites should too.
        
           | Kalium wrote:
           | Anything that got a significant update in the past ten to
           | twelve years will support TLS 1.2. The window of systems that
           | would support 1.1 but not 1.2 is pretty small. You have to go
           | all the way back to IE on Windows XP before a lack of 1.2
           | support becomes an issue.
           | 
           | So, yeah. You're absolutely right. In a lot of cases the loss
            | of revenue from users with severely outdated software will be
            | less than the cost decrease from cutting spam and abuse.
           | 
           | This gets back to an old question - to what degree should
           | legacy systems be supported and at what level of expense?
           | There's no one easy answer that works for everyone.
        
         | jmt_ wrote:
         | I'm not very familiar with all the workings of HTTP/2.0 - why
         | would it break bots? Assuming no CloudFlare type protection,
         | does it somehow stop someone from using curl to get (non-JS
         | generated) content? Does it thwart someone accessing the site
         | from something like playwright/selenium?
        
           | Avamander wrote:
           | > I'm not very familiar with all the workings of HTTP/2.0 -
           | why would it break bots?
           | 
           | There's a lot of outdated garbage bots out there. Not using
           | HTTP/2.0 is also often the default with various HTTP
           | libraries.
        
             | jmt_ wrote:
             | So it just comes down to bot software not being compatible
             | with HTTP 2.0 rather than any sort of HTTP 2.0 specific
             | mechanism/feature?
        
               | ruuda wrote:
               | Yes
        
         | troad wrote:
         | In other words, make your website unusable for people who have
         | to connect through VPNs or public networks, difficult for
         | anyone without a stable Western broadband connection, and
         | unpleasant for everyone else.
        
           | hbn wrote:
           | Google search seems to have this issue for most of the
           | regions near me on the VPN I use (Private Internet Access)
           | 
            | Sometimes I just turn it off if I have to fire off a few
            | searches, because otherwise it'll make me complete a long,
            | tedious captcha for EVERY search
        
           | Avamander wrote:
            | With a bit of work, any limits can usually be fine-tuned not
            | to impact actual users behind NATs. Some collateral does
           | happen but that's an unfortunate reality. I'd like you to
           | elaborate on the rest of your comment though.
        
             | dalbasal wrote:
             | I agree that there's a "that's life" aspect to
             | collateral/tradeoff.
             | 
              | That said, I sympathize somewhat with the parent. "Done
              | right, negative side effects are minimal" is not a
              | comforting statement. First, because things are often not
              | implemented correctly. There are a lot of details and
              | tuning that will often fail to materialize in practice.
              | Second, because long-tail usability issues can often go
              | overlooked. The abuse -> anti-abuse feedback loop is
              | pretty tight: abuse gets identified and counteracted. The
              | anti-abuse -> UX-problems loop tends to be noticeably
              | looser. Often, it's just aggregates (revenue/AUD/etc).
        
         | gkbrk wrote:
         | > If you can, require HTTP/2.0. Bots break.
         | 
         | Non-bots break as well. I have Firefox configured to use
         | HTTP/1.1 only.
         | 
         | No reason to chase Google's standard-of-the-day, HTTP/1.1 has
         | worked for ages and it will continue to do so for the
         | foreseeable future.
        
           | Avamander wrote:
           | Some old browsers break as well, if it's worth it depends on
            | the website. It's your prerogative to disable a useful
            | feature, just as you can disable JavaScript. But there's little
           | reason for a website operator to cater to that unnecessary
           | edge case if it's mostly used for abuse.
        
             | NullPrefix wrote:
             | JavaScript is mostly used for abuse.
        
               | Avamander wrote:
               | Not in that direction though.
        
           | bsuvc wrote:
           | That seems like a strange reason to me. Isn't HTTP/2.0
           | faster? Isn't it also basically transparent to the end user?
           | 
           | I'm trying to figure out what I would gain by configuring my
           | browser to use HTTP/1.1 only.
        
           | dspillett wrote:
           | If doing something fends off a lot of bots, but also
           | inconveniences a very small number of people who have
           | significantly non-standard or just out-of-date
           | configurations, I'm likely to favour protecting myself from
           | the former over worrying about the latter. To paraphrase Mr
            | Spock: The inconveniences of the me outweigh the
           | inconveniences of the you!
        
             | bbarnett wrote:
              | Bear in mind, inconveniencing 4.8% of users does not map
              | identically.
             | 
             | Instead, you are often dumping 4.8+4.8+4.8 as you add block
             | methods, with some overlap.
        
               | marginalia_nu wrote:
               | To be fair, most of my visitors are not exactly lining up
               | with the expectations of "standard". I get >90% of my
               | [human] traffic from desktop clients, for example.
        
               | bbarnett wrote:
               | Sure, but the logic about mitigation does hold true, if
               | you overlap methods.
               | 
                | Eg, your method described in the prior post, along with
                | things which may lock out VPN or NAT users.
               | 
               | Just something to consider.
        
           | sirshmooey wrote:
           | Genuinely curious, why disable HTTP2? Your web browsing must
           | be awfully slow sans multiplexing.
        
             | gkbrk wrote:
             | > why disable HTTP2
             | 
             | Because it adds nothing to improve my browsing experience,
             | and reducing the number of protocols supported by my
             | browser from 3 to 1 also reduces the attack surface.
             | 
             | > Your web browsing must be awfully slow sans multiplexing.
             | 
             | And yet it's not slowed down at all. How many different
             | resources must a web page use before it feels slow on a
             | connection pool of keep-alive TCP sockets? Maybe people
             | visit some wild experimental web pages with hundreds of
             | blocking <src> tags that are not bundled/minified?
             | 
             | Either way, my experience is it doesn't slow anything down
             | when I use both websites (forums, resources, youtube,
             | social media) and web apps (banking, maps, food delivery
             | etc).
        
               | dboreham wrote:
               | Perhaps your experience is the same, but it may impose
               | extra load on middleboxes that track TCP flows.
        
             | stonemetal12 wrote:
             | https://github.com/dalf/pyhttp-
             | benchmark/blob/master/results...
             | 
              | HTTP2 is barely any better than HTTP 1; if you want it to
              | make a 1/10 of a second difference you have to be making
              | 100s of requests.
        
         | ajsnigrutin wrote:
         | > - Rate-limit everything, absolutely everything. Set sane
         | limits.
         | 
         | This breaks when multiple users are behind the same IP. I've
         | seen services fail even in classroom, because the prof did
         | something and a few tens of students followed (captchas
         | everywhere).
        
           | icelancer wrote:
            | Yup. This happened to us when we had rate limiting turned on
            | for our sites and ran off-site events at hotels, for example -
           | then the hotel's IP got temp banned and our sales engineers
           | would complain, rightfully so.
        
           | winternett wrote:
           | Black hats always find ways around rate limiting, and that's
           | why they are more prevalent than actual users. People can
           | literally run click farms with cheap 4g cell phones that
           | artificially pump anything they want without consequences,
           | while authentic posters that simply run 2 necessary accounts
           | are penalized if they post regularly.
           | 
           | The only real way to properly police Internet communities is
           | to keep them smaller so that botting is more obvious, and to
           | involve carefully managed moderation. Reddit tried this, but
           | also lost track of the human factors involved and now
           | moderators collect side money and promote their own posts
           | artificially.
           | 
            | The main problems facilitating the surge in bots are
            | scammy creator funds and all the other measures sites
            | take to boost their profit and market dominance. They
            | have grown far too big and can no longer effectively
            | manage their user bases. Things weren't meant to be this
            | way at all: the excessive quest for market dominance and
            | profit has thoroughly corrupted the freedom of
            | information online in business, and now many users are
            | following the same road map.
        
           | [deleted]
        
           | CWuestefeld wrote:
           | For sure! Our site is B2B ecommerce, and any sizeable
           | customer has all their users coming to us from a single NAT
           | or proxy. For major customers it's likely that there are
           | several of their employees using our system at any given
           | time.
           | 
           | The answer needs a whole lot more finesse than this.
        
           | marginalia_nu wrote:
           | (Author)
           | 
            | I do in fact rate-limit everything, and it is good
            | advice, but the way I implement rate-limiting allows for
            | traffic bursts. It's basically a reverse leaky bucket,
            | where you start out with N allowed requests, which get
            | depleted with each request and refilled slowly over time.
            | 
            | Search traffic is fairly bursty: people do a few requests
            | where they tweak the query, and then they go away.
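            | 
            | A minimal sketch of that refill-style limiter in Python
            | (the capacity and refill rate here are made-up values,
            | not the actual settings):
            | 
            |   import time
            | 
            |   CAPACITY = 30         # burst allowance (illustrative)
            |   REFILL_PER_SEC = 0.5  # tokens restored per second
            | 
            |   buckets = {}  # ip -> (tokens, last_seen)
            | 
            |   def allow(ip: str) -> bool:
            |       now = time.monotonic()
            |       tokens, last = buckets.get(ip, (CAPACITY, now))
            |       # refill in proportion to elapsed time, capped
            |       tokens = min(CAPACITY,
            |                    tokens + (now - last) * REFILL_PER_SEC)
            |       if tokens < 1:
            |           buckets[ip] = (tokens, now)
            |           return False  # rate limited
            |       buckets[ip] = (tokens - 1, now)
            |       return True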
        
             | RamRodification wrote:
             | Off-topic, but isn't that a normal (non-reverse) leaky
             | bucket? When the bucket gets full the rate limiting
             | engages. An empty bucket allows for a burst without getting
             | full. It slowly leaks over time at a rate that allows a
             | normal amount of traffic without filling up.
        
               | megous wrote:
               | To me, it's a bucket that's being filled at constant rate
               | from a tap until it's full, and the traffic requires
               | taking some water from the bucket. If there's no water,
                | the traffic has to be dropped or wait in a queue.
                | 
                | Basically, you can look at it either way.
        
               | RamRodification wrote:
               | I like it. I guess it's one of those things where it
               | depends on the example used when you learned about it?
               | For me it was some Nginx guide on rate limiting and I
               | think they described it in the way I see it.
        
               | marginalia_nu wrote:
               | Hmm, yeah, that's actually true now that I think about
               | it.
        
           | Avamander wrote:
           | A "sane limit" wouldn't be "one person one IP", such a global
           | limit should rather stop one IP (even if it's a nasty CGNAT)
           | from having a negative impact on the entire service.
           | 
           | If such a limit would hinder classroom usage, but that's your
           | target audience then other solutions should be found, fairly
           | logical.
        
       | GeckoEidechse wrote:
       | I wonder if a general solution could be to make the visit more
       | computationally demanding to the visitor than to the host, e.g.
       | some form of proof-of-work. I guess captchas already do that in
       | some sense but they require the humans to do the work.
       | 
        | Now, the author above has stated they dislike the crypto
        | route, and I agree that the whole web3 idea is BS. But what
        | if, when spam of some form is detected by the server, it
        | required the visitor to show some proof-of-work, combined
        | with the "mining crypto in JS instead of ads" craze? That way
        | the bot would need to put work in, which would slow it down,
        | and at the same time it would pay for its own visit.
        | 
        | Of course, no spam detection system is perfect, and it would
        | also hit human users, but in their case it would just mean
        | waiting a few seconds longer for the page to load.
        
       | AviationAtom wrote:
       | It's funny the author mentions Facebook and Twitter, because the
       | bot spam on both is quite apparent. The spam on the former has
       | risen greatly, seemingly mostly from India and parts of Africa.
       | Scam air duct cleaning posts, posts about hacked account
       | recovery, and random other crap. It really degrades the
       | experience of the Internet, IMHO.
        
       | kazinator wrote:
        | In the 1980's, we kept anklebyters off dial-up BBSes with a
        | simple technique: voice validation. To join the forum, you
        | had to fill out an application first, which included your
        | real name and phone number. The sysop would give you a call
        | for a quick chat, and then grant you access if you didn't
        | seem like a twit.
       | 
       | This would be entirely practical for some small-time operator
       | trying to run a forum off residential broadband, while
       | impractical for the reddits, facebooks and twitters.
        
         | OliverJones wrote:
         | "anklebyters"! I learned a useful new word today. Thanks.
        
           | [deleted]
        
         | bluedino wrote:
         | I remember some BBS registration forms where you would have to
         | give the names of a couple existing users that would vouch for
         | you. Kind of like other sites where you need an invite or
         | referral from an existing member.
        
         | s1k3s wrote:
          | I guess it's a different time, and it also depends on who
          | your target audience is. Some people go crazy if you ask
          | for their email address. Phone numbers and calling are a
          | big no-no.
        
           | mjevans wrote:
           | I'm one of those radical militants who refuses to give up any
           | means of direct contact...
           | 
           | However for a small scale thing I'd gladly go visit at a face
           | to face meetup to fulfill this type of validation.
        
             | shaburn wrote:
              | What if they came to you? What is the imputed value of
              | that network connection relative to the cost...?
        
             | nottorp wrote:
             | > However for a small scale thing I'd gladly go visit at a
             | face to face meetup to fulfill this type of validation.
             | 
             | Even if it were 3 flights totalling 18 hours away? :)
             | 
             | Or even just from one coast of the US to another...
        
               | mjevans wrote:
               | Someone that far away shouldn't want my direct contact
               | information to join a group.
               | 
               | However there is a medium / large organization case,
               | where each area has local 'chapters' or some other term
               | for a small fragment of the larger group. In that case
               | the local leaders each operate as a small group for their
               | areas.
        
           | easrng wrote:
           | You could schedule a voice-only jitsi or some other kind of
           | call that doesn't need an email or phone number.
        
           | mnd999 wrote:
           | Phone number is an excellent tracking identifier across
           | services. Even better than email, which is why the data
           | hoarders want it.
        
       | figmaheart255 wrote:
       | While clearly bot spam is on the rise, we need to be _very
       | careful_ on how we choose to deal with it. Cloudflare has already
       | introduced  "proof-of-Apple" [1], where proven Apple devices get
       | special treatment, bypassing captchas. Later we might see
       | websites that are _only_ accessible via Google, Microsoft, or
        | Apple devices. If we continue down this path, we'll end up with
       | a social credit system ruled by big tech.
       | 
       | [1]: https://news.ycombinator.com/item?id=31751203
        
         | kube-system wrote:
         | We basically already have "social credit" systems, we just call
         | them anti-fraud/anti-spam/reputation scores.
        
       | boredumb wrote:
       | I get a ton of spam from my contact me pages even with a captcha
       | in place, i've been experimenting with loading an initial dummy
       | form and replacing it within a few seconds of loading to the real
       | deal which seems to have cut down on bots submitting stuff.
       | 
       | Rate limit everything you can and use a captcha where acceptable,
       | there are also a load of public IP and email blacklists that you
       | can use to run a quick check. Working in a field where there is a
       | large amount of bots and incentive to abuse we invest quite a bit
       | of time and money in fraudulent traffic detection using a
       | cornocopia of different services in tangent and at the end of the
       | day we still see a small percentage of traffic getting through
       | that is fantastically human like.
       | 
        | With that out of the way: I've been engulfed in AI and GPT-3
        | functionality lately, and I thought this post was going to be
        | doomsaying the coming apocalypse of bot spam, because the
        | level of human-like quality coming from the AI is going to
        | make (and already has made) deciphering human vs. bot
        | traffic/posts/emails/comments nearly impossible. It will be
        | fun soon when we see forums entirely dedicated to bots
        | conversing and arguing with each other outside of reddit.
        
         | ComputerCat wrote:
         | Same! The captcha doesn't seem to be able to slow down the
         | bots. Inbox is still getting flooded with spam.
        
         | [deleted]
        
       | golergka wrote:
       | > If Marginalia Search didn't use Cloudflare, it couldn't serve
       | traffic. There has been upwards of 15 queries per second from
       | bots.
       | 
       | 15 RPS is very far from an apocalypse.
        
         | [deleted]
        
         | marginalia_nu wrote:
         | It is if you're hosting an internet search engine on a PC.
        
           | [deleted]
        
           | golergka wrote:
           | Why would you do such a thing in the first place?
        
             | marginalia_nu wrote:
             | Because I want this search engine to exist, and I'm not a
             | multimillionaire so I can't afford better hardware.
             | 
             | See, when it comes to not being able to find stuff on
             | Google, you can either complain about it on the internet,
             | or you can build a search engine yourself that allows you
             | to find what you are looking for.
             | 
             | I chose the second option.
        
         | Avamander wrote:
          | It's bad if it's your dead-average WordPress site that has
          | 10 PHP workers, with each page load taking >1s. Easy DoS.
        
           | Aachen wrote:
           | Yeah but WordPress is an extreme example. Every time a WP
           | blog is posted to HN without a static-page-ifier (caching
           | layer that basically turns the dynamic pages into static
           | ones), it dies within minutes. Normal software doesn't seem
           | to have that problem.
           | 
            | I traced it once, and I have to admit there was no obvious
            | bottleneck (this was 2015 or so). Just millions upon millions
           | of calls into deeper and deeper layers for things like
           | translations or themes. Wrapping mysql_query in a function
           | that caches the result (to avoid doing identical queries)
           | helped a few % I think, but aside from major changes like
           | patching out the entire translation system for single-
           | language sites, I didn't spot an obvious way to fix it. You'd
           | need to spend a lot of time to optimize away the complexity
           | that grew from suiting a million different needs, contributed
           | by thousands of people across many years.
        
       | unixbane wrote:
        | > spam
        | 
        | Captchas were designed to solve this (and only this, as
        | opposed to requiring them merely to view content, like modern
        | ignorant web devs like to do [yes, I know some web devs now
        | require it to be able to make sure the people they're
        | datamining are real, but this is a new practice, from this
        | year basically]).
       | 
        | Public services should be implemented with decentralized
        | P2P. Static content is solved by IPFS, Freenet, etc. Dynamic
        | content can perhaps only be solved with smart contracts,
        | which would be less bad than Cloudflare if they weren't
        | expensive, as they still provide protocol conformance (unlike
        | Cloudflare, which requires your packets to look like a big-4
        | browser), anonymity (yeah, pseudonyms, but you can still make
        | one per query), etc. Without smart contracts, many
        | interactive applications are still possible.
       | 
       | > The other alternatives all suck to the extent of my knowledge,
       | they're either prohibitively convoluted, or web3 cryptocurrency
       | micro-transaction nonsense that while sure it would work, also
       | monetizes every single interaction in a way that is more
       | dystopian than the actual skull-crushing robot apocalypse.
       | 
        | Centralized web hosting is and always was unsustainable, and
        | this is the reason most web content is commercial garbage;
        | the problem will only get worse. My concern was always what
        | kind of garbage boomer protocol will become the new standard.
        | I sure as hell don't want something that looks like email,
        | the web, or UN*X.
        
       | ltr_ wrote:
        | tangential: Two weeks ago (and for a while before that), our
        | country's twitter-sphere (Chile) was completely and obviously
        | dominated by bots. They were starting and inflating trending
        | topics with absurd lies, spreading fear and chaos in favor of
        | "Rechazo" (the option against our new constitution in the
        | upcoming ballot), or acting as echo chambers for republican
        | and extreme-right-associated politicians. What happened? A
        | self-organized group[1] started doing data analysis of the
        | trending topics and delivering the results to the people,
        | showing who was behind the campaigns and the synthetic likes.
        | After this, prominent public figures from that sector started
        | to cut funding for the bot networks (because of the public
        | shaming and media attention they were receiving), and it is
        | so pathetic now: they can't even get more than 100 likes, and
        | often the most popular response is a refutation, or the very
        | same analysis showing the bot network at work, with
        | substantially more organic likes. I think it is a very
        | interesting phenomenon to watch. Note that this
        | lies/fear/chaos campaign is transversal, from rural AM radio
        | to tiktok, but it is not working at all. People are very
        | aware of these campaigns and know how to defend against them.
        | Truth is stronger than money.
       | 
       | - [1] https://twitter.com/BotCheckerCL
        
       | 12907835202 wrote:
       | For my forum with 500k users a month I just added a registration
       | captcha related to my niche. E.g. for a Dark Souls forum it would
       | say "what game is this forum about?" And if you got it wrong the
       | validation would include "tip it's just two words D _rk S*ls ".
       | This reduced spam by over 99% and didn't annoy people with
       | recaptcha.
       | 
       | If someone was unable to get past that captcha (it still happens
       | I have logs!) I figured they were probably not that valuable a
       | contributor anyway.
       | 
       | If someone wanted to target my site directly they could but
       | hasn't happened so far._
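        | 
        | A sketch of how cheap such a gate is to implement (the
        | question and accepted answers here are illustrative, not the
        | forum's actual ones):
        | 
        |   NICHE_QUESTION = "What game is this forum about?"
        |   ACCEPTED = {"dark souls", "darksouls"}
        | 
        |   def passes_niche_captcha(answer: str) -> bool:
        |       # normalise case and whitespace before comparing
        |       return " ".join(answer.lower().split()) in ACCEPTED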
        
         | ridgered4 wrote:
          | Reminds me of a guy who implemented a pre-screen on his
          | phone calls to stop spammers. He said he wanted to use
          | something simple at first and that he planned to tweak it
          | depending upon how many spammers got through. So in phase
          | one it just asked callers to "Dial 1 to continue". But that
          | was enough to stop all the spam calls, so he never had to
          | improve it.
        
           | imperialdrive wrote:
           | I did the same for my parents home phone. Completely stopped
           | all spam calls!
        
         | Lex-2008 wrote:
          | re: someone was unable to get past that captcha - this
          | reminded me of a story I heard back in ICQ times, about a
          | human who couldn't pass the anti-bot question: "What planet
          | do we live on?"
        
           | Suzuran wrote:
           | I remember a friend's con-group forums who had an issue along
           | these lines - the anti-bot question was "What is the
           | brightest thing in the sky at noon?" the expected answer was
           | "the sun", but some guy got stuck because he was answering
           | "Sol". Since they had an IRC channel the issue was relatively
           | quickly resolved, but it was an in-joke for some time.
        
             | bombcar wrote:
              | The key takeaway is that if you have a _second_ line
              | of communication, humans can use it but bots won't -
              | "Issues registering? Contact someemail or see us on
              | IRC/Discord" can do wonders.
        
           | gilrain wrote:
           | Fair enough... one can only speak for oneself, after all.
        
         | d3nj4l wrote:
         | A niche dark souls forum sounds interesting, any chance I could
         | get a link?
        
           | google234123 wrote:
           | You misread the post. That was just an example. A niche dark
           | souls forum wouldn't have 500k users lol.
        
       | EGreg wrote:
       | I will reiterate what I had been saying on HN for years:
       | 
       | 1) The problem is centralization. Yes DNS is federated but there
       | is a central registry. This means anyone can spam
       | you@yourdomain.com or visit your web server listening for HTTP
       | connections at www.domain.com
       | 
        | 2) DNS is a glorified search engine. Human-readable domain
        | names are only needed for dictating a domain name out loud
        | (and listeners often make mistakes anyway). They only map to
        | a small fraction of URLs, namely the ones with the "/" path.
        | For most others, the human readability adds little benefit.
       | 
       | 3) Start using URIs that are not human readable. The titles,
       | favicons and other metadata of resources should simply be cached,
       | and displayed to the user in their own bookmarks, search engines
       | or whatever. For Javascript environments, variables can easily
       | hold non human readable URIs. Also QR codes can resolve to non
       | human readable URIs.
       | 
       | 4) There may be some cookie policy for third party hostnames etc.
       | but just make them non human readable also.
       | 
       | 5) We should have DHT or other decentralized systems for routing,
       | and here is the key... in this system, you need a capability
       | issued by the website / mailbox owner in order for your message
       | to be routed to them. If the capability is compromised and used
       | to get a ton of SPAM, they simply revoke that specific capability
       | (key).
       | 
        | For HTTP websites you can already implement this on your
        | side by signing the keys / capabilities, i.e. session cookie
        | values, with an HMAC; there is no need to even do network I/O
        | to verify them, and you can upload the whitelist to the edges
        | and check them there easily.
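        | 
        | A minimal sketch of that HMAC-signed capability idea in
        | Python (names and the revocation set are illustrative):
        | 
        |   import hashlib, hmac, secrets
        | 
        |   SECRET = secrets.token_bytes(32)  # shared with the edges
        |   REVOKED = set()  # capabilities withdrawn after abuse
        | 
        |   def issue(subject: str) -> str:
        |       # the tag itself proves issuance; no DB lookup needed
        |       tag = hmac.new(SECRET, subject.encode(),
        |                      hashlib.sha256).hexdigest()
        |       return f"{subject}.{tag}"
        | 
        |   def verify(token: str) -> bool:
        |       subject, _, tag = token.rpartition(".")
        |       good = hmac.new(SECRET, subject.encode(),
        |                       hashlib.sha256).hexdigest()
        |       return (hmac.compare_digest(tag, good)
        |               and subject not in REVOKED)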
       | 
       | But going further, for new routing protocols, IP addresses should
       | be removed after the first hop in the DHT, because the global
       | routing system will send traffic there otherwise. See how SAFE
       | network does it.
       | 
        | 6) I don't need a "real names policy" or "blue checkmark". I
        | can know who "The Real Bill Gates (TM)" is through some
        | verified claims by Twitter or someone else. Just because I
        | have the email billgates@microsoft.com doesn't mean I should
        | be able to email him. There can be many Bill Gateses. The
        | names are just verified claims by some third party. Here on
        | HN we don't have names or photos, and it works just fine.
       | 
        | 7) Most of celebrity culture (paparazzi, Elon Musk and
        | Donald Trump moving markets and tweeting at 5am to 5 million
        | people at once) is a problem of centralization: both a
        | 1-to-many megaphone and a many-to-1 inbox. Citizens United is
        | just a symptom of the problem. I have spoken about this
        | (privately owning access to an audience) with Noam Chomsky in
        | an interview I did a year ago:
       | 
       | https://community.qbix.com/t/freedom-of-speech-and-capitalis...
       | 
       | Fox News (Rupert Murdoch), CNN (Ted Turner), Twitter (Elon or
       | Jack), Facebook (Zuck) are controlled by only a few people.
       | Channels on youtube, telegram, podcasts etc are controlled by a
       | few people. This leads to divisions in society, as outrage
       | clickbait rises to the top. Nonprofit models based on
       | collaboration like Wikipedia, Wikinews, Open Source and Science
       | produce far more balanced and benign information for the public.
       | 
       | In short we need alternatives to celebrity culture, DNS and other
       | systems that centralize decision making in the hands of a few, or
       | create firehoses and megaphones. Neither the celebrity nor the
       | public actually enjoy the results.
        
       | david_draco wrote:
       | Have a "CAPTCHA" that gives the IP reputation for some time
       | (cookie+IP=key), but instead of a CAPTCHA make the web page /
       | browser solve and submit a BOINC task from a randomly picked
       | science project. No user interaction needed, it has the benefits
       | of "paying by computation" of cryptocurrencies without the
       | tracing, and if bots solve the problem efficiently, it's good for
       | science.
        
         | GTP wrote:
          | But solving a BOINC task requires too much time, while the
          | average user rightfully expects a webpage to load within 5
          | seconds or so.
        
         | rapnie wrote:
          | That is a nice idea. So it's a bit similar to mCaptcha
          | [0], which uses a PoW algorithm, mentioned in another
          | comment [1] in the thread.
         | 
         | [0] https://mcaptcha.org/
         | 
         | [1] https://news.ycombinator.com/item?id=32339902
        
         | zakki wrote:
         | Can we make a bot to mine a cryptocurrency?
        
           | GTP wrote:
            | It's called a miner, and you can already install it on
            | your PC.
        
       | julianlam wrote:
       | I disagree with TFA's take on dealing with spam -- giving up!
       | 
        | For our app, we don't deal with spam in any novel way. We
        | use honeypots, SFS, and Akismet.
       | 
       | However, by far the easiest way to stop spammers is a post queue.
       | Lots of spammers will just create a burner account, fire off
       | their spam, and start over. Given no actual reputation, give them
       | the trust they deserve -- none.
       | 
        | The other factor is building out a _fast_ backend. Besides
        | benefiting your own users, it also means Googlebot or
        | Ahrefsbot won't absolutely cripple your site when they come
        | knocking. Sometimes that is doable, sometimes not.
        
         | marginalia_nu wrote:
         | (Author) I'm running a search engine though. Do you propose I
         | require users to register an account, and then not allow them
         | to search?
         | 
         | I think my backend is plenty fast given it's hosted on a PC off
         | domestic broadband. Most searches complete sub-100ms.
        
           | julianlam wrote:
           | Hey, thanks for the reply!
           | 
            | Specific scenarios require creative solutions. For a
            | search engine, how do you differentiate between robots
            | and legitimate users? It seems a rate-limiting step is
            | likely the best solution.
            | 
            | If the query rate from the same IP exceeds a threshold,
            | throttle them creatively: +100ms the next time, +250ms
            | the next, etc.
            | 
            | The upside is these bots will adjust their strategies to
            | hit your site slower, which is the whole point, isn't
            | it?
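            | 
            | A sketch of that escalating throttle (the penalty
            | schedule and in-memory store are illustrative):
            | 
            |   import time
            | 
            |   PENALTIES = [0.0, 0.1, 0.25, 0.5, 1.0, 2.0]  # seconds
            |   offenses = {}  # ip -> count; decay this periodically
            | 
            |   def throttle(ip: str) -> None:
            |       n = offenses.get(ip, 0)
            |       # sleep longer each time the same IP comes back
            |       time.sleep(PENALTIES[min(n, len(PENALTIES) - 1)])
            |       offenses[ip] = n + 1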
           | 
           | If they spread requests across IPs, perhaps try
           | fingerprinting. I'm not sure how effective that is on the
           | backend though.
        
       | Animats wrote:
       | What's hard to do now is host a lightly used but broadly
       | interesting service that doesn't require a login.
       | 
       | Although, surprisingly, I host such a service, and while it gets
       | a constant stream of random hits, they're a minor nuisance.
       | Probably because it's just the back end for a web page, and
       | nobody bothers to target it specifically. Random web browsing
       | won't find it, and the API will just return an error if called
       | incorrectly. Even if it is called correctly, it has fair queuing
       | on the service, so hammering on it from a small number of IP
       | addresses won't do much.
       | 
       | That did happen once. Someone from a university was making
       | requests at a high rate and not even reading the results. I
       | noticed after a month, and wrote to their department chair, which
       | stopped the problem.
        
         | closedloop129 wrote:
         | >What's hard to do now is host a lightly used but broadly
         | interesting service that doesn't require a login.
         | 
          | Which other broadly interesting services exist? The owners
          | of those services could come together and offer a VPN that
          | gets preferred treatment for these services. This could be
          | more precise than https://www.abuseipdb.com/.
        
         | Aachen wrote:
          | Same! I also got like 20 requests every second from a
          | university IP. I tried a few things to make it error out,
          | like returning 404, but no dice. In my case it was my own
          | fault, though: a page with a few lines of JS that
          | periodically checked for updates got into a crazy state (I
          | never found out how), and they didn't notice because it was
          | a remote desktop system where they had left the page open.
          | It went on for months but didn't impact my service (I just
          | noticed it in the access logs while looking for something
          | else), so I left it; when I remembered it again a few
          | months later, it was gone.
        
         | s1k3s wrote:
         | Yes, this is why I plan to take down my hobby projects. And
         | it's not only bots, real people do it as well. Apparently some
         | people have a passion for screwing up other people's work. Some
         | even email me afterwards asking for money to disclose a bug
         | they found.
        
       | [deleted]
        
       | FrenchDevRemote wrote:
       | Does anyone know how google/linkedin manage to block bots who are
       | using SSO?
       | 
        | Trying to log in to a linkedin account using a google
        | account from an automated browser (like
        | puppeteer+puppeteer-stealth or fakebrowser) will open a blank
        | white window instead of the normal google login window. It
        | could be a limitation of those libraries, but I doubt it; it
        | smells like something they detect. Maybe looking into it
        | might yield some interesting insights on how to limit modern
        | bots.
        
       | goatcode wrote:
       | >large resources causing bot spam
       | 
       | >large resources are the solution
       | 
       | To those who have been recently pondering the history of
       | antivirus companies of the 90s and 00s, and suspiciously
       | wondering how they were always able to so quickly come up with
       | definitions for the newest infections, this all feels so
       | familiar. What a sad world we live in, sometimes.
        
       | djohnston wrote:
       | I work in this space at a company you've heard of - even at our
       | scale and with our resources the proportionally larger attack
       | incentives mean we are constantly firefighting.
       | 
       | > The other alternatives all suck to the extent of my knowledge,
       | they're either prohibitively convoluted, or web3 cryptocurrency
       | micro-transaction nonsense that while sure it would work, also
       | monetizes every single interaction in a way that is more
       | dystopian than the actual skull-crushing robot apocalypse.
       | 
       | I understand the drawback here but I would like to see monetized
       | transactions employed as a defense layer a little more before we
       | make a final decision. It is undemocratic, to be sure, but maybe
       | for those of us who can afford it it's still better than the
       | cesspool we currently sift through on every major platform.
       | Anyone aware of any platforms taking this approach?
       | 
       | Maybe the fediverse will help - by fragmenting networks attackers
       | may have less incentive to attack a particular one.
        
         | reaperducer wrote:
         | _Anyone aware of any major platforms taking this approach?_
         | 
         | The Postal Service?
         | 
         | Sure, there's junk mail, but imagine how much junk mail there
         | would be if it were delivered for free. It wasn't until phone
         | calls became so cheap as to be "unlimited" that we ended up
         | flooded with billions of junk calls.
         | 
         | Microtransactions (non-crypto, thankyouverymuch) would solve a
         | certain number of today's problems.
        
           | marginalia_nu wrote:
           | I do think it would help, like even if a transaction cost
           | 0.05c, it would add up very quickly for a bot operator but
           | stay cheap for everyone else. But I think the problem is it
           | would inevitably introduce the need for a middle man, shaving
           | 0.01c off that 0.05c, with a dubious incentive to increase
           | the amount of money changing hands as much as possible. What
           | you've invented at that point is basically Cloudflare with
           | worse incentives.
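            | 
            | (For scale: at the 15 queries per second mentioned in the
            | article, 0.05c per query works out to roughly 1.3 million
            | queries and about $650 a day for a bot operator, while a
            | human doing 50 searches pays under 3 cents.)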
           | 
           | You get either that, or yucky defi web3 crap.
        
           | djohnston wrote:
           | Yes for sure, I have thought about making an email "stamp"
           | web-3 service that would implement this. I even wanted to
           | make some fun "pony express" animations whenever a letter was
           | arriving to your inbox.
        
           | kube-system wrote:
           | Let's finally implement HTTP 402
        
       | SmileyJames wrote:
        | I thought "A Plan for Spam" had solved this one?
        | http://www.paulgraham.com/spam.html
        | 
        | Has NLP progressed enough to render Paul's plan a failure?
       | 
       | Am I a bot? How about you? Does it matter if I make valuable
       | contributions?
        
         | greazy wrote:
         | Spam and bots eating traffic are two different things.
        
       | timmaxw wrote:
       | I wonder if proof-of-work would help. Suppose every form
       | submission requires an expensive calculation, calibrated to take
       | about 1 second on a typical modern computer/smartphone. For human
       | users, this happens in the background, although it makes the
       | website feel slower. But for bots, it dramatically limits how
       | many submissions each botnet host can make to random websites.
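        | 
        | A minimal hashcash-style sketch of the idea (in practice the
        | solver would run in browser JS/WASM; the difficulty here is
        | illustrative):
        | 
        |   import hashlib
        | 
        |   DIFFICULTY = 20  # leading zero bits; ~1M hashes on average
        | 
        |   def _ok(challenge: bytes, nonce: int) -> bool:
        |       h = hashlib.sha256(
        |           challenge + nonce.to_bytes(8, "big")).digest()
        |       return int.from_bytes(h, "big") >> (256 - DIFFICULTY) == 0
        | 
        |   def solve(challenge: bytes) -> int:  # client: expensive
        |       nonce = 0
        |       while not _ok(challenge, nonce):
        |           nonce += 1
        |       return nonce
        | 
        |   def verify(challenge: bytes, nonce: int) -> bool:
        |       return _ok(challenge, nonce)  # server: a single hash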
        
         | protoduction wrote:
          | I'm the co-founder of Friendly Captcha [0]; we have offered
          | a proof-of-work-based captcha for two years or so. Happy to
          | answer any questions.
          | 
          | A big part of what makes our captcha successful in fighting
          | abuse is that we scale the difficulty of the proof-of-work
          | puzzle based on the user's previous behavior and other
          | signals (e.g. minus points if their IP address is a known
          | datacenter IP).
          | 
          | The nice thing about a scaling PoW setup is that it's not
          | all-or-nothing, unlike other captchas. Most captchas can be
          | solved by "most" humans, but that means there is still some
          | subset of all humans that you are excluding. In our case,
          | if we do get it wrong and wrongly think the user is a bot,
          | the user may have to solve a puzzle for a while, but after
          | that they are accepted nonetheless.
         | 
         | [0]: https://friendlycaptcha.com
        
           | tmikaeld wrote:
           | While your service is of high quality, the pricing is
           | completely unreasonable for private use cases, many times
           | higher than hosting the site in the first place.
        
             | protoduction wrote:
             | I'm sorry to hear that. We offer free and small plans for
             | small use-cases, but I also understand that some projects
             | don't have a budget at all.
             | 
             | There is a blessed source-available version of the server
             | that you can self-host [0]. It is more limited in its
             | protection, but it is probably good enough for hobby
             | projects.
             | 
              | [0]: https://github.com/FriendlyCaptcha/friendly-lite-server
        
             | dj_mc_merlin wrote:
             | I think the it depends on what counts as a "request" in
             | terms of pricing. Is it only successful checks? Pricing
             | would be fine then. If it also includes failed checks then
             | there is no point in the service, including the Advanced
             | plan. Would eat through the entire credit in a day.
        
               | tmikaeld wrote:
               | If it was on successful validations, they would called it
               | so, no it's on every request, even failed ones.
        
         | tmikaeld wrote:
         | "mCaptcha uses SHA256 based proof-of-work(PoW) to rate limit
         | users."
         | 
         | https://github.com/mCaptcha/mCaptcha
        
           | cmjs wrote:
           | I'm curious whether this can actually be considered to be a
           | "CAPTCHA" in the true sense of the term. It doesn't seem to
           | be intended to "tell computers and humans apart", but rather
           | to force the client _computer_ (not the human user) to do
           | some work in order to slow down DOS attacks.
           | 
           | Of course slowing down DOS attacks is a great goal in itself,
           | and it's very often what captchas have been (ab)used for, but
           | it doesn't seem to me to replace all or most use cases for a
           | captcha. In particular, since it can be completed by an
           | automated system _at least_ as easily as by a human, it doesn
           | 't seem like it would limit spambot signups or spambot
           | comment or contact form submissions in any meaningful way.
           | 
           | Or am I misunderstanding, @realaravinth?
        
             | realaravinth wrote:
             | Thanks for the ping!
             | 
             | I used "captcha" to simplify mCaptcha's application,
             | calling it a captcha is much simpler to say than calling it
             | a PoW-powered rate limiter :D
             | 
             | That said, yes it doesn't do spambot form-abuse detection.
             | Bypassing captchas like hCaptcha and reCAPTCHA with
             | computer vision is difficult but its is stupid easy to do
             | it with services offered by CAPTCHA farms(employ humans to
             | solve captchas; available via API calls), which are
             | sometimes cheaper than what reCAPTCHA charges.
             | 
             | So IMHO, reCAPTCHA and hCaptcha are only making it
             | difficult for visitors to access web services without
             | hurting bots/spammers in any reasonable way.
        
               | cmjs wrote:
               | Thanks for the reply! That's basically what I thought
               | then - but as you say, traditional captchas are deeply
               | flawed and ineffective anyway, and I totally agree that
               | in many cases the cost to real users outweighs any
               | benefit. So I'm excited to see alternatives such as
               | mCaptcha popping up. It'll be interesting to see how it
               | works out for people in real-world use.
        
           | Aissen wrote:
            | How does that work without becoming a SPOF for taking
            | down the website? Can't a user/botnet with more CPU power
            | than the server simply send more captchas than can be
            | processed?
            | 
            | In addition, using sha256 for this is IMHO a mistake,
            | calling for ASIC abuse.
        
             | realaravinth wrote:
              | > How does that work without becoming a SPOF for taking
              | down the website? Can't a user/botnet with more CPU
              | power than the server simply send more captchas than
              | can be processed?
              | 
              | Glad you asked! This is theoretically possible, but the
              | adversary would have to be highly motivated, with
              | considerable resources, to choke mCaptcha.
              | 
              | For instance, to generate a Proof of Work (PoW), the
              | client has to generate 50k hashes (this can be
              | configured for higher difficulty), whereas the mCaptcha
              | server only has to compute 1 hash to validate the PoW.
              | So a really powerful adversary can overwhelm mCaptcha,
              | but at that point there's very little any service can
              | do :D
             | 
              | > In addition, using sha256 for this is IMHO a mistake,
              | calling for ASIC abuse.
              | 
              | Good point! Codeberg raised the same issue before they
              | decided to try mCaptcha. There are protections against
              | ASIC abuse: each captcha challenge has a lifetime, and
              | variable difficulty scaling is implemented, which
              | increases difficulty when abuse is detected.
              | 
              | That said, the project is in alpha; I'm willing to wait
              | and see if ASIC abuse is prevalent before moving to
              | more resource-intensive hashing algorithms like Scrypt.
              | Any algorithm we choose will also impact legitimate
              | visitors, so it'll have to be done with care. :)
        
               | Aissen wrote:
               | > the client will have to generate 50k hashes(can be
               | configured for higher difficulty)
               | 
               | I completely forgot how PoW worked, it's clearer now. You
               | should probably add that this is a probabilistic average,
               | so people will have to be ready for much longer (and
               | faster) resolutions.
               | 
                | With what you said, an adversary can probably just
                | DoS mCaptcha without any computation if verification
                | is stateless (by sending garbage at line rate); if it
                | is stateful (e.g. a CSRF token), you'll have to do a
                | cache query, which is probably on the same order of
                | magnitude as a single hash.
        
           | realaravinth wrote:
           | Hello!
           | 
           | I'm the author of mCaptcha, I'd be happy to answer any
           | questions that people might have :)
        
             | titaniczero wrote:
              | It looks great. As a suggestion: instead of an easy
              | mode and an advanced one, I would use a single mode
              | with a calculator; that way it is more transparent to
              | the user, and it would make the process of learning the
              | advanced mode and concepts easier.
              | 
              | Also, here: https://mcaptcha.org/, under the "Defend
              | like Castles" section, I think you meant "expensive",
              | not "experience".
             | 
             | Keep up the good work!
        
               | realaravinth wrote:
               | Thank you for the kind words!
               | 
                | > Instead of an easy mode and an advanced one, I
                | would use a single mode with a calculator; that way
                | it is more transparent to the user, and it would make
                | the process of learning the advanced mode and
                | concepts easier.
               | 
               | Makes sense, I'll definitely think about it. The
               | dashboard UX needs polishing and this is certainly one
               | area where it can be improved.
               | 
               | > Also, here: https://mcaptcha.org/, under the "Defend
               | like Castles" section, I think you meant "expensive", not
               | "experience".
               | 
                | Fixed! There are a bunch of other typos on the
                | website too; I couldn't type even if my life depended
                | on it :D
        
             | luckylion wrote:
             | The results of the PoW are just thrown away, right? I
             | wonder if you could couple that with something useful, e.g.
             | what SETI@home used to do, but the intentionally small size
             | of the work probably makes it difficult to be useful.
        
               | realaravinth wrote:
               | I'd love to do something useful with the PoW result but
               | like you say, the PoW should be able to work in browsers,
               | so they are intentionally small.
               | 
               | The maximum advisable delay is ~10s but even then it
               | might not be enough for it to be useful.
        
             | rapnie wrote:
             | See also dedicated submission at:
             | https://news.ycombinator.com/item?id=32340305
        
           | timmaxw wrote:
           | Nice! Yeah, mCaptcha looks like just what I had in mind.
           | 
           | I wonder why this approach hasn't been widely adopted?
        
             | tmikaeld wrote:
              | Probably due to "PoW" being power-hungry; but that's
              | largely false, because you only apply PoW here to users
              | that are abusing the system.
              | 
              | Allowing abusers to abuse freely would cost even more
              | power than just forcing them to do the work.
        
             | rapnie wrote:
             | mCaptcha is in the process of being adopted in Gitea and
             | Codeberg. See recent Fediverse post from the project
             | account: https://gts.batsense.net/@mcaptcha/statuses/01G9KR
             | BRC8CRC9M3...
        
             | realaravinth wrote:
              | The project is very new; I haven't started promoting it
              | yet. The Codeberg development was purely word of mouth
              | :)
             | 
             | disclosure: I'm the author of mCaptcha
        
         | the8472 wrote:
         | This is an old idea known as hashcash.
         | https://en.wikipedia.org/wiki/Hashcash
         | 
          | Newer variations (such as argon2) are tunable, so you can
          | include memory footprint and CPU-parallelism. There are
          | also time-lock puzzles and verifiable delay functions that
          | negate any parallelism, because there's a single answer
          | which can't be arrived at sooner by throwing more cores at
          | the problem.
        
         | zozbot234 wrote:
         | For small scale self-hosted forums, bespoke CAPTCHA questions
         | can work quite well in practice. Make it weird enough and it
         | just isn't worth that much for malicious users to break, while
         | most humans can pass easily. Spammers benefit from volume.
        
           | rapnie wrote:
           | > most humans can pass easily
           | 
           | Beware when choosing a CAPTCHA that serving "most humans"
           | _might_ exclude those with accessibility issues, like the
           | visually impaired.
        
       | rrwo wrote:
       | I run a website for a small company. The site has been around
       | since the mid-1990s, and bots are a minor annoyance, but not a
       | problem.
       | 
       | We also use some simple heuristics to reject obvious bot traffic.
       | 
       | One of the simplest is to have a form field that is hidden via
       | CSS. Humans don't see it and it stays blank. Bots fill it in.
       | 
        | Bots tend to fill in every form field with random garbage,
        | even checkboxes. Validating checkbox values, rather than just
        | checking that they have a value, is another good way to
        | detect bots.
       | 
       | Many bots have a hard time with CSRF tokens in hidden fields.
       | 
       | Many bots also don't handle session cookies properly. If someone
       | submits a registration form without an existing session, we
       | reject it. (So we don't get as far as checking the CSRF token.)
       | 
       | After a certain number of failed attempts to register or login,
       | we block the IP for a period of time.
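        | 
        | A minimal sketch of the first two checks (the field names
        | are hypothetical):
        | 
        |   def looks_like_bot(form: dict) -> bool:
        |       # honeypot: hidden via CSS, so humans leave it empty
        |       if form.get("website", ""):
        |           return True
        |       # a real browser submits a checkbox's declared value,
        |       # or omits the key; random garbage means a bot
        |       if form.get("subscribe") not in (None, "yes"):
        |           return True
        |       return False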
        
       | lifeisstillgood wrote:
        | I would suggest that bots are not actually the underlying
        | problem. For most things I would like a bot acting for me:
        | telling me, as and when, that I need to visit the dentist,
        | who has slots free next Wednesday and Friday; Friday is best
        | because I am also WFH that day. The bot apocalypse is only an
        | apocalypse because we are trying to make a "web for humans",
        | when actually a "web for bots, and a bot for a human" is a
        | much better idea :/)
       | 
        | We need to redesign the web based on APIs, certificates,
        | rate limits, etc., and stop having "engagement" as a goal and
        | have "getting things done" as the goal instead.
       | 
       | Edit: mucked up formatting
        
       | IfOnlyYouKnew wrote:
       | So there's one service keeping this search engine online, and
       | it's probably doing it for free, and the author can't even think
       | of a better way to do it.
       | 
       | Yet Cloudflare still gets two paragraphs of complaints in the
       | face? Because the author wants to "own" something instead of
       | "renting"?
        
         | marginalia_nu wrote:
         | I'm doing it for free because I don't want this to be a
         | commercial service. I get that HN is startup city, but I'm not
         | running a startup, it's just a hobby.
        
       | SahAssar wrote:
       | > There has been upwards of 15 queries per second from bots.
       | There is just no way to deal with that sort of traffic, barely
       | even to reject it.
       | 
       | I don't really understand, is that a lot? 15qps does not sound
       | like a lot, especially for a blocking/rejection function.
        
         | marginalia_nu wrote:
         | It's 15 search queries per second, not requests per second. RPS
         | is usually 10-20x higher.
        
           | SahAssar wrote:
           | But you said "barely even to reject it", rejecting 15 QPS
           | should not be heavy on any resource, right? Or is the actual
           | problem identifying the bot traffic?
        
       | Xeoncross wrote:
       | I have considered skipping the regular fingerprinting,
       | geolocation, captcha, hashcash, email verification, payment
       | required, etc... mitigations and instead requiring people to drop
       | into a public chat room (or pm/chat the support team) to have
       | their account activated.
       | 
       | The number of languages supported would be small to match whoever
       | helped moderate this, but it would at least require speaking to
       | someone. A PM thread or live chat would be an instant way to find
       | out if someone can string two sentences together and might be
       | worth allowing into the site. You could even have them create an
       | account and solve a single captcha prior to getting access to the
       | chat.
       | 
        | It's not perfect by any stretch, but it might be worth
        | exploring having humans verify humans.
        
       | pphysch wrote:
       | The only real solution to the abuse of anonymous protocols is to
       | stop using anonymous protocols and use protocols where clients
       | can be held accountable. But that's politically nonviable in the
       | West.
        
         | JZerf wrote:
          | I don't see any good reason why people can't be allowed to
          | remain anonymous while still allowing website operators to
          | take measures to stop bot abuse. CAPTCHAs can already stop
          | many bots. Other commenters have also mentioned that things
          | like proof-of-work systems and micro-transactions could
          | stop bot abuse. These don't necessarily require giving up
          | anonymity.
        
           | pphysch wrote:
           | It's not just bots, it's troll farms as well which are "real"
           | people destroying public discourse in bad faith.
        
             | JZerf wrote:
             | Website operators could still take measures to stop abuse
             | from troll farms as well while still allowing people to
             | remain anonymous. A website operator like Twitter for
             | instance could perhaps require users to make a small micro-
             | transaction before allowing someone to make a post. Some
             | equilibrium for the cost of a post could probably be found
             | where most legitimate users would still be willing to pay
             | that cost but most troll farms would not.
        
               | pphysch wrote:
               | Are you serious? The problematic troll farms are the ones
               | backed by states and multinational corporations. Gating
               | speech behind money only makes the problem worse.
               | 
               | The correct approach is to deanonymize reasonably
               | "public" online behavior. This is the only way to hold
               | abusers accountable, and, ironically, democratize free
               | speech.
               | 
               | 1 person, 1 voice.
               | 
               | Not 1 rich person, 100 troll accounts.
        
               | JZerf wrote:
               | Yeah, I'm serious. Even with Twitter, for example,
               | currently allowing accounts to be created and posts to be
               | made for essentially free, real accounts and posts still
               | outnumber those of troll farms and bots from what I've
               | seen. If those troll farms and bots actually had to pay,
               | I imagine there would be far less. I also imagine that
               | those troll farms and bots are less influential than real
               | people. I believe that the endgame is that if a website
               | operator takes enough measures to stop troll farms and
               | bots, the operators of those troll farms and bots will
               | eventually run out of resources and be forced to curtail
               | their activity.
               | 
               | You're right that gating speech behind money could
               | potentially be bad and make problems worse but I only
               | offered that as one suggestion. Instead of or in addition
               | to using money, you could perhaps make a system that uses
               | some type of karma/reputation for instance. Those could
               | still be done anonymously.
        
       | z3t4 wrote:
        | My trick is to have one field that should always be blank
        | and one field that should always have a value; this stops all
        | automated bots. No "CAPTCHA" needed.
        
       | paulmd wrote:
       | > They're a major part in killing off web forums, and a
       | significant wet blanket on any sort of fun internet creativity or
       | experimentation.
       | 
       | > The only ones that can survive the robot apocalypse is large
       | web services. Your reddits, and facebooks, and twitters, and
       | SaaS-comment fields, and discords. They have the economies of
       | scale to develop viable countermeasures, to hire teams of people
       | to work on the problem full time and maybe at least keep up with
       | the ever evolving bots.
       | 
        | This is not true at all. There are web forums that are not
        | "web-scale" and don't spend all day fighting bot spam. The
        | solution is real simple: it costs 10 bux to register an
        | account; if you're a nuisance, your account is banned and you
        | pay 10 bux to get back on.
       | 
        | Even the sites that don't require payment for registration
        | itself often succeed by gating functionality or content
        | behind paywalls. Requiring a "premium membership" to post in
        | the classifieds forum is an extremely common thing on small
        | interest-based web-boards (photrio, pentaxforums,
        | homebrewtalk, etc). That income supports the site and the
        | anti-bot efforts as a whole. The customer isn't advertisers -
        | it's the community itself, and you're providing the _service_
        | of high-quality content and access to people with similar
        | interests.
       | 
       | You need to bootstrap a community first, of course, but it
       | doesn't need to be a large community, just a high-value one.
       | 
        | The twitters and facebooks of the world just don't like that
        | solution because they value growth above all other
        | considerations. They'd rather be kings of a billion-user
        | website with 200 million bots than of a 1k-100k user forum
        | with 100% organic membership and content. And they value
        | engagement over content quality, which is the entire reason
        | comment-tree/vote-based systems have been pushed heavily over
        | web-1.0 threaded forum discussions as well.
       | 
        | This botpocalypse is the inevitable outcome _of the systems
        | that social-media giants have created_, not an inherent
        | outcome of the internet as a whole.
        
         | NickRandom wrote:
         | > The solution is real simple
         | 
          | Uhhmmm, I beg to differ, and so do a lot of very smart
          | people with many more servers and users than you or I are
          | likely to see.
          | 
          | As with most 'Oh, it's Simple - Just Do XYZ' solutions,
          | there are often very good reasons for not doing the
          | 'Easy/Simple/One-Liner', and here are a few with yours -
         | 
          | Firstly - The '10 bux' could exclude a vast swathe of the
          | poorest. Skipping a couple of Starbucks coffees vs. the
          | local currency equivalent of whatever you are charging
          | equating to a month's worth of food or being able to send
          | at least one of your children to the local village school.
          | I mean - it's your forum / site, so you can gate it any way
          | you wish; I'm just pointing out that it could and would be
          | exclusionary (perhaps unintentionally so).
         | 
         | Next Problem: Accepting and Processing the 'Good Behavior'
         | deposit. Congratulations, you now need to become a Payment
         | Processor and as such have certain legal requirements regarding
         | payment details and storage and also tax returns. 'Oh, just
         | Off-Load it to Stripe' someone might suggest. Do-able I guess
         | but anyone who has taken payments over the internet will tell
         | you that it's a Royal Pain in The Ass. Also, now all a 'Griefer'
         | needs to do is run a few dodgy cards through your registration
         | system and 'Poof' there goes your payment processor and/or the
         | fees go sky high.
         | 
         | Most 'oh it's simple - why don't they just...' suggestions
         | overlook (or are not aware of) the many, many good reasons
         | why greater minds than yours or mine haven't already
         | implemented it.
         | 
         | Sure - sometimes people do come up with novel solutions to old
         | problems, so there's no harm in spit-balling, and I'm certainly
         | not directing any scorn or ill-intent in my reply.
        
           | Kalium wrote:
           | > Firstly - The '10 bux' could exclude a vast swathe of the
           | poorest. Skipping a couple of Starbucks coffees vs. the
           | local currency equivalent of whatever you are charging
           | equating to a month's worth of food or being able to send
           | at least one of your children to the local village school.
           | I mean - it's your forum / site so you can gate it any way
           | you wish, I'm just pointing out that it could and would be
           | exclusionary (perhaps unintentionally so).
           | 
           | It _is_ intentionally exclusionary. Not necessarily of the
           | poorest among us, but of those who expect free service.
           | Botters and spammers are disproportionately likely to look
           | for free service. Pretty much any level of required spending
           | in any currency will have a similar effect. By cutting off
           | the abuse-prone free tier that many bad actors depend on, you
           | dramatically decrease your exposure to abuse.
           | 
           | The point is not to keep out the poor people. The point is to
           | make it far more work to get over the hurdle than it's worth
           | for abusers. If you have a way to do the latter without the
           | former that doesn't hinge on pushing a bunch of extra work
           | onto administrators, I suspect quite a lot of people would be
           | very curious to hear about it.
        
           | [deleted]
        
           | gnome_chomsky wrote:
           | > Most 'oh it's simple - why don't they just...'
           | suggestions overlook (or are not aware of) the many, many
           | good reasons why greater minds than yours or mine haven't
           | already implemented it.
           | 
           | The forums they are referring to have been operating with the
           | "10 bux" model implemented on top of a highly customized
           | version of vBulletin for over 20 years.
        
         | tablespoon wrote:
         | > This is not true at all. There are web forums that are not
         | "web-scale" and don't spend all day fighting bot spam. The
         | solution is real simple: it costs 10 bux to register an
         | account, if you're a nuisance your account is banned and you
         | pay 10bux to get back on.
         | 
         | That doesn't work at all unless your service is already pretty
         | popular. Who would pay $5 to access a new, empty forum?
         | 
         | You mention "you need to bootstrap a community first," but
         | that's basically an admission that this solution doesn't solve
         | the problem at all, because you have to solve the problem in
         | some other way to use this solution. 10bux was a solution
         | limited to a _very specific time_.
        
           | ineptech wrote:
           | Step 1: Plant a tree twenty years ago...
        
         | sircastor wrote:
         | >The solution is real simple: it costs 10 bux to register an
         | account, if you're a nuisance your account is banned and you
         | pay 10bux to get back on.
         | 
         | Many years ago there was a public server called SDF (Super
         | Dimensional Fortress). It was a BSD system and anyone could get
         | a user account for $1. The theory was even the least of us, a
         | kid scrounging for money on the street, could come up with a
         | dollar (and presumably the postage to mail it). To a certain
         | person, access to this kind of server was invaluable - the only
         | situation you could hope to get close to this kind of system.
         | As time went on, the number of people interested in this was
         | dwindling.
         | 
         | Jumping through hoops is a useful gateway, but if your hoops
         | are too complex or arduous, you miss out on people who you
         | genuinely want to include in your community.
        
           | bayindirh wrote:
           | SDF is still alive and kicking, though. I have an account
           | with them.
        
             | [deleted]
        
             | [deleted]
        
           | RalfWausE wrote:
           | >As time went on, the number of people interested in this was
           | dwindling.
           | 
           | I don't think so...
           | 
           | First and foremost, SDF is alive and well, and there is a
           | constant stream of people registering on it...
        
           | Nextgrid wrote:
           | > As time went on, the number of people interested in this
           | was dwindling.
           | 
           | That's mostly because access to computers has become easier -
           | you can either get your own Linux box or get a proper VPS for
           | extremely cheap (if not free - see cloud provider free tiers)
           | nowadays so why bother with a non-root account on a _BSD_
           | system?
           | 
           | IMO it doesn't have anything to do with the barrier to entry.
        
           | paulmd wrote:
           | > Many years ago there was a public server called SDF (Super
           | Dimensional Fortress). It was a BSD system and anyone could
           | get a user account for $1. The theory was even the least of
           | us, a kid scrounging for money on the street, could come up
           | with a dollar (and presumably the postage to mail it). To a
           | certain person, access to this kind of server was invaluable
           | - the only situation you could hope to get close to this kind
           | of system. As time went on, the number of people interested
           | in this was dwindling.
           | 
           | SDF is still around though, and still operates on the same
           | model - pay once to get in, and you can stay as long as you
           | want unless you become a nuisance.
           | 
           | > Jumping through hoops is a useful gateway, but if your
           | hoops are too complex or arduous, you miss out on people who
           | you genuinely want to include in your community.
           | 
           | It is certainly not impossible for communities with this
           | model to die - that's not what I'm saying at all. Small
           | social media sites die all the time, including with Reddit-
           | style gamification bullshit. Or they turn into cesspits like
           | Digg or Voat.
           | 
           | But yes, increasing the friction of engagement is literally
           | the point, you are losing some users but increasing the
           | quality of the ones who remain. It's the old "fire your bad
           | customers" routine, but for social media.
           | 
           | "Oh no, we are all losing out on your valuable shitposting,
           | how will this community ever go on?"
        
             | Kye wrote:
             | They even have a Mastodon instance.
             | 
             | https://mastodon.sdf.org/about
        
           | ticviking wrote:
           | SDF is still around. I recently recovered my account and
           | spent a lovely afternoon hanging out on their chat
        
         | varispeed wrote:
         | > There are web forums that are not "web-scale" and don't spend
         | all day fighting bot spam. The solution is real simple: it
         | costs 10 bux to register an account, if you're a nuisance your
         | account is banned and you pay 10bux to get back on.
         | 
         | There is a big problem with all sorts of activists though.
         | Their modus operandi is finding forums they don't like, then
         | posting illegal material and reporting it to the hosting
         | provider. Many forums have stopped accepting new users
         | because of that; for some, the only way to sign up is to
         | find the owner and speak to them directly.
        
         | paulmd wrote:
         | A few edits I wanted to make but couldn't while HN was down:
         | this comes down to a question of intentions, right? Like are
         | you trying to _build a high-value community_, or are you
         | trying to make a
         | billion-dollar company? Photrio or Pentaxforums is never going
         | to sell for a billion dollars like Reddit, and that's not the
         | kind of community that Reddit is trying to build.
         | 
         | The highly-chaotic multithreaded model of Reddit/HN/etc is
         | directly _designed_ to be impenetrable and chaotic, where
         | everyone is just responding to everyone rather than having a
         | "flow of conversation" in which everyone is involved. The
         | "everyone responding to everyone" is literally engagement, and
         | that's what those sites/companies want to drive, not community-
         | building. It's _designed_ to suck as a medium for serious
         | discourse, because making 27 slight variations on the same
         | response to 27 different comments keeps you on the site.
         | 
         | As long as we persist in having engagement be the primary
         | metric, that's what you will get, and as long as we persist in
         | the idea that the objective of social media needs to be making
         | a couple people into billionaires, engagement is going to be
         | the focus.
         | 
         | And again, it's a fundamental shift in "who the customers are".
         | Are the customers the people using the site, who want a great
         | place to discuss the nuances of parrot taxonomy, or are your
         | customers the advertisers? Those lead to different ways you
         | build the community.
         | 
         | And you can still make a six-figure income being a webmaster of
         | a smaller community too. You're just not going to make Reddit
         | money off Pentaxforums.
        
           | closewith wrote:
           | > The highly-chaotic multithreaded model of Reddit/HN/etc is
           | directly designed to be impenetrable and chaotic, where
           | everyone is just responding to everyone rather than having a
           | "flow of conversation" in which everyone is involved.
           | 
           | I couldn't disagree more with this characterisation. Part of
           | the reason that sites like Reddit and HN are preferred to
           | traditional fora (which have their own engagement mechanisms)
           | is because it's possible to have a different set of
           | discussions on a topic.
           | 
           | Single-threaded fora result in many conversations on various
           | sub-topics and of varying quality being multiplexed in a
           | single chaotic and incomprehensible comment chain. Comment
           | trees allow sub-topics and sub-conversations to be grouped in a
           | reasonable fashion, and voting allows junk contributions to
           | be pushed to the bottom.
           | 
           | It's not just webscale entrepreneurs - users love comment
           | trees. It's part of why sites like Reddit and HN succeeded in
           | gaining traction, and why subreddits are now the de facto
           | replacement for fora (unfortunately, as it centralises
           | editorial power).
        
             | ori_b wrote:
             | It's something borrowed directly from email, at least in
             | traditional clients, and it works.
        
             | Macha wrote:
             | This wasn't always true - the first web forums were
             | threaded more often than not, and even vbulletin supported
             | a threaded mode as a user preference well into the 00s
             | (though as it was no longer the default, others were not
             | using linked replies, which hurt its usefulness at that
             | point). This is probably related to imitating the way
             | mailing lists worked, as some of these early UIs were like
             | the older style of mailing list presentation with a few
             | more forms attached.
             | 
             | The flat forums were considered an innovation for a while
             | because "normal users will never understand this nerdy
             | threading model". Arguably to some extent they were right,
             | as mass market products like youtube, facebook, etc. still
             | limit to one level of replies.
             | 
             | The real innovation of the social news sites was the voting
             | and scoring algorithms, which made it manageable by
             | presenting users with the most popular subthreads first,
             | rather than the chronological order the forums had used.
             | And gaining those points had a kind of skinner box effect
             | on keeping users hooked on the sites, which helped their
             | growth too - especially when points used to have a much
             | more prominent display.
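              | 
              | For concreteness, one widely circulated approximation
              | of this kind of scoring (not necessarily what any given
              | site actually ships) divides points by an age penalty:
              | 
              |     def rank(points, age_hours, gravity=1.8):
              |         # Higher gravity sinks older items faster
              |         return (points - 1) / (age_hours + 2) ** gravity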
        
             | mjr00 wrote:
             | > I couldn't disagree more with this characterisation. Part
             | of the reason that sites like Reddit and HN are preferred
             | to traditional fora (which have their own engagement
             | mechanisms) is because it's possible to have a different
             | set of discussions on a topic.
             | 
             | Preferred by whom, and in what contexts?
             | 
             | The Reddit/HN threaded comment styles work well in
             | scenarios that are relatively high-traffic, ephemeral, and
             | mostly anonymous, in the sense that you generally don't
             | notice or care about the username of people with whom
             | you're having a conversation. You're right that it makes it
             | easier to comprehend because you can just read from the
             | parent downwards to get the full context, and that's rarely
             | going to get to double digits.
             | 
             | But this has a lot of drawbacks. It's a method of
             | communication that's built for a burst of posts on a topic
             | for a day or so, then effectively archived as people move
             | onto the next topic.
             | 
             | This doesn't mean that users prefer this method of
             | communication universally, though. Even though forums are
             | dying, Discord continues to grow and offer smaller
              | communities that actually _have_ a "community" aspect to
             | them. Depends on the server, but I pretty much never see
             | extensive use of threads in Discord, if at all; most people
             | are happy to have a constant flow of conversation in one of
             | the channels. It's closer to how people have conversations
             | in person.
        
           | GTP wrote:
           | You're right that requiring a small fee to register can
           | solve the problem for small communities (and so OP is
           | wrong in the paragraphs that you quoted). Still, this
           | solution doesn't work for OP's case, as it requires a
           | login, and he says that he doesn't want to know who is
           | using his search engine (I guess for privacy reasons); in
           | that case the bots can indeed harm the proliferation of
           | this kind of small-scale service.
        
         | Tehdasi wrote:
         | > And they value engagement over content quality, which is the
         | entire reason comment-tree/vote-based systems have been pushed
         | heavily over web-1.0 threaded forum discussions as well.
         | 
         | While they certainly do value engagement over quality, I
         | suspect the systems are put in place because moderation
         | doesn't scale in terms of manpower, and they don't trust
         | their users to formalize the structure of the site.
        
         | Nextgrid wrote:
         | There are (at least) 2 kinds of spam - "technical" spam such as
         | bots hammering the web service with requests and consuming
         | resources, and the commonly-accepted definition of spam where
         | bots post promotional or other obnoxious content.
         | 
         | I feel like the article here talks more about the first kind. I
         | do agree with your solution for the second kind of spam though.
        
           | remram wrote:
           | > bots hammering the web service with requests and consuming
           | resources
           | 
           | I've never seen this referred as "spam". Denial of service,
           | botting, scraping, sure, but does anyone call that spam?
        
             | Nextgrid wrote:
             | The author seems to be referring to this as "spam" so I've
             | reused their definition. In general I agree, what the
             | author is experiencing is more akin to a DoS than a spam
             | attack.
        
             | tpoacher wrote:
             | It's spam from a server owner's point of view in the
             | broader sense, in that it is "junk requests" instead of
              | legitimate requests; they can be sent as a flood at no cost
             | or consequence to the senders, and it's up to you as the
             | recipient to find a way to filter it all to separate the
             | wheat from the chaff.
             | 
             | It's certainly not denial of service, that means something
             | far more specific.
             | 
             | One _could_ call it  "scraping", but I'd argue the meaning
             | / emphasis is different (it'd be like describing trolling
             | as 'typing').
             | 
             | And "botting" is not a word. :)
        
               | dtgriscom wrote:
               | > And "botting" is not a word. :)
               | 
               | It is now! OED, here we come.
        
               | bambax wrote:
               | But supposing it's not purely malicious, what's the
               | benefit to the spammer?
        
         | seydor wrote:
         | Alternatively, allow $0 signups but approve every new account.
         | It's rather easy to spot spam signups
        
         | gostsamo wrote:
         | A bit below, the author talks about how creating any form of
         | account runs counter to their goal of staying free and
         | anonymous, so they are looking only for a behavioral sieve
         | for bots.
        
           | tlholaday wrote:
           | Anonymous proof of stake?
        
         | jart wrote:
         | > it costs 10 bux to register an account, if you're a nuisance
         | your account is banned and you pay 10bux to get back on.
         | 
         | You've highlighted its biggest tradeoff which is that it
         | creates an economic incentive to ban people. The only way to
         | make more money, is to have more rules and culture for
         | ostracizing people. It would have been smarter of Something
         | Awful (since that's the site we're talking about) to charge
         | $4/month or something.
        
           | bluedino wrote:
           | You used to have to really, really try to get banned or even
           | probated on SA, now it doesn't take much at all.
        
             | eropple wrote:
             | I can't remember the last time I saw a ban on SA that
             | wasn't after a string of "stop trashing discussion" probes
             | --sometimes in the dozens--or wasn't some flavor of FYAD-
             | escapee bigot or death-threat-spewing weirdo. Even
             | _extremely_ tedious, thread-killing arguers will often be
              | left alone unless an IK is ignored when they say "drop it"
             | or whatever, and that's usually just a sixer.
             | 
             | There _are_ people who get really mad that they eat probes
             | for, say, misgendering trans people, and I for one would
             | like those people to be madder still. And preferably no
             | longer on the site.
        
           | coldpie wrote:
           | Nah, there are other ways to raise money. The forums have you
           | pay for avatar changes (or to change others' avatars! which
           | is a fun chunk of the forums culture), there are various
           | upgrades like no-ads and unlocking private messaging and
           | stuff. And the forums now have a Patreon, too.
        
           | paulmd wrote:
           | > It would have been smarter of Something Awful (since that's
           | the site we're talking about) to charge $4/month or
           | something.
           | 
           | The innovative thing of that model is driving the revenue
           | off the misbehavers instead of good citizens. You don't want
           | to have the shitters around _even if they are paying $4
           | /month_, and you don't want to drive off good-faith users
           | even if they're mediocre/hapless/etc. So run the site off the
           | backs of the people you don't want to have around.
           | 
           | People don't like paying monthly (this is even true of, say,
           | app store revenue today) and if you apply recurring charges
           | then when people don't think they're getting enough value
           | they'll leave. You have a hard enough time on the user-
           | acquisition side, why make it worse on the retention side by
           | driving away the users who you're actually trying to keep?
           | 
           | Billing good citizens works in some situations where you have
           | some specific value that you provide to them - providing
           | sales listings on classifieds boards inside interest-specific
           | forums is a good example, since you are providing access to
           | interested buyers, which is a value-add, same as ebay taking
           | their fee - but just in terms of operating a forum, you
           | aren't a big enough value-add that people are going to pay
           | Netflix-level subscriptions to the Parrot-Ass Discussion
           | Club. You need the users more than they need you at that
           | point. But a one-time fee is viewed much differently by
           | people. People will pay $5 for an app, they aren't going to
           | pay you $5 a month for it though - or at least far fewer
           | will.
        
       | echelon wrote:
       | I thought this article was referring to the upcoming deluge of
       | GPT-3/DALL-E bots that will eventually flood all of online
       | discourse. And whatever future models that will be even more
       | indistinguishable from people - perhaps even ones that are good
       | at "signup flow".
       | 
       | That's going to be way worse for humanity than spiders and
       | automated scripts sending too much traffic. This article isn't
       | imagining _apocalypse_ creatively enough.
        
         | api wrote:
         | We're coming up on the end of open forums and open social
         | media. Everything will require intrusive verification.
         | Anonymous forums could exist but they'll require something else
         | like an anonymous payment, a ton of proof of work on your local
         | machine, etc. to filter out crap.
        
         | Avamander wrote:
         | We're certainly heading towards a scenario where internet abuse
         | (due to poor regulation against it, IMHO, it's digital
         | pollution) becomes enough of a nuisance to require increasingly
         | intrusive verification.
         | 
         | Though we can all work against that by securing our own
         | systems and preventing them from being abused. Used and
         | unused domains should have a strict SPF policy, website
         | registration (and newsletter signup) forms should have
         | captchas, comments should have captchas. WordPress and other
         | CMS plugins should be kept up to date, and so on. Work on
         | requiring 3DS everywhere; defend everything in depth.
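         | 
         | For example, a domain that sends no mail at all can publish
         | a deny-all SPF record so nobody can forge mail from it:
         | 
         |     example.com.  IN TXT  "v=spf1 -all"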
         | 
         | That way malicious actors would be limited to the services they
         | pay for and that makes their life significantly harder.
        
       | viraptor wrote:
       | > If Marginalia Search didn't use Cloudflare, it couldn't serve
       | traffic.
       | 
       | Cloudflare is not the only CDN/protection. It's the most popular
       | and the most evil one. You have a choice.
        
         | RL_Quine wrote:
         | Why do you consider them to be "the most evil"? Their services
         | seem to be completely fine in almost every regard, and their
         | communication doesn't at all suggest that they might be evil.
        
           | megous wrote:
           | Not for the website visitors.
        
           | viraptor wrote:
           | They actively (including legally) protect groups which
           | coordinate targeted abuse and swatting (basically murder
           | attempts)
           | https://twitter.com/stealthygeek/status/1485731083534667779
        
         | matkoniecz wrote:
         | What alternatives do you recommend?
        
           | viraptor wrote:
           | It depends on your audience and the regions you're most
           | interested in. But if you're aiming for the EU, Gcore Labs
           | may be interesting. Akamai is not bad, but a bit
           | enterprisey - I don't think they even had an official API
           | the last time I used them?
        
             | randunel wrote:
             | None of those are free, though.
        
               | viraptor wrote:
               | No, but they also don't actively help protect pages
               | organising SWATing. It's your choice who to do business
               | with.
        
               | Aachen wrote:
               | If you're not paying, what's the product they're selling?
        
               | RL_Quine wrote:
               | Their paid one when you go over the limits. Not
               | everything has to be black and white and reduced down to
                | a single, oft-repeated catchphrase like that.
        
       | nixcraft wrote:
       | I run a popular blog and confirm that spam is a massive issue. I
       | am trying to keep the independent web alive with an old-school
       | commenting system because it helps readers and myself improve
       | outdated posts. My domain is over 20 years old and attracts
       | all sorts of threats, including monthly DDoS and daily spam.
       | Using Cloudflare solved all of these problems. Next, you need
       | to add firewall rules inside Cloudflare WAF to trigger a
       | captcha for /path/to/blog/wp-comments-post.php. That will not
       | get rid of human spam though. For that, you need another
       | filtering service called Akismet.
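       | 
       | The WAF rule itself is short - something along these lines,
       | paired with a challenge action in the dashboard (adjust the
       | path to your install):
       | 
       |     http.request.method eq "POST" and
       |     http.request.uri.path eq "/wp-comments-post.php"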
        
         | nicbou wrote:
         | Same job, same problem. I simply don't allow comments anymore.
         | 
         | This is unfortunate, because they're amazing feedback if you
         | write about bureaucracy. People won't take the time to write to
         | you about their experience, but they'll leave a comment.
        
           | coldpie wrote:
           | I came to the same solution, just disabling comments. There's
           | a prompt to email me in the footer, but no one ever has.
           | Shame, but that's the world we live in.
        
         | baisq wrote:
         | People here like to say that BigCo has ruined the independent
         | web and that everything is now siloed and blablabla but the
         | truth is that running an independent website fucking sucks in
         | many regards
        
           | jimmywetnips wrote:
           | Exactly. Sorry the current solutions don't live up to the
           | ideals, but if there's nothing better, then we're already
           | doing the ideal thing.
        
         | fariszr wrote:
         | Hmmm, maybe offload your comments to something else?
         | 
         | I'm thinking of using GitHub issues/discussions as a comment
         | system. The website and everything will function normally
         | without Cloudflare, but the comments are based on GitHub,
         | which will deal with spam and hosting for me.
         | 
         | And I personally think using it is better from a
         | decentralization point of view, as you don't add to the
         | absolutely crazy 20% of the web that CF already controls.
         | 
         | But that obviously doesn't solve the DDoS problem, which
         | should be solved with big cloud providers that include DDoS
         | protection.
         | 
         | Another workaround is using IPFS, but a normal user will
         | need a gateway, and guess who operates one of the biggest
         | IPFS gateways? Yes, CF. And that's without considering the
         | tradeoffs of using IPFS.
         | 
         | I think a static site + an external comment provider like
         | GitHub might help with the attacks and spam without using
         | Cloudflare, but I don't have any website close to your
         | blog's size, so it's all just a prediction.
        
         | seydor wrote:
         | Unfortunately, well-known platforms with known URIs are
         | targeted way more than any custom website. I think if
         | WordPress just allowed rewriting all the URLs, it would
         | reduce spam attacks by a lot.
        
         | toastal wrote:
         | Putting everything 'behind Cloudflare' isn't a panacea. By
         | merely living outside the West, I'm getting geo-blocked from
         | 'normal' news sites and constantly having to solve hCAPTCHA
         | riddles for some AI algo without compensation. It's such a
         | burden that I find myself giving up pretty often. GeoIP
         | blocking is what prevented me from getting my voter
         | information out of my last domicile. Running everything
         | through Cloudflare or similar also contributes to
         | centralizing the internet around a few choke points that can
         | hurt free speech (both the good and bad kind), and when they
         | go down (which happened recently) a large swath of the
         | internet goes with them.
        
           | jks wrote:
           | Does Cloudflare's "Privacy Pass" browser plugin help at all?
           | It's advertised as reducing the number of hCaptchas you need
           | to solve by a factor of 30, but I rarely see hCaptchas
           | anywhere on my connection so I can't really evaluate it myself.
        
           | nixcraft wrote:
           | I agree with you. But what solution do you propose for
           | independent solo developers, or people who wish to run a
           | blog instead of using FB, Twitter and co to create
           | content? Cloudflare may not be perfect, but it kept me
           | from having to shut down my solo operation without putting
           | a massive cost burden on me. When the first DDoS hit, I
           | had to beg one of those large cloud companies to reduce
           | bandwidth costs. It took them forever to forgive the cost
           | of that abuse, which was not my fault, and I was given a
           | strong warning not to let such an issue happen again.
           | There is no easy solution to this problem. At least with
           | Cloudflare, people like me can stay online, though it does
           | cause problems for visitors with a bad IP reputation.
           | 
           | TL;DR: I won't expose any of my projects or API directly
           | these days due to spam, ddos and other abuse.
        
       | lizardactivist wrote:
       | End game: everything runs on US-owned services, and all users
       | need to identify to be allowed to even raise a finger, so that
       | "bad actors" can be kept out.
       | 
       | All while we blame Russia and China, and say that their spambots
       | and evil actions forced us to do this.
        
         | 16amxn16 wrote:
         | This actually makes sense. Or, at the very least, it wouldn't
         | surprise me.
         | 
         | Another end game: some sort of ID is required to use anything
         | (that ID being a local phone number, which can be tracked down
         | to you).
        
         | djohnston wrote:
         | Are you suggesting that the U.S. produces a proportionally
         | similar volume of spam traffic as Russia, China, India,
         | Vietnam?
        
       | minimalist wrote:
       | It is interesting to watch comments about this dance around the
       | topic of barriers to entry. It wasn't exactly easy for the
       | uninitiated to access various internet fora in the early days and
       | with popularity come the bots, born of the desire to profit
       | with little work at the expense of the community garden. The
       | recent story about VRchat embracing anticheat DRM is another
       | example of this, as its ascending popularity led to more scammers
       | [0].
       | 
       | Does this extend to societies as well? One can think of a
       | membrane that has selective permeability to ideas but resists
       | antisocial actors and concepts. Alexander Bard has talked a lot
       | about social membranics (it's a bit hard to search for).
       | 
       | As odious as the web3 charlatanry is, I'm starting to yearn
       | for anything that raises the transaction costs for the dumbest
       | bots.
       | I remember reading something about new ideas with distributed
       | moderation at some point--maybe someone can refresh my memory.
       | 
       | [0]: https://news.ycombinator.com/item?id=32232974
        
       | JohnJamesRambo wrote:
       | What is the reason behind bots spamming marginalia? What's the
       | motivation? What do they gain? I always wonder about these
       | things.
        
         | onefuncman wrote:
         | I want to run a honeypot for doing more research on bots and
         | the economics for them, but I get bogged down quickly in the
         | planning stages. I should just start with a vulnerable
         | wordpress site or something.
        
           | SyneRyder wrote:
           | Just make a site with a Contact page, with a comment form
           | that logs the details of every request (IP address,
           | timestamp, message content, email provided). You'll get
           | plenty of data for research, once the page has been indexed
           | into the database the comment form spammers use. For bonus
           | points, put the contact form at the bottom of every page of
           | your website.
           | 
           | A couple of my toy/project websites accidentally became
           | honeypots. Rather than shut down the comment forms, I now
           | have those sites generate summary logfiles that I can upload
           | daily to AbuseIPDB.
           | 
           | EDIT: Forgot to mention, also log the Referer field and User-
           | Agent on each request. Very, very useful information for
           | research and detection.
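           | 
           | A minimal sketch of such a logging honeypot, assuming
           | Flask (the log format and field names are made up):
           | 
           |     import datetime
           |     from flask import Flask, request
           | 
           |     app = Flask(__name__)
           | 
           |     @app.route("/contact", methods=["POST"])
           |     def contact():
           |         # Record everything the spambot hands us
           |         row = "\t".join([
           |             datetime.datetime.utcnow().isoformat(),
           |             request.remote_addr or "-",
           |             request.headers.get("Referer", "-"),
           |             request.headers.get("User-Agent", "-"),
           |             request.form.get("email", "-"),
           |             request.form.get("message", "-")[:200],
           |         ])
           |         with open("honeypot.log", "a") as f:
           |             f.write(row + "\n")
           |         return "Thanks for your message!"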
        
           | prox wrote:
           | WordPress is perfect for this. The number of bots trying
           | to get in is insane - up to 80 login tries on some days
           | for a small-potatoes website.
           | 
           | There are also some vulnerable plugins still out there if you
           | actually want them to hack it.
        
         | marginalia_nu wrote:
         | Simple answer is I don't know, but it appears to be happening
         | to other search engines as well.
         | 
         | My best guess is they're assuming it's backed by google, and
         | are attempting to poison its search term suggestions. The
         | queries I've been getting are fairly long and highly specific,
         | often within e-pharma or online casino or similarly sketchy
         | areas.
        
         | Avamander wrote:
         | There are multiple reasons - negative SEO, positive SEO,
         | malware distribution, paid clicks, advertising and probably
         | others I've forgotten at the moment.
        
       | x-complexity wrote:
       | > The other alternatives all suck to the extent of my knowledge,
       | they're either prohibitively convoluted, or web3 cryptocurrency
       | micro-transaction nonsense that while sure it would work, also
       | monetizes every single interaction in a way that is more
       | dystopian than the actual skull-crushing robot apocalypse.
       | 
       | In the interest of practicality: There's a way to go the web3
       | route without being laden with transactions:
       | 
       | - Mint a fixed-cost non-transferrable NFT to an address, with
       |   ownership limit of 1 per address.
       | 
       | - Use SIWE (sign-in with Ethereum) to verify ownership of
       |   address & therefore NFT.
       | 
       | - If malicious behaviour is detected, mark the NFT as
       |   belonging to a malicious actor at the server's end & block
       |   the account.
       | 
       | - Require non-malicious-marked NFTs in order to use the
       |   site/app.
       | 
       | At most, the user only had to perform 1 transaction (minting the
       | non-transferrable NFT) on any blockchain network where the
       | contract resides, & the costs to do so can be made cheaply with
       | Layer 2 networks. (Polygon PoS, Arbitrum, Optimism, zkSync 2.0,
       | etc)
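       | 
       | As a rough sketch of the server-side check (web3.py and eth-
       | account assumed; the RPC endpoint, contract address and
       | denylist are placeholders, not a real deployment):
       | 
       |     from eth_account import Account
       |     from eth_account.messages import encode_defunct
       |     from web3 import Web3
       | 
       |     # Minimal ERC-721 ABI: we only need balanceOf()
       |     ERC721_ABI = [{
       |         "name": "balanceOf", "type": "function",
       |         "stateMutability": "view",
       |         "inputs": [{"name": "owner", "type": "address"}],
       |         "outputs": [{"name": "", "type": "uint256"}],
       |     }]
       | 
       |     w3 = Web3(Web3.HTTPProvider("https://polygon-rpc.com"))
       |     nft = w3.eth.contract(address="0x...",  # placeholder
       |                           abi=ERC721_ABI)
       |     banned = set()  # addresses marked malicious server-side
       | 
       |     def verify(message: str, signature: str) -> bool:
       |         # Recover the address that signed the login message
       |         addr = Account.recover_message(
       |             encode_defunct(text=message), signature=signature)
       |         if addr in banned:
       |             return False
       |         # Require a minted (and not banned) pass NFT
       |         return nft.functions.balanceOf(addr).call() > 0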
       | 
       | Can this be done entirely without web3? Yes, but the added
       | friction imposed onto malicious actors to generate new addresses
       | & mint new non-transferrable NFTs increases the costs for them
       | considerably.
       | 
       | > If anyone could go ahead and find a solution to this mess, that
       | would be great, because it's absolutely suffocating the internet,
       | and it's painful to think about all the wonderful little projects
       | that get cancelled or abandoned when faced with the reality of
       | having to deal with such an egregiously hostile digital
       | ecosystem.
       | 
       | In all honesty, there's no perfect solution, just hard-to-make
       | tradeoffs: The prevention of botspam inherently requires tracking
       | in some form to resolve said issue, as there's no immediately-
       | recognizable stateless solution for botspam tracking. Someone has
       | to do the tracking to prevent botspam, which inherently
       | involves state being changed in order to mark an actor as
       | malicious.
        
         | me_again wrote:
         | That seems "prohibitively convoluted" to me, if nothing else.
        
           | birracerveza wrote:
           | Because you don't have experience with it. There's nothing
           | complicated about SIWE, minting an NFT and checking its
           | validity - certainly nothing that justifies describing it
           | as "prohibitively convoluted", aside from being scared of
           | web3 keywords. Come on now.
           | 
           | Not commenting on op's solution's validity or effectiveness,
           | just replying to your comment.
        
           | G3rn0ti wrote:
           | This whole minting and transaction issuing is done with
           | one (or two) mouse clicks if you use a browser plug-in
           | like the MetaMask wallet.
        
         | nneonneo wrote:
         | Do you mint one new NFT per site? Then you impose an excessive
         | burden on users per new site they visit.
         | 
         | Do you mint one NFT per address? If blocking only applies to
         | the one site, a malicious operator can just spam the next site
         | using that address - after all, they own tons of addresses and
         | have many sites to spam, and they can surely spam for at least
         | a bit before getting caught (per site and per address).
         | 
         | If blocking is a public operation that gets you banned
         | everywhere, well, now one callous server owner can disable
         | your address's ability to access anything.
         | 
         | I fail to see how Web3 NFTs solve any of these problems...
        
           | x-complexity wrote:
           | In all honesty, there's limited tooling to resolve the
           | botspam issue: There's nothing special about a human that a
           | curated algorithm/botfarm can't sufficiently mimic. No
           | technology is a panacea for this problem.
           | 
           | The only way out is to increase the costs for bots to such an
           | extent that it becomes costly for them to operate. The
           | methodology for performing this is still up for debate, but
           | tracking of some form will have to exist (be it publicly-
           | collated or privately-monitored or some blend of both).
        
         | riscy wrote:
         | > Mint a fixed-cost non-transferrable NFT to an address, with
         | ownership limit of 1 per address. - Use SIWE (sign-in with
         | Ethereum) to verify ownership of address & therefore NFT. - If
         | malicious behaviour is detected, mark the NFT as belonging to a
         | malicious actor at the server's end & block the account. -
         | Require non-malicious-marked NFTs in order to use the site/app.
         | 
         | In the real world that's called a paywall. Just requires users
         | have an email address (wallet address) and proof of payment
         | (NFT). Web3 contributes nothing but a different lingo for the
         | existing concepts, yet has all of the same problems: nobody
         | likes paywalls.
        
       | Aachen wrote:
       | > There has been upwards of 15 queries per second from bots.
       | There is just no way to deal with that sort of traffic, barely
       | even to reject it.
       | 
       | If the queries are not a megabit each, you're doing _way_ too
       | much processing before applying rate limiting. Rejecting traffic
       | ought not to take more than 1-2 milliseconds, even if you need to
       | look up an API key or IP address in the database.
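       | 
       | For a sense of how cheap rejection can be: a sketch of a
       | plain in-memory token bucket per client IP (the numbers are
       | invented) that runs long before any heavy processing:
       | 
       |     import time
       | 
       |     RATE, BURST = 1.0, 5.0  # 1 query/sec, bursts of 5
       |     buckets = {}  # ip -> (tokens, last_seen)
       | 
       |     def allow(ip: str) -> bool:
       |         now = time.monotonic()
       |         tokens, last = buckets.get(ip, (BURST, now))
       |         tokens = min(BURST, tokens + (now - last) * RATE)
       |         if tokens < 1.0:
       |             buckets[ip] = (tokens, now)
       |             return False  # reject, microseconds spent
       |         buckets[ip] = (tokens - 1.0, now)
       |         return True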
       | 
       | I, too, host services on a residential connection: 50 mbps shared
       | with other users. My domains must host hundreds of separate
       | scripts, a few of which have a database attached (I can think of
       | six off the top of my head, but there's over a hundred databases
       | in mariadb so I'm sure there's more that I've forgotten about).
       | This is a ten-year-old laptop with a regular "apt install
       | mariadb", no special configs.
       | 
       | Yes, most traffic is bots, and yes sometimes they submit more
       | than 1 q/s. But it comes nowhere near to exhausting resources to
       | a noticeable extent. Load average is about 0.15, main peaks come
       | from my own cronjobs. If you're having this much trouble
       | rejecting traffic, you might want to spend some time looking at
       | bottlenecks. You'll also notice the bots knock it off if it's
       | unsuccessful.
        
         | marginalia_nu wrote:
         | That's queries per second (as searches), not requests per
         | second.
         | 
         | You know how blogs sometimes can't handle being on the hacker
         | news front page from all the traffic? Well my search engine has
         | survived that. That was 1-2 QPS.
         | 
         | Unmitigated bot-traffic is roughly 10x the traffic of a hacker
         | news death hug, as a sustained load.
        
       | danrl wrote:
       | Off topic: Great to see more Gemini/Web dual hosted sites.
        
       | hamilyon2 wrote:
       | I experienced this firsthand with government immigration
       | websites. The thing is, there are only so many time slots,
       | and people are forced to use a certain website to apply, so
       | everyone is hunting for available times and generally none
       | are available.
       | 
       | So, some creative people set up bots which check periodically
       | for them. There are paid services which will do that for you.
       | Now we have bots hammering the gatekeeper's website. Perhaps
       | hundreds of bots.
       | 
       | Which results in the website being unavailable, serving
       | serious QPS to bots. I think it is only a matter of time
       | before someone writes bots that apply alongside the
       | application bots, hoping that more entries with their
       | information will provide them with a better probability of
       | success.
       | 
       | This is so dystopian and cruel to the average person, and I
       | don't think there is a good solution besides deep anti-bot
       | expertise within the primary website's development team.
        
         | BoxOfRain wrote:
         | This is pretty much the only way you can book a driving test in
         | the UK at the moment unless you want to take your test in a
         | random place in the Highlands.
        
         | TomK32 wrote:
         | I faced a website like this recently when booking a slot at
         | my own German embassy, and went a different route around
         | the embassy instead. What I don't like about the slot
         | system: you won't get a convenient time slot anyway, so why
         | do they bother setting it up like this in the first place?
         | Why not just register with your contact details and receive
         | an email with a guaranteed spot on a selection of three days
         | instead? No more need to reload, and no need for bots. The
         | Upper Austrian government did this for covid vaccinations in
         | the early phase and it worked very well that way (an early,
         | high vaccination rate amongst the elderly and certain
         | professions).
        
           | luckylion wrote:
           | That reminds me of the chaos that ensued in my state in the
           | early days of Covid-19 vaccination when they were still
           | having centralized systems where the elderly could book an
           | appointment. Of course, they had way more demand than supply
           | but still insisted on First Come, First Served, so you ended
           | up with every member of the extended family being asked to
           | try and book a slot, quickly overwhelming their booking
           | systems.
           | 
           | At least you didn't need to worry about it all day: there
           | wasn't a chance in hell to get something 3 minutes after the
           | booking system opened each day.
           | 
           | Friends described how stressful it was for them, their
           | parents being completely helpless and essentially fearing
           | that their health depended on getting one of those elusive
           | appointments, and being devastated each day they didn't
           | succeed.
        
             | rightbyte wrote:
              | In some miracle of competence, my district allotted
              | shots by decreasing age limit, so that there was no
              | "shortage" or lagfest for those eligible in the booking
              | system.
        
               | luckylion wrote:
               | We have all the data about everyone in the registers, but
               | god forbid we use them to lower friction in case of
               | emergencies! Good to hear that some got it right.
        
         | probably_wrong wrote:
         | > _I don 't think there is a good solution besides a deep anti-
         | bot expertise whithin the primary website development team_
         | 
         | But there is a solution: the website team should get their act
         | together and remove the "first come first served" aspect
         | altogether.
         | 
         | Do you, citizen, want to register? Cool - leave your e-mail and
         | we'll call you. Is the service optional? Then we'll pick at
         | random from the pool of applicants and e-mail them. Is the
         | service mandatory? Then sign up and we'll call you once you
         | reach the top of the queue. Add a quick ID/credit card/whatever
         | check on top (like good concert venues do), regular e-mail
         | updates to let people know they haven't been forgotten, and
         | you're done.
         | 
         | Any second year CS student could write such a system. The
         | difficult part is accepting that the current approach doesn't
         | work and looking for alternatives.
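         | 
         | The core really is tiny - a sketch of the mandatory-service
         | queue, deduplicating on an ID (the names are hypothetical):
         | 
         |     from collections import OrderedDict
         | 
         |     queue = OrderedDict()  # national_id -> email, FIFO
         | 
         |     def sign_up(national_id: str, email: str) -> int:
         |         # One spot per person, however many emails they own
         |         if national_id not in queue:
         |             queue[national_id] = email
         |         return list(queue).index(national_id) + 1
         | 
         |     def next_applicant() -> str:
         |         # Call whenever a new appointment slot opens up
         |         _id, email = queue.popitem(last=False)
         |         return email  # notify them of their slot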
        
           | jaclaz wrote:
           | >Do you, citizen, want to register? Cool - leave your e-mail
           | and we'll call you. Is the service optional? Then we'll pick
           | at random from the pool of applicants and e-mail them. Is the
           | service mandatory? Then sign up and we'll call you once you
           | reach the top of the queue.
           | 
           | Wait a minute, wouldn't that be more work for "us"?
           | 
           | Let me think ....
        
           | jaredsohn wrote:
           | >Then we'll pick at random from the pool of applicants and
           | e-mail them
           | 
           | This is where having your own email domain with unlimited
           | accounts is useful.
        
             | probably_wrong wrote:
             | The thought did cross my mind, which is why I'd also ask
             | for an ID or equivalent. If you don't show up with that
             | specific ID on the day of your appointment, you lose the
             | appointment. And you can use that ID number to deduplicate
             | requests.
             | 
             | If you don't do that, I 100% agree with you - scalpers
             | could then register with hundreds of accounts for
             | reselling, and we would be back where we started.
        
           | MauranKilom wrote:
           | "Thank you for waiting three weeks for your appointment
           | selection. We are happy to offer you a time slot next Friday,
           | from 1 pm to 1:15 pm. Click here to accept: [button]. If this
           | does not suit you, click here to get sent back to the queue:
           | [button]."
           | 
           | Half the point of these services tends to be giving users
           | some choice in when they have to show up somewhere. Because
           | not everyone can make time in the middle of business hours of
           | an arbitrary day. Not to mention that you might simply be out
           | of town.
        
             | mobiclick wrote:
              | Applicants specify their preferred time slots in
              | descending order and the government agency chooses the
              | time. This is how my country does it.
        
         | nicbou wrote:
         | Berlin?
        
         | TheCapeGreek wrote:
         | It gets more fun when there are arbitrary restrictions on
         | process. Fun anecdote: South Africa's Home Affairs website,
         | where you make bookings for passports, only lets you book
         | appointment dates 2 weeks ahead normally. It's effectively
         | permanently booked even without bots this way.
         | 
         | Luckily, if you're technically inclined, editing the value of
         | the input element via dev tools is accepted by the form.
        
           | lordgrenville wrote:
           | It's not filtered out on the back end?
        
           | rightbyte wrote:
           | Heh, I've done the same thing to calculate my tax.
           | 
           | The tax agency had grayed out the "calc tax" button until the
           | declaration period started, but you could just enable it with
           | the enable flag.
           | 
           | It did nothing but read-query their servers though.
        
       ___________________________________________________________________
       (page generated 2022-08-04 23:02 UTC)