[HN Gopher] Lightweight open source reCaptcha alternative
       ___________________________________________________________________
        
       Lightweight open source reCaptcha alternative
        
       Author : michalpleban
       Score  : 95 points
       Date   : 2025-05-13 09:19 UTC (2 days ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | rahimnathwani wrote:
       | Is there a live demo anywhere? I didn't see one linked in the
       | README or on the official site's home page.
        
         | unixfox wrote:
         | Yes found one: https://altcha.org/captcha/
        
       | immibis wrote:
       | The purpose of reCaptcha is to enhance your Google user profile
       | and to deny legitimate users. How does this alternative
       | accomplish those things?
       | 
       | This appears to be a proof-of-work, like Anubis. Real captchas
       | collect much more fingerprinting data to ensure that only users
       | with the latest version of Chrome, the latest version of Windows,
       | and an Nvidia graphics card, can use the site.
        
         | literalAardvark wrote:
         | Yeah, this fails at its most important task, making those
         | filthy, dirty, Firefox users click on bridges for 1 hour a day.
         | 
         | On topic though, how does this improve on hCaptcha?
        
           | akimbostrawman wrote:
           | Don't forget those criminals with non-residential IPs
        
           | Semaphor wrote:
           | > On topic though, how does this improve on hCaptcha?
           | 
           | Cloud vs self-hosted, click annoying things challenge vs
           | automatic proof of work. Or are there other hCaptcha versions
           | and I just never realized it?
        
           | Imustaskforhelp wrote:
           | oh yes. as a firefox (now librewolf) user, it deeply saddens
           | me.
        
           | g-b-r wrote:
           | I know some people who quickly give up and renounce using the
           | service, when they run into hCaptcha puzzles.
           | 
           | I've been bewildered for some time as well, honestly, it took
           | me a while to figure out the first I ran into.
           | 
           | And trying one now, fully knowing that I'd have to solve one,
           | I was dumbfounded by the puzzle I've gotten, it took me a few
           | seconds to understand it.
           | 
           | Cloudflare's ones are horrible and a plague (although they
           | might have slightly improved recently), but I'm not certain
           | I'd prefer hCaptchas over them.
        
         | Semaphor wrote:
         | I know this is partially a joke, but I'd like to mention that
         | as a Firefox user with uBo and uMatrix, I almost never have to
         | solve challenges with ReCaptcha.
        
           | hexagonwin wrote:
           | how? do you just allow all cookies/scripts/xhr on your
           | umatrix? i'm on a similar config and I get captchas far more
           | often than other users on the same network for some reason.
        
             | Semaphor wrote:
             | I use the uMatrix plugin to automatically allow what's
             | required for ReCaptcha, that tends to work. I do get the
             | annoying picker sometimes and have an AI extension to solve
             | it for me, but it's relatively rare, like 1 out of 5 times
             | max (I generally don't see RC that often).
             | 
             | No idea how I'd compare to others on my Network, that'd
             | only be my wife and as a Linux user she'd probably get more
             | than me with Windows ;)
        
               | pabs3 wrote:
               | I thought uMatrix got abandoned?
        
           | piva00 wrote:
           | On the other hand as a user of Firefox I simply cannot pass
           | Cloudflare's verification at all, I always end up in a loop.
           | It's been like that for more than a year... Sometimes it does
           | work on a private window, no idea why or how since I have the
           | same extensions enabled.
        
             | xena wrote:
             | Do you store cookies?
        
       | serhack_ wrote:
       | The real secret of an effective captcha-like system is to
       | identify/collect lots of data, identify suspicious patterns,
       | validate them (checking what kind of data exposes a bot-like
       | system) and then use this to serve dynamic challenges based on
       | those signals.
       | 
       | Example: if the system identifies the user as a bot, it serves
       | a more expensive PoW challenge.
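The dynamic-challenge idea in the comment above could be sketched roughly as follows. This is a minimal illustration, not how any real captcha service scores clients: the function name `pow_max_number`, the suspicion score in [0, 1], and the `base` and `ceiling` values are all assumptions made for the example.

```python
def pow_max_number(suspicion: float,
                   base: int = 50_000,
                   ceiling: int = 5_000_000) -> int:
    """Map a suspicion score in [0, 1] to the size of the PoW search space.

    All names and numbers here are hypothetical. A larger search space
    means more hashes for the client to try before it finds the answer.
    """
    # Clamp the score so out-of-range inputs cannot shrink or explode the cost.
    score = min(max(suspicion, 0.0), 1.0)
    # Interpolate exponentially between base and ceiling, so each step in
    # suspicion multiplies the expected work by the same factor.
    return round(base * (ceiling / base) ** score)


if __name__ == "__main__":
    for s in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(s, pow_max_number(s))
```

A client that looks ordinary gets the cheap end of the range, while one that trips several signals pays up to 100 times more under these assumed defaults.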
        
         | Imustaskforhelp wrote:
         | Maybe somebody could explain to me why your comment is in a
         | different contrast of grey?
         | 
         | I think somebody might have flagged your comment, but it is a
         | real fact.
         | 
         | This is one of the reasons why people say cloudflare owns the
         | majority of internet but I think I am okay with that since
         | cloudflare is pretty chill. And they provide the best services
         | but still it just shows that the internet isn't that
         | decentralized.
         | 
         | But google captcha is literally tracking you IIRC, I would
         | personally prefer hcaptcha if you want centralized solution or
         | anubis if you want to self host (I Prefer anubis I guess)
        
           | ArinaS wrote:
           | Cloudflare is not chill because they, either ignorantly or
           | purposefully, block everything that's not Chromium or
           | Firefox[1].
           | 
           | Or sometimes everything that's not just Chromium[2].
           | 
           | [1] - https://www.theregister.com/2025/03/04/cloudflare_block
           | ing_n...
           | 
           | [2] - https://www.techradar.com/pro/cloudflare-admits-
           | security-too...
        
           | Zak wrote:
           | > _Maybe somebody could explain to me why your comment is in a
           | different contrast of grey?_
           | 
           | Downvotes. Comments with negative scores are shown with lower
           | contrast. The more negative the score, the less contrast they
           | get.
        
       | Jleagle wrote:
       | Can someone explain why a robot would not be able to calculate
       | the PoW?
        
         | jsheard wrote:
         | It could, the idea is just to tip the economics such that it's
         | not worth it for the bot operator. That kind of abuse typically
         | happens at a vast scale where the cost of solving the
         | challenges adds up fast.
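The asymmetry behind that economic argument can be sketched with a small hash-based proof of work. This is an assumed scheme for illustration (the server hides a number behind SHA-256 and the client brute-forces it), not code taken from ALTCHA or Anubis; the function names and the `max_number` parameter are hypothetical.

```python
import hashlib
import secrets
from typing import Optional, Tuple


def make_challenge(max_number: int) -> Tuple[str, str]:
    """Server side: hide a random number behind a salted SHA-256 hash."""
    salt = secrets.token_hex(8)
    secret_number = secrets.randbelow(max_number + 1)
    target = hashlib.sha256(f"{salt}{secret_number}".encode()).hexdigest()
    return salt, target


def solve(salt: str, target: str, max_number: int) -> Optional[int]:
    """Client side: try every candidate; cost grows linearly with max_number."""
    for candidate in range(max_number + 1):
        digest = hashlib.sha256(f"{salt}{candidate}".encode()).hexdigest()
        if digest == target:
            return candidate
    return None


def verify(salt: str, target: str, number: int) -> bool:
    """Server side: checking an answer costs a single hash."""
    return hashlib.sha256(f"{salt}{number}".encode()).hexdigest() == target


if __name__ == "__main__":
    salt, target = make_challenge(max_number=100_000)
    answer = solve(salt, target, max_number=100_000)
    print(answer is not None and verify(salt, target, answer))
```

The server pays one hash per verification while the client pays tens of thousands on average, which is negligible for one visitor but adds up for an operator making millions of requests.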
        
           | hombre_fatal wrote:
           | Botnets don't even use their own hardware.
           | 
           | Why would someone renting dirt cheap botnet time care if the
           | requests take a few seconds longer to your site?
           | 
           | Plus, the requests are still getting through after waiting a
           | few seconds, so it does nothing for the website operator and
           | just burns battery for legit users.
        
             | jsheard wrote:
             | Botnets just shift the bottleneck from "how much compute
             | can they afford to buy legit" to "how many machines can
             | they compromise or afford to buy on the black market".
             | Either way it's a finite resource, so making each abusive
             | request >10,000x more expensive still severely limits how
             | much damage they can do, especially when a lot of botnet
             | nodes are IoT junk with barely any CPU power to speak of.
        
             | victorbjorklund wrote:
             | There is still an opportunity cost. They can scrape just
             | your site or they can scrape 100 other sites without POW
             | (no idea if it is 10, 100 etc)
        
               | Jleagle wrote:
               | So it's the same as a sleep()
        
               | ahofmann wrote:
               | No, because the bot can just also sleep and scrape other
               | sources in that time. With pow, you waste their CPU
               | cycles and block them from doing other work.
        
               | hombre_fatal wrote:
               | Websites aren't really fungible like that, and where they
               | are (like general search indexing for example), that's
               | usually the least hostile sort of automated traffic. But
               | if that's all you care about, I'll cede the point.
               | 
               | Usually if you're going to go through the trouble of
               | integrating a captcha, you want to protect against
               | targeted attacks like a forum spammer where you don't
               | want to let the abusive requests through at all, not just
               | let it through after 5000ms.
        
             | bityard wrote:
             | If you operate a botnet that normally scrapes a few dozen
             | pages per second and then notice a site suddenly taking
             | multiple seconds per page, that's at least an order of
             | magnitude (or two) decrease in performance. If you care at
             | all about your efficiency, you step in and put that site on
             | your blacklist.
             | 
             | Even if the bot owner doesn't watch (or care about) their
             | crawling metrics, at least the botnet is not DDoSing the
             | site in the meantime.
             | 
             | This is essentially a client-side tarpit, and tarpits are
             | actually pretty effective against all forms of bot traffic
             | while not impacting legitimate users very much, if at all.
        
               | remram wrote:
               | A tarpit is selective. You throw bad clients in the
               | tarpit.
               | 
               | This is something you throw everyone through: both your
               | abusive clients (running on stolen or datacenter
               | hardware) and your real clients (running on battery-
               | powered laptops and phones). More like a tar-checkpoint.
        
           | jrochkind1 wrote:
           | That's definitely the idea.
           | 
           | So the crazy decentralized mystery botnet(s) that are
           | affecting many of us -- don't seem to be that worried about
           | cost. They are making millions of duplicate requests for
           | duplicate useless content, it's pretty wild.
           | 
           | On the other hand, they ALSO don't seem to be running
           | user-agents that execute javascript.
           | 
           | This is in the findings of a group of some of my colleagues
           | at peer non-profits that have been sharing notes to try to
           | understand what's going on.
           | 
           | So the fact that they don't run JS at present means that PoW
           | would stop them -- but so would something much simpler and
           | cheaper relying on JS.
           | 
           | If this becomes popular, could they afford to run JS and to
           | calculate the PoW?
           | 
           | It's really unclear. The behavior of these things does not
           | make sense to me enough to have much of a theory about what
           | their cost/benefits or budgets are, it's all a mystery to me.
           | 
           | Definitely hoping someone manages to figure out who's really
           | behind this and why at some point. (i am definitely not
           | assuming it's a single entity either).
        
         | dpassens wrote:
         | I think the general idea isn't that they can't but that they
         | either won't, because they're not executing JS, or that it
         | would slow them down enough to effectively cripple them.
        
           | jrochkind1 wrote:
           | As long as they're not executing JS, they don't really need a
           | PoW to stop them, though. Something much simpler that
           | requires executing JS would do.
           | 
           | i might at any rate set my PoW to be relatively cheap, which
           | would do for anyone not executing JS.
        
         | diggan wrote:
         | I think calling this a "recaptcha alternative" is slightly
         | misleading.
         | 
         | There are two problems some website hosters encounter:
         | 
         | A) How do I ensure no one DDoSes me (deliberately or
         | inadvertently)?
         | 
         | B) How can I ensure this client is actually a human, not a
         | robot?
         | 
         | Things like ReCaptcha aimed to solve B, not A. But the
         | submitted solution seems to be more for A, as a PoW can be
         | (probably _must_ be, actually) calculated by a machine, not a
         | human. ReCaptcha is supposed to be the opposite: solvable
         | only by a human.
        
       | progx wrote:
       | In the AI century, how would you tell a real person from an AI?
        
         | ArinaS wrote:
         | This thing, despite using "captcha" in its name, is not your
         | typical captcha like hCaptcha or Google's one, because it uses
         | a proof-of-work mechanism instead of writing answers in
         | textboxes/clicking on images/other means of verification
         | requiring user input.
         | 
         | AI bots can't solve proof-of-work challenges because browsers
         | they use for scraping don't support features needed to solve
         | them. This is highlighted by existence of other proof-of-work
         | solutions designed to specifically filter out AI bots, like go-
         | away[1] or Anubis[2].
         | 
         | And yes, they work - once GNOME deployed one of these proof-of-
         | work challenges on their gitlab instance, traffic on it fell by
         | 97%[3].
         | 
         | [1] - https://git.gammaspectra.live/git/go-away
         | 
         | [2] - https://github.com/TecharoHQ/anubis
         | 
         | [3] - https://thelibre.news/foss-infrastructure-is-under-
         | attack-by...: " _According to Bart Piotrowski, in around two
         | hours and a half they received 81k total requests, and out of
         | those only 3% passed Anubis's proof of work, hinting at 97% of
         | the traffic being bots._"
        
           | graemep wrote:
           | > AI bots can't solve proof-of-work challenges because
           | browsers they use for scraping don't support features needed
           | to solve them.
           | 
           | At least sometimes. I do not know about AI scraping but there
           | are plenty of scraping solutions that do run JS.
           | 
           | It also puts off some genuine users like me who prefer to keep
           | JS off.
           | 
           | The 97% is only accurate if you assume a zero false positive
           | rate.
        
             | ArinaS wrote:
             | > " _It also puts off some genuine users like me who prefer
             | to keep JS off._ "
             | 
             | Non-javascript challenges are also available[1].
             | 
             | > " _The 97% is only accurate if you assume a zero false
             | positive rate._ "
             | 
             | GNOME's gitlab instance is not something people visit daily
             | like Wikipedia, so it's a negligible amount of false
             | positives.
             | 
             | [1] - https://git.gammaspectra.live/git/go-
             | away/wiki/Challenges#no...
        
               | graemep wrote:
               | > Non-javascript challenges are also available
               | 
               | Did not know that. Good news
               | 
               | > GNOME's gitlab instance is not something people visit
               | daily like Wikipedia, so it's a negligible amount of
               | false positives.
               | 
               | As an absolute number, yes, but as a proportion?
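The absolute-versus-proportion question can be made concrete with the figures quoted earlier in the thread (81k requests, 3% passing). The sketch below is only arithmetic on those numbers; `humans_blocked` is a hypothetical input, and the model assumes every request that passed was human, which the thread itself disputes.

```python
def bot_share(total: int, failed: int, humans_blocked: int) -> float:
    """Fraction of all requests that were bots, given how many of the
    failed requests actually came from humans (false positives)."""
    return (failed - humans_blocked) / total


total = 81_000              # requests in the window, from the quoted article
passed = total * 3 // 100   # 3% passed the proof of work, i.e. 2430
failed = total - passed     # the remaining 78570 did not

# With zero false positives the quoted 97% is reproduced.
print(bot_share(total, failed, humans_blocked=0))

# Hypothetical: as many humans were turned away as got through.
# That is a 50% false positive rate among humans, yet the headline
# figure only drops from 97% to 94%.
print(bot_share(total, failed, humans_blocked=passed))
```

Under these assumptions a small absolute number of blocked humans barely moves the bot percentage while still being a large share of the real visitors, so both commenters can be right at once.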
        
           | diggan wrote:
           | > AI bots can't solve proof-of-work challenges because
           | browsers they use for scraping don't support features needed
           | to solve them. This is highlighted by existence of other
           | proof-of-work solutions designed to specifically filter out
           | AI bots, like go-away[1] or Anubis[2].
           | 
           | Huh, they definitely can?
           | 
           | go-away and Anubis reduce the load on your servers as bot
           | operators cannot just scrape N pages per second without any
           | drawbacks. Instead it gets really expensive to make 1000s of
           | requests, as they're all really slow.
           | 
           | But for a user who uses their own AI agent, that browses the
           | web, things like anubis and go-away aren't meant to (nor do
           | they) stop them from accessing the websites at all, it'll just
           | be a tiny bit slower.
           | 
           | Those tools are meant to stop site-wide scraping, not
           | individual automatic user-agents.
        
         | Jleagle wrote:
         | AI's scrape data from web pages just like anything else does. I
         | don't think their existence makes a difference.
        
           | immibis wrote:
           | AIs don't. AI companies do.
           | 
           | Well, maybe. As far as I can see, the overt ones are using
           | pretty reasonable rate limits, even though they're scraping
           | in useless ways (every combination of git hash and file path
           | on gitea). Rather, it seems like the anonymous ones are the
           | problem - and since they're anonymous, we have zero reason to
           | believe they're AI companies. Some of them are running on
           | Huawei Cloud. I doubt OpenAI is using Huawei Cloud.
        
         | dvh wrote:
         | Certainly! Distinguishing between a real person and an AI in
         | the AI century can be tricky, but some key signs include
         | emotional depth, unpredictable creativity, personal
         | experiences, and complex human intuition. AI, on the other
         | hand, tends to rely on data patterns, structured reasoning, and
         | lacks genuine lived experiences.
        
           | igorbark wrote:
           | i enjoy that i cannot tell whether this is written by an AI,
           | or by a human pretending to be an AI. my guess is human
           | pretender!
        
       | chrismorgan wrote:
       | CAPTCHA stood for "Completely Automated Public Turing Test to
       | tell Computers and Humans Apart".
       | 
       | By this point, it's obvious that that has failed, and even that
       | no general solution is possible any more.
       | 
       | ALTCHA... telling Computers and Humans Apart? No, this is proof
       | of work, meaning it's just about making things expensive--abuse
       | control, not actually distinguishing between computers and
       | humans.
       | 
       | In fact, in https://altcha.org/captcha/ one of the headings is
       | _Inclusive to Robots_! This is _so_ far the opposite of
       | traditional CAPTCHA, on the technical side, that it's mildly
       | hilarious. (Socially, they largely amount to the same thing--
       | people never did actually care about _computers_ , just abusive
       | bots.)
       | 
       | Then the question is: what is the proof of work mechanism? How
       | robust are things going to be, and can you ensure attacking will
       | remain expensive, without burdening users too much?
       | 
       | https://altcha.org/docs/proof-of-work/ indicates it's SHA
       | hashing, not something like scrypt. Uh oh. The best specialised
       | hardware is several million times as good as good laptops[1], let
       | alone cheap phones. If this were to become popular, bots would
       | switch to such hardware, probably making the cost of attacking
       | practically negligible. https://altcha.org/docs/complexity/ shows
       | they've thought about these things, but I feel that although it
       | will work for a while, it's ultimately a doomed game. And in the
       | meantime, you can normally go _waaaay_ simpler and less
       | intrusive: most bots are extremely dumb.
       | 
       | Is "captcha" heading in the direction of meaning "bad rate
       | limiting"?
       | 
       | Because really that's what this stuff is: rate limiting that
       | trusts that clients don't have lots of compute power conveniently
       | available, but will get vaporised by powerful and intentional
       | adversaries.
       | 
       | --***--
       | 
       | [1] On the https://altcha.org/docs/complexity/ test, a
       | comparatively ideal browser on my 5800HS laptop might reach
       | 500,000 SHA-256 hashes per second at a cost of at least 25W.
       | (Chromium gets half this with ~50% CPU usage; Firefox one tenth,
       | altogether failing to load the cores for some reason.) The most
       | energy-efficient commercial Bitcoin miners seem to be doing
       | around 80 _billion_ of these hashes per watt-second. That's _four
       | million_ times as good. You cannot bridge such a divide.
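The footnote's arithmetic can be checked directly. The sketch below uses only the figures given in the comment (500,000 hashes per second at 25 W for the laptop, 80 billion hashes per watt-second for the miner); the `CHALLENGE_HASHES` value is a hypothetical challenge size chosen to equal one second of laptop work.

```python
LAPTOP_HASHES_PER_SECOND = 500_000        # from the footnote
LAPTOP_WATTS = 25                         # from the footnote
ASIC_HASHES_PER_JOULE = 80_000_000_000    # from the footnote; 1 watt-second = 1 joule

# Hashes the laptop gets out of one joule of energy.
laptop_hashes_per_joule = LAPTOP_HASHES_PER_SECOND // LAPTOP_WATTS

# How many times more work the miner does per joule.
efficiency_ratio = ASIC_HASHES_PER_JOULE // laptop_hashes_per_joule


def energy_joules(hashes: int, hashes_per_joule: int) -> float:
    """Energy needed to compute the given number of hashes."""
    return hashes / hashes_per_joule


CHALLENGE_HASHES = 500_000  # hypothetical: one second of laptop work

print(laptop_hashes_per_joule)  # 20000
print(efficiency_ratio)         # 4000000
print(energy_joules(CHALLENGE_HASHES, laptop_hashes_per_joule))
print(energy_joules(CHALLENGE_HASHES, ASIC_HASHES_PER_JOULE))
```

On these numbers a challenge that costs a legitimate laptop user 25 joules costs the specialised hardware a few microjoules, which is the divide the comment says cannot be bridged by tuning difficulty.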
        
       | binary132 wrote:
       | It's crazy how much of the internet and our app stacks depend on
       | proprietary hosted service integrations that will almost
       | certainly disappear or break in time. Sure it's convenient to get
       | off the ground with but it doesn't make sense to me to gate your
       | functionality on a third party that can easily break or slip out
       | from under you. It would be one thing if proprietary software was
       | distributed in a form you could keep operating and using on your
       | own, but even that is obviously inferior to being able to "repair
       | your own equipment".
        
         | IgorPartola wrote:
         | reCaptcha is routinely broken for me. Almost every time I see
         | it I have to solve it about a dozen times, then it decides I'm
         | not human. After 2-3 page refreshes it does let up but it's
         | frustrating as hell.
        
           | wkat4242 wrote:
           | Are you on Linux by any chance? For some reason this is now
           | deemed 'suspicious' by recaptcha and cloudflare :( Especially
           | if you use Firefox. It's driving me crazy getting bombarded
           | by these.
        
             | worldsavior wrote:
             | Did you try faking user agent?
        
               | wkat4242 wrote:
               | Yes but that made it worse.
               | 
               | In fact I used to fake user agent all the time because
               | Microsoft 365 is so retarded. With the Firefox/Linux user
               | agent a lot of features don't work. When it pretends to
               | be MS Edge it works fine. Clearly trying to force people
               | to use the 'invented here' browser :(
               | 
               | But as I was getting captchas I moved to using it only
               | for the MS365 sites and nowhere else. It seems to have
               | reduced the captchas somewhat, especially the ones that
               | never end (keep looping). But I still get a ton of "Your
               | browser is suspicious, here's an extra check" nonsense
               | from Cloudflare in particular.
        
         | godzillabrennus wrote:
         | In the startup world it is a huge economic advantage if you can
         | prototype an idea in days that would have taken months or
         | years. The tradeoffs are acquiring technical debt but we seem
         | capable of resolving that after the concept has found product
         | market fit.
        
           | graemep wrote:
           | Yes, but it's not just startups and people do not seem to
           | actually resolve it.
           | 
           | Lots of big businesses use recaptcha. Quite often
           | unnecessarily. If I need to log in with 2FA to use a service
           | does it really need recaptcha?
           | 
           | Similarly, cloudflare sends you emails telling you how many
           | bots and attacks it has stopped - but you do not know how
           | many false positives there were.
        
             | theappsecguy wrote:
             | Yes you still need recaptcha simply to avoid password
             | stuffing attacks.
        
               | damsalor wrote:
               | Certainly not in the mentioned 2fa scenario.
               | 
               | I would guess that simple rate limiting would do the
               | trick for the rest
        
               | Zak wrote:
               | Rate limiting does not solve this problem because botnets
               | often don't make repeated requests from the same IP
               | address. 2FA does solve it.
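Why per-IP rate limiting falls short against a botnet can be sketched with a toy limiter. This is a deliberately simplified illustration (one fixed window, no expiry, in-memory counters); the class name `PerIpLimiter`, the helper `count_allowed`, and the example addresses are all hypothetical.

```python
from collections import defaultdict
from typing import Iterable


class PerIpLimiter:
    """Toy limiter: allow at most max_per_window requests per IP address."""

    def __init__(self, max_per_window: int) -> None:
        self.max_per_window = max_per_window
        self.counts = defaultdict(int)  # ip -> requests seen in this window

    def allow(self, ip: str) -> bool:
        self.counts[ip] += 1
        return self.counts[ip] <= self.max_per_window


def count_allowed(limiter: PerIpLimiter, ips: Iterable[str]) -> int:
    """Number of requests that get through for a sequence of source IPs."""
    return sum(1 for ip in ips if limiter.allow(ip))


if __name__ == "__main__":
    # 1000 login attempts from one address: only the first 5 get through.
    single_source = ["203.0.113.7"] * 1000
    print(count_allowed(PerIpLimiter(5), single_source))

    # 1000 attempts spread over 1000 addresses: every one gets through.
    botnet = [f"10.0.{i // 256}.{i % 256}" for i in range(1000)]
    print(count_allowed(PerIpLimiter(5), botnet))
```

The limiter never sees any single address misbehave in the second case, which is the situation the comment describes.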
        
           | dsr_ wrote:
           | Citation, as they say, is needed.
           | 
           | As far as I can tell, most startups resolve their technical
           | debt by failing, and the majority of the rest resolve their
           | debt by being acquired by a company which replaces the
           | original service entirely in 1-3 years because it's too hard
           | to integrate as-is.
        
           | binary132 wrote:
           | Yes, and I certainly was not saying startups should roll
           | their own fraud prevention
        
         | palmotea wrote:
         | > It's crazy how much of the internet and our app stacks depend
         | on proprietary hosted service integrations that will almost
         | certainly disappear or break in time. Sure it's convenient to
         | get off the ground with but it doesn't make sense to me to gate
         | your functionality on a third party that can easily break or
         | slip out from under you.
         | 
         | At least with captchas, it's somewhat understandable with the
         | arms-race aspect. The third party does the work of engaging in
         | the arms race, so you don't have to, but the tradeoff is what
         | you describe.
        
         | rendx wrote:
         | Not only that, but it's also totally acceptable now to
         | broadcast your user's data to a megaton of external services
         | for no good reason. If people had some grasp of what is going
         | on and it was visible to them, they would complain very loudly
         | about it in your face.
        
       | pabs3 wrote:
       | Hmm, how do they know you have calculated the PoW without setting
       | a cookie? Or do you have to calculate it on every page load?
        
         | jrochkind1 wrote:
         | yeah, I need more info to understand what's up.
         | 
         | Maybe it's only used on individual form submit (like the
         | classic captcha use-case), and not on a page load, and it does
         | have to be recalculated on every form submit?
        
         | alamsterdam wrote:
         | Yes, I was wondering what is to stop you replaying the same PoW
         | multiple times. All I can find is:
         | 
         |  _To prevent the vulnerability of "replay attacks," where a
         | client resubmits the same solution multiple times, the server
         | should implement measures that invalidate previously solved
         | challenges.
         | 
         | The server should maintain a registry of solved challenges and
         | reject any submissions that attempt to reuse a challenge that
         | has already been successfully solved._
         | 
         | This doesn't seem very scalable? Or am I missing something?
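One way the registry could stay bounded is to give each challenge an expiry and only remember solved challenges until they expire. The sketch below is an assumption about how such a server might work, not ALTCHA's documented implementation: `SECRET`, `sign`, `SolvedRegistry`, and the `salt.expires_at` signing format are hypothetical, and the proof-of-work check itself is omitted.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"server-side-secret"  # hypothetical key, never sent to clients


def sign(salt: str, expires_at: int) -> str:
    """Bind a challenge to its expiry so clients cannot extend its lifetime."""
    message = f"{salt}.{expires_at}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


class SolvedRegistry:
    """Remembers solved challenges only until they expire."""

    def __init__(self) -> None:
        self._seen = {}  # salt -> expiry timestamp

    def accept(self, salt: str, expires_at: int, signature: str,
               now: Optional[int] = None) -> bool:
        now = int(time.time()) if now is None else now
        # Forget entries that can no longer be replayed anyway.
        self._seen = {s: exp for s, exp in self._seen.items() if exp > now}
        if not hmac.compare_digest(signature, sign(salt, expires_at)):
            return False  # forged or tampered challenge
        if expires_at <= now:
            return False  # expired, so no registry lookup is needed
        if salt in self._seen:
            return False  # replay of an already-used solution
        self._seen[salt] = expires_at
        return True


if __name__ == "__main__":
    registry = SolvedRegistry()
    signature = sign("abc123", expires_at=100)
    print(registry.accept("abc123", 100, signature, now=50))   # first use
    print(registry.accept("abc123", 100, signature, now=60))   # replay
    print(registry.accept("abc123", 100, signature, now=150))  # expired
```

Under this design the registry holds at most the challenges solved within one expiry window, so its size tracks request rate times challenge lifetime rather than growing without limit.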
        
       | dankobgd wrote:
       | recaptcha is useless, only annoys actual users. I lost 15 minutes
       | last week on miui site with their trash recaptcha. The point is
       | to steal more data from you
        
       | remram wrote:
       | Why call it a CAPTCHA if it is not even trying to tell Computers
       | and Humans Apart (CHA)?
       | 
       | This is only trying to tell human browsers and bot browsers
       | apart. Not even that, it seems all it does is slow all browsers
       | down equally.
        
         | immibis wrote:
         | Because <s>human</s> western society is in its post-competence
         | era. It doesn't matter whether you can do your job, only
       | whether your manager thinks you can, and they don't understand
         | your job so they use all the wrong metrics.
         | 
         | Like whether there's a checkbox you have to click, and whether
         | it spins for a while when you click it. That's a CAPTCHA now.
         | And working is when your butt is in the chair. And investing is
         | when you give someone money and they promise to give more back
       | later. And food is things that fit in your mouth and don't kill
         | you. And free speech is when you get turned away at the border
         | for disliking the president on social media. And top-of-the-
         | line CPUs are ones that die within 24 months. Meanwhile the
         | totalitarian dictatorship across the pond actually does all
         | these things better somehow (except the politics).
         | https://en.wikipedia.org/wiki/HyperNormalisation#Etymology
        
       ___________________________________________________________________
       (page generated 2025-05-15 23:01 UTC)