[HN Gopher] NopeCHA: Captcha Solver
___________________________________________________________________
NopeCHA: Captcha Solver
Author : zuhayeer
Score : 78 points
Date : 2022-11-27 22:10 UTC (6 hours ago)
(HTM) web link (chrome.google.com)
(TXT) w3m dump (chrome.google.com)
| layer8 wrote:
| Great, now captchas are going to get even more annoying.
| version_five wrote:
| Re-captcha is about google exercising monopoly power to try and
| force you to use their browser and let them track you. It has
| little to do with finding stopsigns or whatever. It would be cool
| to see that problem addressed, i.e. allow me to use the internet
| normally without a browser and privacy settings that google
| endorses.
|
| Incidentally, in almost all cases, if I'm faced with a recaptcha,
| I just don't do the thing. I have foregone purchases and charity
| donations, and not used products, because organizations care so
| little about their customers that they think making us solve a
| puzzle before we give them money is acceptable.
| lagt_t wrote:
| Isn't recaptcha browser agnostic?
| crakenzak wrote:
| Seems like forgoing charity donations over something like a
| recaptcha is an incredibly bizarre hill to die on.
|
| The main purpose of recaptcha is to prevent bots from abusing
| services, and has much less to do with exercising "monopoly
| power to ... track you".
| listenallyall wrote:
| > The main purpose of recaptcha is to prevent bots from
| abusing services
|
| It's funny when people think they are adding to the
| conversation by contradicting a thoughtful and interesting
| comment (even if it may be a bit conspiracy-theory-ish), by
| simply re-reciting the corporate line.
| version_five wrote:
| I call that sort of reply "coin operated" - there is some
| crude pattern recognition that goes on (dropping any
| subtlety or new information a parent comment may be
| providing) and a sort of pre-recorded viewpoint gets spit
| out. There are certain topics where it's very common
| listenallyall wrote:
| I think it also happens sometimes when the parent comment
| hits close to home... perhaps someone actually worked on
| implementing a feature for a few years and even from the
| inside never figured out the _actual_ purpose of what was
| being built. That cognitive dissonance hits hard when an
| outsider points it out in black and white. "Wait, _I_
| built non-consensual tracking software? But nobody told
| me they would use it for that!!!"
| jtbayly wrote:
| I downvoted because I run several non-profit websites, and
| start without any captchas by default. But forms always
| end up getting spammed, and then I have to add some
| protection. This is literally why captchas exist in the
| first place, as well as why so many sites have them. Google
| came along and offered a convenient, free version, so many
| people started to use it, although I don't. Why Google
| decided to offer it is possibly what you claim, but that
| changes nothing about why the nonprofit makes use of it.
| version_five wrote:
| > free
|
| It's not free. It increases friction, and at least in my
| case, results in abandoned transactions. I'm not well
| versed in the different options for spam protection (or
| the attacks) but I do know that most merchants don't make
| their users solve a puzzle, especially at a critical
| point along the purchase workflow where it is most likely
| to get derailed.
|
| The fact that google is (probably unintentionally)
| particularly appealing to small providers or charities,
| pretending they offer a "free" product, makes it even
| worse.
|
| Edit: not an endorsement, but elsewhere in the discussion
| someone posted a link to cloudflare's captcha solution,
| which they say specifically addresses the privacy and
| annoyingness concerns of Google's captcha. So there are
| options: https://www.cloudflare.com/en-ca/products/turnstile/
| (I'm not actually familiar with this, it may have a downside
| I don't know about)
|
| (Also, disagreeing with something is generally a poor
| reason to downvote. It's much better to have a
| discussion, and I appreciate your comment)
| jtbayly wrote:
| > It's not free.
|
| True enough. That's why I don't use it. Google's solution
| is absolutely awful, and I'm positive you aren't the only
| one abandoning important flows on non-profit websites
| because of it.
| ajsnigrutin wrote:
| Anything that wastes users' time is not really free.
| TylerE wrote:
| Sigh. And so the next battle in the war.
|
| There is no way this doesn't get abused, including, probably by
| the Company making it.
|
| So, I'm dreading recaptcha v4
|
| We may already be passing the captcha event horizon where machines
| actually outperform humans on the damn things.
| meta2023 wrote:
| abused? it's a captcha solver. I'd argue abuse (from the
| perspective of the target website/app) is the primary business
| case.
| wolpoli wrote:
| Computers are getting better than humans at solving these
| challenges. So recaptcha v4 might end up being a micropayment
| system since humans still have more money than bots.
| eli wrote:
| Several decades of experience suggests micropayments ain't
| it.
| [deleted]
| charcircuit wrote:
| >humans still have more money than bots
|
| Simultaneously humans can be less likely to want to pay than
| bots which can skew the bot to human ratio.
| bogomipz wrote:
| >"There is no way this doesn't get abused, including, probably
| by the Company making it."
|
| How would the company making it abuse this? I feel like maybe
| I'm missing something obvious.
| TylerE wrote:
| Sell bulk captcha solving to bad actors.
|
| The extension probably has a hidden limit of 50 solves a day
| or something
| vbezhenar wrote:
| In my opinion next gen captcha should be asking user to prove
| that he's human.
|
| For example ask him to upload his video with his ID. This video
| will be verified by another human operator.
|
| In the end, user will be given some kind of identifier. He
| should present that identifier to anyone asking if he's a
| robot.
|
| Of course that kind of verification will be paid. So you're
| paying $100 to get a verified identifier and then you keep that
| identifier (probably in the form of private key with signed
| public key).
|
| There will be multiple certificate authorities who will issue
| those certificates to people. Rest of software companies will
| trust those authorities.
|
| You need to renew that certificate every year.
|
| If someone spots your certificate being used in a nefarious
| scheme, your certificate will be revoked and you'll need to
| pay a $5000 fine the next time you ask for a new certificate.
|
| If you don't possess a certificate, you're not qualified to be a
| human.
| CasualSuperman wrote:
| And thus poor people became unable to access the internet
| EMIRELADERO wrote:
| But then you're imposing an expensive yearly tax on people to
| use basic services. Very poor people use the internet too!
| bawolff wrote:
| For pay captcha solvers have existed forever (e.g.
| 2captcha.com) and the world hasn't ended yet. I doubt this will
| change that much.
| fratlas wrote:
| Maybe the solution is to look for problems where humans'
| imperfections are identifiable
| TylerE wrote:
| If a machine can identify it, a machine can fake it.
|
| Maybe given a large enough input, but do you want to spend 10
| minutes solving a captcha?
| svnpenn wrote:
| one thing that annoys me is they don't tell you HOW MANY boxes to
| check. So you don't know if you need to be "conservative" or
| "aggressive". So I started just clicking a single box, then if it
| prompts me I will keep adding one until I meet the requirement. I
| think sometimes it's just one or two.
|
| However some shitheads like Discord also won't tell you how many,
| and will also outright fail you if you click too few, forcing you
| to restart the whole multi-test process. So fuck all of it. I
| fully support this extension, they deserve what they get. They
| need to figure out how to make it hard to fake, without making it
| a nightmare for legitimate users.
| vbezhenar wrote:
| I think that you need to behave similarly to other humans. So
| I try to think about which squares an ordinary human who wants
| to get it done as soon as possible would select. Being very
| careful might actually backfire.
| dankwizard wrote:
| But it kind of does - Click ALL of the Lions.
|
| If there is a lion in the box.... Click it.
| Blue111 wrote:
| but what if a car takes 6 squares and two of the squares have
| a minute amount of car in them... do you need to click it?
| johntash wrote:
| Usually yes, at least that's how I interpret it. So far, I
| have not been identified as a bot.
| svnpenn wrote:
| this is obviously wrong, for the reason I already gave. many
| times you can click one or two boxes, even though more
| "correct" boxes might exist. I don't want to click more than
| needed, that's wasted time. Although to be fair my method is
| probably slower overall.
| freitasm wrote:
| From the submitted link we can find the homepage for this
| extension. You will then find that you can use the service over
| an API and a pricing page ($4.99/2K daily recognitions,
| $19.99/20K daily recognitions).
|
| I would say this is useful for spammers and sniper bots.
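|
| To put the two quoted plans side by side, here is a rough
| sketch of the arithmetic. The plan labels "small" and "large"
| are made up for illustration, and the billing period is not
| stated in the thread, so the comparison is per 1,000
| recognitions of daily quota rather than per individual solve.

```python
# Plan figures quoted in the comment above: (price in USD, daily recognition quota).
# The labels "small" and "large" are hypothetical, not the vendor's plan names.
PLANS = {
    "small": (4.99, 2_000),
    "large": (19.99, 20_000),
}


def price_per_thousand_quota(price_usd: float, daily_quota: int) -> float:
    """Return the price paid for each 1,000 recognitions of daily quota."""
    return price_usd / (daily_quota / 1_000)


if __name__ == "__main__":
    for name, (price, quota) in PLANS.items():
        unit_price = price_per_thousand_quota(price, quota)
        print(f"{name}: ${unit_price:.4f} per 1K recognitions of daily quota")
```

| Under these assumptions the larger plan costs well under half
| as much per unit of quota as the smaller one.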
| odo1242 wrote:
| I would argue that ReCAPTCHAs still work, at least to some
| extent. Spamming a form is much easier to do when you don't have
| to spin up an entire virtual browser to fill out the form while
| also paying for the GPU compute necessary to run this ML model.
| Plus, "click farms" for solving captchas have always existed, at
| cents per solve.
|
| Plus, ReCAPTCHAv3 makes this entire attack irrelevant by making
| image classification not a part of the CAPTCHA.
| fastball wrote:
| This extension actually claims to side-step v3 as well.
| TheCycoONE wrote:
| I have seen recaptcha v3 bypassed with seemingly little
| effort by financially motivated spammers. I have also seen
| them spin up large numbers of Gmail accounts for email
| verification. I'm curious what people have tried that
| actually worked.
| fastball wrote:
| Can't speak for anyone else but we recently implemented V3
| with V2 as a fallback entirely to help mitigate DDoS
| attacks. Haven't been hit with another one yet but I have a
| feeling it will be sufficient.
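|
| A "V3 with V2 as a fallback" setup could look roughly like the
| sketch below. It only covers the decision step, applied to an
| already-parsed response from reCAPTCHA's siteverify endpoint
| (fields "success", "score" and "action"). The 0.5 threshold,
| the function name and the return values are assumptions for
| illustration, not the setup described in the comment above.

```python
from typing import Any, Mapping

# Assumed cutoff between "probably human" and "suspicious"; tune against real traffic.
SCORE_THRESHOLD = 0.5


def decide(verify_response: Mapping[str, Any], expected_action: str) -> str:
    """Map a parsed reCAPTCHA v3 verification response to a decision.

    Returns "allow" when the token looks trustworthy, otherwise
    "challenge_v2" to signal that a visible v2 challenge should be shown.
    Both return values are hypothetical names used only in this sketch.
    """
    # A failed verification (bad or expired token) never passes silently.
    if not verify_response.get("success", False):
        return "challenge_v2"
    # A token minted for a different action should not be accepted here.
    if verify_response.get("action") != expected_action:
        return "challenge_v2"
    score = float(verify_response.get("score", 0.0))
    return "allow" if score >= SCORE_THRESHOLD else "challenge_v2"
```

| For example, decide({"success": True, "score": 0.9, "action":
| "login"}, "login") lets the request through, while a low score
| or a mismatched action sends the visitor to the v2 puzzle.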
| MontyCarloHall wrote:
| The original reCAPTCHA served the dual purpose of fighting spam
| and training optical character recognition algorithms. It
| displayed a pair of words, one of which was unambiguously
| resolved by an OCR and the other of which OCRs couldn't easily
| read. The first word was used to disambiguate humans from bots,
| and the second word was used to train the OCR.
|
| Today, CAPTCHAs serve a similar purpose, except they're used to
| train self-driving cars' image recognition AIs. I always try to
| be a little subversive and correctly identify the images that are
| clearly unambiguously classified by the AI, and then purposefully
| screw up identifying the image that the AI struggles with. It
| lets me through the majority of the time, which indicates that my
| bad input made it into their training data.
|
| Unlike the CAPTCHAs of yore, when machine vision simply was not
| advanced enough to solve them, anyone has access to pre-trained
| vision models easily capable of identifying the unambiguously
| resolved buses or crosswalks in the CAPTCHA image. The deterrent
| to spammers is no longer that actual humans need to solve the
| CAPTCHA, but rather that it's too computationally expensive to
| solve them at scale. Today's CAPTCHAs are basically Hashcash
| proof-of-work [0], but with the added benefit to Google et al.
| (and annoyance to users) that they help train computer vision
| models.
|
| [0] https://en.m.wikipedia.org/wiki/Hashcash
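|
| For readers unfamiliar with the comparison, here is a minimal
| sketch of a Hashcash-style proof of work. It is simplified:
| real Hashcash uses SHA-1 and a specific stamp format, while
| this version uses SHA-256 and a made-up "challenge:nonce"
| string. The function names are hypothetical.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # The first non-zero byte contributes its own leading zeros, then we stop.
        bits += 8 - byte.bit_length()
        break
    return bits


def _digest(challenge: str, nonce: int) -> bytes:
    """Hash the challenge together with a candidate nonce."""
    return hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()


def mint(challenge: str, difficulty_bits: int) -> int:
    """Brute-force a nonce whose hash has at least `difficulty_bits` leading zero bits."""
    for nonce in count():
        if leading_zero_bits(_digest(challenge, nonce)) >= difficulty_bits:
            return nonce


def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Check a claimed solution with a single hash."""
    return leading_zero_bits(_digest(challenge, nonce)) >= difficulty_bits
```

| The asymmetry is the point: minting takes about
| 2**difficulty_bits hash attempts on average, while verifying
| takes one. That per-request cost is what deters bulk senders.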
| greesil wrote:
| Very insightful. You forgot to mention "and is solvable by a
| human". Otherwise the captcha would just be proof of work of
| some kind.
| MontyCarloHall wrote:
| Proof-of-work is exactly what Cloudflare's Turnstile CAPTCHA
| alternative is: https://blog.cloudflare.com/turnstile-
| private-captcha-altern...
|
| The only reason we still solve those stupid image recognition
| puzzles is because Google/Waymo and other self-driving car
| companies have managed to trick us into helping them do their
| training work for them.
| porphyra wrote:
| Why do you want to screw up the training data though? You have
| nothing to gain while making life a little harder for everyone.
| MontyCarloHall wrote:
| The cynic in me says because I resent being forced to help
| multi-billion dollar companies crowdsource their AI training.
|
| The techno-optimist in me says because I want to force them
| to improve their underlying models. When their engineers
| notice that their model struggles with weird edge cases that
| I purposefully mislabel (e.g. when prompted to select images
| containing motorcycles, I also pick a mountain bike with fat,
| motorcycle-sized tires), perhaps they will contemplate how to
| rigorously encode the concepts of "motorcycle" and "mountain
| bike" into their model, rather than simply pushing an
| abundance of training data through a black box classifier and
| hoping that by adding more crowdsourced data, it will
| eventually arrive at the right answer.
| jrm4 wrote:
| Not if you believe that the people working on this are going
| too fast and/or have a misguided goal.
|
| I think it's _reasonable_ to believe that real self-driving
| cars are not inevitable, or even if they are, deliberate
| disruption of this process is healthy; e.g. it shouldn't
| rely on something this dumb.
| RockRobotRock wrote:
| Do you really think reCaptcha data only benefits Waymo?
| What about Google Maps detecting stop lights? Or wheelchair
| ramps?
| Sprite_tm wrote:
| Was wondering how these people make money... looks like you can
| buy 'enterprise plans' where you can have them solve captchas en
| masse... Not sure if I agree with whatever people want to use
| that for.
| dendav_rai wrote:
| I cannot for the life of me figure out how this magic works. They
| claim deep learning. If someone has some relevant material,
| please suggest it. Thank you!
| system2 wrote:
| I am actually surprised to see this extension existing and listed
| on the extensions page. I bet google will remove this very soon.
| Unlike adblocks, this is threatening google's security claims.
| judge2020 wrote:
| > Featured
|
| > Follows recommended practices for Chrome extensions. Learn
| more
| system2 wrote:
| Featured until removed. Follows recommended practices at
| first look until checking and finding out it is not Google
| TOS compliant.
| kevmo314 wrote:
| I know a little bit about this "industry". I would be pretty
| surprised if this is actually done by AI. At least if it is, it's
| likely only AI-assisted. If it were truly AI, then they would
| make more money offering their own CAPTCHA service instead of a
| CAPTCHA-breaking service. You can see how many active workers (ie
| humans) are online on their network stats screen:
| https://nopecha.com/statistics_network
|
| Typically, the API is a screen recorder and the CAPTCHA is sent
| to thousands of workers who essentially mini-remote-desktop in
| and solve them for about 80 cents/1k CAPTCHAs. Here are some
| other, similar services: https://0captcha.com/,
| http://bypasscaptcha.com/, https://deathbycaptcha.com/
|
| I'm surprised these players are still around. They've been
| operating for nearly 20 years back when I had discovered them.
|
| The entire industry is actually not completely as black hat as
| you might think. Yes, it's used for spam and botting, but at
| least at the time a lot of people used it for bulk downloading,
| which is how I discovered it. Additionally, it does provide work
| for the poorer parts of the world.
| kijin wrote:
| If I were to enter the CAPTCHA-breaking business today, I'd
| probably use one of these services at first to collect a
| million correct solutions for $800, and then use that dataset
| to train my AI.
|
| Once the AI is good enough, I can buy a bunch of used GPUs from
| former ethereum miners, throw them in a cheap DC somewhere, and
| undercut everyone else! Sounds like a decent side project that
| could yield a bit of passive income. Somebody else has probably
| done it already. Maybe OP is that somebody.
| nerdponx wrote:
| This works fine until Google changes the image set. Of course
| then you can pay another $800, but then your product doesn't
| work until you update.
| jdironman wrote:
| I wonder if stable diffusion / dall-e type offerings could
| procedurally generate images?
| slothsarecool wrote:
| This is what hCaptcha is currently doing, they are
| switching the image category every 24-72 hours. How useful
| is it? Not very. Modern ML models such as MobileNet, ResNet
| or YOLO need only a few hundred images to become accurate
| enough to solve those captchas.
|
| You don't need a few million samples; with 500-700 images per
| category you are more than ready to solve current captchas.
| kijin wrote:
| Yep, the cost of keeping the model up to date would be
| negligible compared to the hosting bill.
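|
| The figures in this subthread can be checked with a few lines
| of arithmetic. This sketch assumes the "about 80 cents/1k
| CAPTCHAs" human-solver rate quoted earlier also applies to
| buying labels, and the function name is made up for
| illustration.

```python
# "about 80 cents/1k CAPTCHAs", the human-solver rate quoted earlier in the thread.
PRICE_PER_1K_SOLVES_USD = 0.80


def labeling_cost_usd(num_images: int, price_per_1k: float = PRICE_PER_1K_SOLVES_USD) -> float:
    """Estimate the cost of buying `num_images` human-solved labels."""
    return num_images / 1_000 * price_per_1k


if __name__ == "__main__":
    # The one-million-solution dataset from kijin's comment.
    print(f"1,000,000 labels: ${labeling_cost_usd(1_000_000):,.2f}")
    # The upper end of the 500-700 images per category from slothsarecool's comment.
    print(f"700 labels:       ${labeling_cost_usd(700):,.2f}")
```

| At that rate the full million-image dataset comes to roughly
| $800, while relabeling a single rotated category costs well
| under a dollar, which is consistent with the upkeep being
| negligible next to hosting.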
| EMIRELADERO wrote:
| This does seem to use AI or at least not use the "human
| workers" method.
|
| Going to the Google SSO page for their signin flow and clicking
| on the blue domain name for their app, the Google auth page
| shows the email of the GCP account that started the auth
| project, which in this case is jaewany@gmail.com
|
| Looking that up on Google shows that it corresponds to Jaewan
| Yun.
|
| Looking him up on GitHub gives you his profile which contains
| some captcha solver extension code for this very website and
| also many TensorFlow-related things.
|
| His personal website[1] also lists the solver under "My
| Products"
|
| [1] https://jaewan-yun.com/
| 005 wrote:
| As someone with experience using services like these: at the
| price point and solve speed they're offering, it's quite clear
| that it is a model. Legacy players using low-paid humans usually
| had solve times of >20 seconds, and model-based solvers are now
| down to under a second.
| slothsarecool wrote:
| Ever since ML has reached the "general public", developing models
| against audio- or vision-based CAPTCHAs has become trivial.
|
| Sure, you have to emulate or simulate the client JS challenges
| but when bots are running browsers in the background you can only
| do so much.
|
| I wonder what the future of captchas, if any, will look like.
| judge2020 wrote:
| It's identity, which is why Google shows "Your computer or
| network may be sending automated queries" message on recaptcha
| if you trigger too many heuristic and IP reputation signals to
| be classified as a bot. That's why, for Google, you get to
| carry around your reputation in the form of your Google
| Account, and for Cloudflare, they have private access tokens[0]
| (which might be the only reason you don't get blocked by every
| CF site on iCloud Private Relay), and otherwise Cloudflare's
| big ambition is "human attestation" via WebAuthn
| credentials[1,2].
|
| 0: https://blog.cloudflare.com/eliminating-captchas-on-
| iphones-...
|
| 1: https://cloudflarechallenge.com/
|
| 2: https://blog.cloudflare.com/introducing-cryptographic-
| attest...
| ajsnigrutin wrote:
| ...which really sucks when you try to use any of those sites
| via tor (no cookies, "bad" IP) or at a place with a shared
| external IP (public access points).
|
| Open google.. captcha... every page has a 5 second cloudflare
| page before opening the page itself.
|
| Bots have the time, they can wait and do other stuff in the
| meantime, but we, humans get bothered by that.
| slothsarecool wrote:
| However, that's not a solution but a patch.
|
| Google accounts give you a good score and tend to deliver
| easy captchas while dealing with Recaptcha; however, for this
| reason, google accounts are being sold and bought constantly.
|
| People have tried similar fight tactics in the past. SMS and
| phone verification have failed because the return on
| investment is far greater than the price barrier it adds to
| get any of those "virtual identities".
|
| iPhones might work but then, for how long? If you guarantee
| that an iPhone won't get captchas, it's a good investment to
| buy many old (or new) ones and sell token access to skip any
| captcha.
|
| Many farms already have thousands of phones scrolling through
| youtube videos to get views, likes, and other stats for
| videos/channels.
|
| The same "logic" applies to yubikeys and similar auth
| hardware; attackers can exploit it similarly.
|
| Companies will tell you that they have abuse policies and
| actively fight abuse/bot farms, but again, they are not
| solving the problem but patching it with tape.
|
| ReCAPTCHA was very useful for a while, it did genuinely stop
| bots reasonably well, but none of the "newer" versions seem
| as efficient as the older versions used to be. Progress
| stopped after V2.
| throwup wrote:
| Very cool, thanks for submitting this. I use Buster[1] but I've
| always been annoyed it doesn't support hCaptcha (used by
| Cloudflare). I'm excited to try this out!
|
| [1]: https://addons.mozilla.org/en-US/firefox/addon/buster-
| captch...
| Acen wrote:
| Cloudflare has its own technology that it's using pretty
| heavily now, Turnstile.
|
| https://www.cloudflare.com/products/turnstile/
|
| Don't get any goofy puzzles which is nice.
___________________________________________________________________
(page generated 2022-11-28 05:00 UTC)