[HN Gopher] Breaking the 4Chan CAPTCHA
       ___________________________________________________________________
        
       Breaking the 4Chan CAPTCHA
        
       Author : hazebooth
       Score  : 476 points
       Date   : 2024-11-29 20:32 UTC (1 days ago)
        
 (HTM) web link (www.nullpt.rs)
 (TXT) w3m dump (www.nullpt.rs)
        
       | anigbrowl wrote:
       | Congratulations, now it will get upgraded and become more work
       | for humans to solve, increasing the burden on every non-malicious
       | user.
        
         | jeroenhd wrote:
         | It's not like bots aren't already bypassing these CAPTCHAs. One
         | author writing a blog post about how they accomplished what
         | spammers and bots have been doing for ages isn't going to
         | change anything.
         | 
         | I just opened 4chan and after the initial Cloudflare bot
         | detection I was told to register an email or wait 15 minutes
         | before I was allowed to even obtain a CAPTCHA. Looks like
         | they're already taking a layered approach to combat bots.
        
           | blackjackfoe wrote:
           | (author here) Interestingly, the email registration/time-
           | limit was added after I started this project, but before I
           | told anyone about it.
        
         | sunaookami wrote:
         | There are already loads of extensions and scripts out there
         | that can solve these captchas with a great success rate.
        
           | anigbrowl wrote:
           | Adding one more will degrade rather than improve that.
           | Notwithstanding all the downvotes, the author's comment (just
           | above) seems to endorse my argument.
           | 
           | I dislike the captcha a lot, but I wish people would invest
           | the same effort in attacking spam that they do in defeating
           | anti-spam techniques. Spam and similar kinds of abuse are the
           | bane of the internet, but most people seem to shrug it off by
           | declaring it a 'hard problem' so they can ignore it.
        
         | credus wrote:
         | It only took about three days until the very first captcha
         | solver was made back in 2021, and the dev's only response was
         | to blanket ban the author's name sitewide until he became
         | popular again for other reasons so they had to remove the
         | filter. They know it's only a matter of time for someone to
         | train a new model no matter how much they update the captcha so
         | they don't really care much about it these days.
        
       | tumsfestival wrote:
       | I can only imagine how much worse they'll make the captcha after
       | stuff like this picks up speed with the users all the while being
       | ineffective against the bots.
        
         | rany_ wrote:
         | I really doubt that they're the first to do this.
        
         | cchance wrote:
         | I mean at some point ... the average visitor is dumber than the
         | AI and your now just blocking dumb people
        
           | OmarShehata wrote:
           | yes, we're creating websites that are gated by IQ tests. This
           | isn't the way
        
             | hsbauauvhabzb wrote:
             | I'd like to believe I have at least an average IQ and I
             | can't pass half the google captchas.
             | 
             | Whether or not a square is part of the motorbike when it's
             | either the rider or a few pixels of the wheel is subjective
             | and fuzzy. Fuck google for not making these questions clear
             | cut enough that answers aren't disputable.
        
           | djbusby wrote:
           | *you're
        
         | OmarShehata wrote:
         | captchas are broken, forever. There is no way to prevent bots
         | without also preventing a bottom tier of human users (visually
         | impaired people, old people, or just impatient people). Like
         | this xkcd [1] comic suggests, we need to just focus on
         | rewarding and punishing specific behavior, regardless of
         | whether the agent is human or not
         | 
         | [1] https://xkcd.com/810/
        
           | echelon wrote:
           | I think a better approach is to make account creation
           | frictionful (eg. charge money, set karma thresholds, require
           | an invite, etc.), score each account, and ban or time out
           | accounts when they break community rules.
           | 
           | But an even better approach would be to go fully P2P and
           | leave the scoring and ranking and filtering at the end nodes,
           | with the possibility of friendly networks of interest group
           | peers assisting with the task. BitTorrent for social media,
           | pgp signed accounts, fully flexible annotation and ingestion.
           | It's also less subject to cabal-based censorship.
        
           | shortrounddev2 wrote:
           | Jokes aside, we don't want any bots at all. Even if they're
           | posting constructive comments, we should interact with
           | humans, not machines
        
             | hsbauauvhabzb wrote:
             | That doesn't mean that webcrawlers have no legitimate value
             | (think: search indexers) or illegitimate value (think:
             | intellectual property theft via data scraping for AI
             | purposes), and bots which communicate, while they have no
             | place, aren't going to go away.
        
             | Philpax wrote:
             | In the interest of provoking discussion: why?
             | 
             | If a bot can meaningfully pass and act as a productive
             | member of the community, what does it matter?
        
               | matheusmoreira wrote:
               | Because some of us go to sites like 4chan in order to
               | learn what people _really_ think. We want to see how they
               | react and what they say when they are protected from
               | consequences by the anonymous nature of the forum. We
               | want the full spectrum of humanity, good and bad.
               | 
               | The opinions of bots are not just irrelevant, they are a
               | form of consensus creation attack. They make it seem like
               | a lot of people have an opinion when the reality might be
               | the opposite. We are not interested in the made up
               | realities that people pay bot operators to create. We
               | want the truth, and the truth comes from real humans
               | expressing their real unfiltered thoughts.
        
               | fragmede wrote:
               | It's nice to want things. The people paying expensive
               | programmers for bot armies to parrot their thoughts are
               | currently paying cheaper humans sitting at a bank of
               | beheaded cellphones to parrot and amplify their thoughts
               | instead. You're being lied to, regardless, the only
               | difference is if it's a shell script to do the lying or a
               | paycheck to a human to do the lying.
               | 
               | Who's driving phone farm?
               | 
               | https://www.some3c.com/blogs/news/unified-control-20-pcs-
               | pho...
        
               | matheusmoreira wrote:
               | I'm aware of the risk. I try to mitigate it by also
               | browsing smaller sites which are hopefully too small to
               | be targeted by people with vested interests. And I know
               | I'm being lied to. That's why I want to see every lie,
               | every extreme. I'm especially interested in witnessing
               | them try to debunk each other's lies. In the chaos, a
               | synthesis is bound to emerge.
               | 
               | Because in the end it's up to us. We're the ones who have
               | to draw the conclusions. At some point we're gonna have
               | to decide whether some idea is right or wrong. This is
               | much harder compared to just blindly taking a side at
               | face value and just believing them and repeating what
               | they say. I suppose it's possible that most people would
               | prefer to be told what to think and what to say. I for
               | one can't live like that. Things gotta make sense before
               | I'll believe in them.
               | 
               | It's important to witness every possible argument and to
               | see every single one of them viciously attacked on the
               | proverbial ideological battleground. Then you can figure
               | out which points remain convincing. Declaring oneself
               | right, unwillingness to engage in debate, attempts to
               | suppress opposing viewpoints, emotional appeals, these
               | are all signs of authoritarianism. This is reason enough
               | to cast everything they say into doubt. Good ideas don't
               | need to be forced in this manner in order to convince.
        
           | webstrand wrote:
           | PoW like hashcash (not a cryptocurrency thing) might be a
           | better solution. Users could even delegate solving the PoW
           | puzzles to a 3rd party for low power devices like phones. But
           | it imposes a cost on spammers that's inescapable.
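           | 
           | A minimal sketch of the hashcash-style idea, not the real
           | Hashcash stamp format: the server issues a challenge string,
           | the client searches for a nonce whose hash has a given number
           | of leading zero bits, and the server checks it with a single
           | hash. SHA-256, the "challenge:nonce" encoding, and the names
           | mint, verify and leading_zero_bits are assumptions made for
           | illustration (the original Hashcash uses SHA-1 and its own
           | header format).

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def _digest(challenge: str, nonce: int) -> bytes:
    # Assumed encoding for this sketch: "<challenge>:<nonce>" hashed with SHA-256.
    return hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()


def mint(challenge: str, difficulty: int) -> int:
    """Client side: brute-force the smallest nonce meeting the difficulty."""
    for nonce in count():
        if leading_zero_bits(_digest(challenge, nonce)) >= difficulty:
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check the work."""
    return leading_zero_bits(_digest(challenge, nonce)) >= difficulty
```

           | Each extra bit of difficulty roughly doubles the expected work
           | for the poster, while verification stays at one hash.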
        
             | jeroenhd wrote:
             | That assumes spammers are using their own hardware to post.
             | If they're using a botnet, they don't care about CPU
             | cycles. Botnets would probably become even more profitable
             | in that model.
        
       | lofenfew wrote:
       | It might be worth noting that these, including the harder version
       | the OP encountered, are not the hardest captchas that 4chan can
       | serve. There is a still harder version which is sent to less
       | trustworthy IPs. I imagine it would still be tractably solved
       | with computer vision. This in part misses the point though, since
       | 4chan has been continuously altering their captcha since it
       | released, making it difficult to create a permanent solution that
       | won't be broken down the road.
        
         | blackjackfoe wrote:
         | Yeah, I encountered those as well in my data gathering. I threw
         | them out from the training set, but I kept them for possible
         | future experimentation.
        
           | Shank wrote:
           | Can you upload a few of these samples somewhere?
        
             | blackjackfoe wrote:
             | I need to manipulate the data a bit, because right now it's
             | just raw, unaligned foreground/background images with
             | solutions. I need to do the alignment and save them as
             | images rather than JSON files. I'll do that when I have the
             | time.
        
         | chatmasta wrote:
         | Datacenter IPs can't even post at all, never mind needing to
         | solve a CAPTCHA. That's why the accusations of "VPN shill" are
         | usually wrong, as is the assumption of anonymity - 4chan is in
         | fact one of the least anonymous sites on the internet. The
         | optional username feature gives it a veneer of anonymity, but
         | the strict IP requirements ensure almost every post is
         | attributable to a residential internet connection, and reliably
         | associable with other posts from that same connection.
        
           | blackjackfoe wrote:
           | Some datacenter IPs can post fine, mostly just not those
           | belonging to any large hosting company. I would mention a
           | list of ones I know aren't blocked, but, well, that might get
           | them blocked.
        
             | chatmasta wrote:
             | That's surprising to me. I assumed they were using some
             | service (like Cloudflare) with an updated list of non-
             | residential IP addresses.
             | 
             | I've only ever tried to post through Cloudflare WARP (or
             | Apple Private Relay, which is also Cloudflare but different
             | exit IP range). Once I realized that didn't work, I thought
             | maybe it wasn't worth posting at all :) I don't like the
             | idea of my ISP having any suspicion I posted to 4Chan (even
             | if it's technically https yadda yadda...)
        
           | gruez wrote:
           | What about users behind CGNAT, like mobile users?
        
             | chatmasta wrote:
             | That's attributable with the right warrant and correlation
             | with other data available to the ISP.
             | 
             | CGNAT is not an anonymity mechanism - at best it may be a
             | very crude one, but the carriers will make extra effort to
             | remove that anonymity through logging, retention, and
             | segmentation.
        
             | BlueTemplar wrote:
             | "Attributable" means by law enforcement, and mobile
             | carriers, like all ISPs, must keep logs. In this case, for
             | who had which IP address when.
             | 
             | (Otherwise, it's akin to the usual confusion between
             | anonymity and pseudonymity.)
        
               | chatmasta wrote:
               | That's true, but to be fair my original comment also said
               | posts would be reliably associable with other posts from
               | the same IP. With CGNAT, that association will be
               | slightly less reliable, but not meaningfully so. The
               | segment of the population who posts on 4chan is so small
               | that there is negligible chance of two 4chan users
               | sharing an exit IP and time window. Even with non-
               | overlapping time windows, the population will be low
               | enough for stylometry (and other factors) to remove any
               | remaining ambiguity.
        
             | Hamuko wrote:
             | Some mobile users can post but I think they've gone so far
             | as to ban entire ISP mobile IP ranges to prevent people
             | from constantly rolling new IPs on their phone.
        
               | jorvi wrote:
               | Nice callback to Moot banning an entire Australian region
               | (Queensland or Victoria, if memory serves) because
               | Aussies did an outsized share of shitposting, and of
               | Aussies those particular ones were the worst.
        
             | jabroni_salad wrote:
             | I'm pretty sure all of t-mobile is rangebanned.
             | Phoneposters are usually told to buy a pass.
        
               | numpad0 wrote:
               | That sounds like old 2ch.net. Was that plan from Hiroyuki, by
               | chance? IIRC they entrusted the key to kingdom to that
               | guy, or am I mistaken...
        
               | irusensei wrote:
               | Hiro owns 4chan. I remember something about Moot giving
               | him the website for free.
        
           | jterrys wrote:
           | 4chan tries to make its users anonymous to each other.
           | There's nothing in there about you being anonymous to their
           | servers.
        
           | codexon wrote:
           | You can get residential ips nowadays. They are much more
           | expensive for an individual, but for a business or nation-
           | state, it is a feasible option.
        
       | antirez wrote:
       | Appropriate response by 4Chan to this: simplify the human work
       | given that anyway it's simple to solve via NNs. We are at a point
       | where designing very hard captchas has high probabilities to
       | increase the human annoyance without decreasing the machine
       | solvability.
        
         | hackernewds wrote:
         | Just use Worldcoin retina scans next
        
         | codetrotter wrote:
         | > simplify the human work given that anyway it's simple to
         | solve via NNs. We are at a point where designing very hard
         | captchas has high probabilities to increase the human annoyance
         | without decreasing the machine solvability
         | 
         | Or disallow free users to post at all, and require everyone to
         | buy the 4chan Pass for $20 USD per year if they want to post.
         | 
         | https://4chan.org/pass
         | 
         | This is already available to not have CAPTCHA. So if CAPTCHA is
         | totally ineffective, it follows that they should do away with
         | CAPTCHA and free users being able to post at all and everyone
         | should buy the 4chan Pass if they want to post.
        
           | ranger_danger wrote:
           | Agreed, charging for accounts is the only halfway viable
           | solution I have seen any service use that gives a sizable
           | downtick in the sheer number of bots/spam.
           | 
           | Of course it's not perfect, and it will still happen, but I
           | have yet to hear any better solutions. Please prove me wrong
           | though!
        
             | jcpham2 wrote:
             | This is known as a Sybil [1] attack and it lays the
             | groundwork for stuff like Adam Back's hashcash [2] protocol
             | and it's basically why things like proof of work [3] have a
             | monetary value today.
             | 
             | Very chicken and egg this entire field- defending against
             | the spammers while simultaneously operating a "free"
             | system. How to do it without making it prohibitively
             | expensive to join the system...
             | 
             | Any free system will be abused yada yada yada
             | 
             | [1] https://en.wikipedia.org/wiki/Sybil_attack
             | 
             | [2] https://en.wikipedia.org/wiki/Hashcash
             | 
             | [3] https://en.wikipedia.org/wiki/Proof_of_work
        
           | fullspectrumdev wrote:
           | This kills the board. Users will go elsewhere, fuck all
           | people pay for pass.
        
           | poincaredisk wrote:
           | At this point I have to wait 90 seconds before making every
           | post. (maybe because I don't persist cookies). I posted very
           | rarely, but now I just stopped - I get it when someone shows
           | me the door.
        
           | matheusmoreira wrote:
           | That would work. It would also kill the site.
        
           | efilife wrote:
           | What? So you use 4chan? It would completely kill what makes
           | this website special
        
         | hsbauauvhabzb wrote:
         | What is NN?
        
           | layer8 wrote:
           | https://en.wikipedia.org/wiki/Neural_network_(machine_learni.
           | ..
        
           | numpad0 wrote:
           | "AI" but pre-COVID
        
             | marcosdumay wrote:
             | Oh my!
             | 
             | Is the oversimplification from "deep neural network" into
             | "AI" caused by the prevalence of brain-fog due to long
             | COVID?
        
         | YeahThisIsMe wrote:
         | We've been stuck at that point for at least 5, if not 10,
         | years.
        
         | encom wrote:
         | 4chan doesn't care about human annoyance. They just started
         | doing a 15 minute post delay, which is infuriating. I had to
         | whitelist 4chan in Cookie AutoDelete.
        
           | poincaredisk wrote:
           | Hi fellow cookie autodeleter, I experienced the same thing,
           | but I just decided to stop posting. Whitelisting felt too
           | much like giving in to terrorists. I'm considering just not
           | going there in the future. Maybe after all this time I will
           | finally be free.
        
             | encom wrote:
             | See you tomorrow, anon.
        
             | Arnavion wrote:
             | Same. In my case I always use a separate incognito mode
             | browser for posting and a regular locked-down browser with
             | JS disabled etc. So I'd have to either give in and leave
             | the incognito mode browser running in the background while
             | I browse on the main browser, or give in and stop blocking
             | as aggressively on the main browser, and I chose to do
             | neither and just stop posting.
             | 
             | Given the schizos that are still present and drowning out
             | the conversation in half the threads I read, there wouldn't
             | be a point to posting anyway.
        
           | matheusmoreira wrote:
           | Just stop posting there. The whole point of it is to post
           | anonymously in a high traffic forum. The rate limiting timers
           | have reduced traffic to the point many boards feel dead, and
           | their solution to that problem is to sell accounts.
        
         | brodo wrote:
         | I am totally in favor of increasing the annoyance of 4chan
         | users.
        
         | gosub100 wrote:
         | "Drag each symbol to the group that is most likely to be
         | offended by it."
        
       | dmitrygr wrote:
       | > The official TensorFlow-to-TFJS model converter doesn't work on
       | Python 3.12. This doesn't seem to really be documented, and the
       | error messages thrown when you try to use it on Python 3.12 are
       | non-obvious. I tried an older version of Python (3.10) on a
       | hunch, using PyEnv, and it worked like a charm.
       | 
       | Amazing. And then people wonder why "just use python 2" is still
       | a thing.
        
         | orhmeh09 wrote:
         | Do you have examples of "just use python 2" still being a thing
         | in 2024?
        
           | dmitrygr wrote:
           | Yeah, whenever I need to write a quick script and have no
           | time to suffer "$library needs python 3.x, where x must be >
           | $value and <= $value2, and not a prime except when that ends
           | in a 3, except on leap days"
           | 
           | 2 is stable and does not change from under you. Which is what
           | you want in a programming language
        
             | Zopieux wrote:
             | In my recent experience, this dependency hell is quite
             | specific to scientific / ML python.
             | 
             | The general state of ML code is abysmal, as it attracts a
             | lot of inexperienced developers, and Python's duck/relaxed
             | typing spirit makes it easy to write incomprehensible code
             | with megabytes of unnecessary or bloated dependencies.
             | 
             | It's not bad per se, the amount of innovation is
             | impressive, but a lot of it is a castle of cards, from low
             | level libraries to end-user software.
        
             | sadeshmukh wrote:
             | Python 3.10 seems to work for almost everything, and Python
             | 2 most certainly doesn't. In fact, even latest works for
             | almost everything - there's an alternative to 99.9% of
             | Python 2 stuff in Python 3.
        
       | ChrisMarshallNY wrote:
       | That's like spending a few hours, learning to take the lid off
       | your septic tank.
        
         | blackjackfoe wrote:
         | Little bit, but at least you learned something :)
        
         | salawat wrote:
         | ...Don't underestimate the things to be learned studying a
         | septic system.
        
         | gherkinnn wrote:
         | Oddly enough, I find most of 4chan less brainrot inducing than
         | Twitter, even pre-Musk.
        
           | thrance wrote:
           | I have bad news for you, then...
        
           | tovej wrote:
           | There's no smart algorithm for sorting posts, and there's a
           | limited number of active threads, so it's not rage baiting in
           | quite the same way. Only active threads stay alive though, so
           | it has the exact same issue as twitter and other social
           | media, only engaging content is served to users, and the most
           | engaging things are rage bait, conspiracy theories, and porn.
           | Things that get someone riled up enough to respond.
        
           | JasserInicide wrote:
           | It's still brainrot, it's just on the opposite end of the
           | political spectrum.
        
             | irusensei wrote:
             | Back when Llama was leaked on /g/, 4chan's /g/lmg/ was the
             | best place to be up to date with local models. It still
             | might be but not so much.
             | 
             | People think 4chan is just /pol/ when in fact more boards
             | exist and their users don't really appreciate when /pol/
             | leaks into their threads.
        
           | meowface wrote:
           | I am a liberal and also genuinely find many 4chan boards less
           | politically awful than current Twitter most of the time.
           | 
           | The chronological sorting at least offers some diversity of
           | opinion. The first 50 replies to a 4chan thread about Trump
           | (in the right board) will usually contain many, maybe even
           | mostly, anti-Trump posts. On Twitter you usually need to
           | scroll through the sea of blue checkmark replies for a while
           | to find even one anti-Trump post.
           | 
           | Some 4chan boards are majority neo-Nazis who want all
           | minorities expelled or murdered. But stumble across a
           | particular Twitter thread and it's the same thing but with
           | even more ideological uniformity within the thread, and with
           | 4000 neo-Nazis in the thread instead of 60.
           | 
           | That said, both sites definitely are not great to use if you
           | aren't very right-wing.
        
       | morkalork wrote:
       | Following the links to the captcha solving service you can read
       | profiles of the humans doing the work where it's pitched as more
       | ethical than them working in hazardous factories!
        
       | cherryteastain wrote:
       | The part about bad Keras<->Tensorflow.js interop is classic
       | Tensorflow. Using TF always felt like using a bunch of vaguely
       | related tools put under the same umbrella rather than an
       | integrated, streamlined product.
       | 
       | Actually, I'll extend that to saying every open source Google
       | library/tool feels like that.
        
         | Retr0id wrote:
         | something something Conway's law
        
         | alecco wrote:
         | related (15 days ago)
         | 
         | https://news.ycombinator.com/item?id=42130881 on Francois
         | Chollet is leaving Google
         | 
         | > "Why did you decide to merge Keras into TensorFlow in 2019":
         | I didn't! The decision was made in 2018 by the TF leads -- I
         | was a L5 IC at the time and that was an L8 decision.
        
       | cchance wrote:
       | Jesus looking at both example captchas... as a human... i have no
       | fucking clue the answer lol
        
         | paulpauper wrote:
         | And now we can look forward to even harder ones now that those
         | have been broken. Soon the web will be unusable to everyone but
         | robots
        
         | anigbrowl wrote:
         | You get used to them, there are various heuristics built in
         | that make them easier than they at first appear.
        
           | blackjackfoe wrote:
           | I initially wrote the alignment-only script (in the source
           | repo as `user-scripts/4chan-captcha-aligner.ts`) before the
           | rest of the project because the person who was collecting the
           | data manually for me couldn't wrap their head around the
           | slider-style CAPTCHAs. There's definitely a learning curve.
        
       | makifoxgirl wrote:
       | This project also solves the 4chan captcha
       | https://github.com/moffatman/chan
        
       | chad1n wrote:
       | I've built 3 iterations of captcha solvers for that crappy
       | website based on https://github.com/drunohazarb/4chan-captcha-
       | solver/issues/1 . The only thing I've learned along the way is
       | that it's mostly pointless outside of a "learning" exercise,
       | since they'll change the captcha (in terms of letter count or the
       | entropy background). Initially, it was 4 characters with pretty
       | obvious background, then it turned to 5, then it was both 4 and 5
       | and the current iteration which is also either 4 or 5, but with a
       | lot of entropy surrounding the characters.
        
         | bryan0 wrote:
         | In the article it mentions they changed the number of
         | characters in the captcha after he trained the model, and the
         | model could still solve it
        
           | oefrha wrote:
           | Changing the number of characters barely registers as a
           | change. They merely need to use a variety of fonts (according
           | to the post right now there are a grand total of 15 possible
           | glyphs which is tiny) and it would vastly increase the
           | difficulty of generating the training set, and probably
           | affect model accuracy by a lot. Not to mention more complex
           | backgrounds. What's seen here is an ancient and relatively
           | simple form of captcha.
        
         | blackjackfoe wrote:
         | This project was really my first decent introduction to
         | computer vision and machine learning (along with that of those
         | who helped me in various ways; none of them desired to be
         | credited here other than the guy who collected some of the data
         | for me.)
         | 
         | It was definitely a successful learning exercise, and it's made
         | me more confident tackling some other problems I've had in mind
         | for a while.
        
           | normie3000 wrote:
           | How did this project help you to learn computer vision? I'd
           | also like to write a basic captcha solver as an intro, but
           | superficially this project just looks like a dump of
           | generated code.
        
             | blackjackfoe wrote:
             | What do you mean by "generated code"? All of the code in
             | the linked GitHub repo was written by me, with the
             | assistance of a couple friends who helped here and there,
             | but didn't request to be credited.
             | 
             | I learned a lot because I had to do a ton of research and
             | experimentation (fancy word for trial-and-error) to write
             | the code and have it work as I expected.
        
               | normie3000 wrote:
               | I think there's been a misunderstanding. I didn't
               | understand you were the author of the linked article, and
               | read the following exchange to mean you'd found the code
               | at https://github.com/drunohazarb/4chan-captcha-solver to
               | be a helpful introduction:
               | 
               | > > I've built 3 iterations of captcha solvers for that
               | crappy website based on
               | https://github.com/drunohazarb/4chan-captcha-solver/issues/1
               | 
               | > This project was really my first decent introduction to
               | computer vision and machine learning
               | 
               | I see now that your code is linked from the article, and
               | looks really informative - thanks for sharing!
        
             | throwaway314155 wrote:
             | Not OP, but maybe consider reading the fucking article
             | before throwing out rude insinuations?
        
               | normie3000 wrote:
               | I'm not sure this is helpful - please see my other reply.
               | 
               | From https://news.ycombinator.com/newsguidelines.html
               | 
               | > Please respond to the strongest plausible
               | interpretation of what someone says, not a weaker one
               | that's easier to criticize. Assume good faith.
        
           | spookie wrote:
           | To help you out if you're interested:
           | 
           | - smearing with a gaussian along one axis, and with another
           | along the other axis, can really help with segmenting chars
           | and finding lines of text in OCR
           | 
           | - You can unshear chars using the Radon or Hough transform as
           | a basis to understand the angle
           | 
           | Went through MNIST a few weeks ago and I agree it's
           | interesting!
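
One possible reading of the first tip above, as a minimal Python sketch: collapse the image vertically (the limiting case of smearing along the y axis), smooth the resulting column profile with a small Gaussian along x, and cut wherever the smoothed profile drops below a threshold. The function names, `sigma`, and `threshold` are illustrative assumptions, not anything taken from the article's code.

```python
import math

def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel covering about +/- 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(values, sigma):
    """Convolve a 1-D profile with a Gaussian; samples outside the profile count as 0."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    out = []
    for i in range(len(values)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < len(values):
                acc += w * values[j]
        out.append(acc)
    return out

def segment_columns(image, sigma=1.0, threshold=0.5):
    """Split a binary image (rows of 0/1, 1 = ink) into (start, end) column spans.

    The end index is exclusive. sigma and threshold are assumed values that
    would need tuning on real captcha images.
    """
    width = len(image[0])
    # Column profile: how much ink each column contains.
    profile = [sum(row[x] for row in image) for x in range(width)]
    smoothed = smooth(profile, sigma)
    spans, start = [], None
    for x, value in enumerate(smoothed):
        if value > threshold and start is None:
            start = x                      # entering an inked region
        elif value <= threshold and start is not None:
            spans.append((start, x))       # leaving an inked region
            start = None
    if start is not None:
        spans.append((start, width))
    return spans
```

Calling `segment_columns(image)` on a list of 0/1 rows returns one span per blob of ink. The smoothing widens each span slightly, which helps keep thin or broken strokes attached to their character. Swapping rows and columns gives the same treatment for finding lines of text.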
        
             | blackjackfoe wrote:
             | I am always interested! Thank you for the tips, I'll
             | definitely research these.
        
             | sorenjan wrote:
             | Shearing is a linear operation that should be trivial for a
             | NN to learn. Have you found that unshearing is actually
             | useful? Was it to feed the image to an existing OCR
             | program?
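
To make the "shearing is linear" point concrete: a horizontal shear maps (x, y) to (x + k*y, y), and applying the same map with -k undoes it. The sketch below also estimates k by brute force, trying a handful of candidates and keeping the one whose inverse shear best stacks ink into vertical columns. This projection-profile search is a simple stand-in chosen for illustration; it is not the Radon or Hough transform mentioned earlier in the thread, and the scoring function and candidate list are assumptions.

```python
def shear_points(points, k):
    """Apply the horizontal shear x' = x + k*y, y' = y to a list of (x, y) points."""
    return [(x + k * y, y) for x, y in points]

def column_histogram_score(points):
    """Sum of squared per-column counts: larger when ink lines up vertically."""
    counts = {}
    for x, _ in points:
        col = round(x)
        counts[col] = counts.get(col, 0) + 1
    return sum(c * c for c in counts.values())

def estimate_shear(points, candidates):
    """Return the candidate k whose inverse shear gives the sharpest column profile."""
    return max(
        candidates,
        key=lambda k: column_histogram_score(shear_points(points, -k)),
    )
```

Usage would look like `k = estimate_shear(ink_pixels, [i / 10 for i in range(-10, 11)])` followed by `shear_points(ink_pixels, -k)`, where `ink_pixels` is a hypothetical list of (x, y) coordinates of dark pixels.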
        
       | fresh_broccoli wrote:
       | I wasn't a very active 4chan poster to begin with, but when they
       | introduced this awful CAPTCHA, and later the 300s countdown
       | before making the _first_ post, I completely lost interest in
       | using the website.
       | 
       | Anonymous boards were supposed to be low-friction, but now 4chan
       | is one of the most user-hostile social media platforms around. It
       | takes a special kind of dedication to post there, which I
       | seriously doubt helps the quality of the site.
        
         | blackjackfoe wrote:
         | Do a Web search for "4Chan CAPTCHA" sometime. All the top
         | results will likely be people complaining about how terrible it
         | is. You're certainly not alone.
         | 
         | The worst part about the countdown: if you wait too long to
         | make a post after waiting the 10 minutes (eg: you get
         | distracted,) it will expire, and you have to wait another 10
         | minutes.
        
         | prettywoman wrote:
         | > 300s countdown
         | 
         | I don't get why they added that nasty "feature" to the post
         | form; it really discourages you from posting (maybe it's because
         | they want to sell you their 4chan pass). I don't understand why
         | 4chan is still active.
        
           | hombre_fatal wrote:
           | Presumably, anyone who regularly uses 4chan would register.
           | Once you register and click the login link in your email, you
           | just get the easy Cloudflare captcha and no countdown.
           | 
           | The horrible captcha + 300s countdown is for completely
           | unauthed users. Most sites don't even allow unauthed users to
           | post at all.
        
           | Hamuko wrote:
           | If you don't get it, you probably don't spend too much time
           | on 4chan.
           | 
           | There is A LOT of ban evasion on 4chan. If you have a dynamic
           | IP address from your ISP, you just spam/derail threads with
           | personal crusades/whatever until you get banned, reset your
           | router and repeat.
           | 
           | This countdown increases the cost of ban evasion, since you
           | can't get right back in to continue. Everyone on your
           | targeted board/thread now gets at least a 15-minute respite.
           | 
           | They've also had to blacklist entire ISPs from making any
           | posts because some people are constantly ban evading on them.
           | Especially mobile ISPs, where there's basically an unlimited
           | amount of fresh IPv6 addresses available.
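
The mechanism described above can be sketched as a small gate keyed on whatever identifies a client (an IP address here). This is a hypothetical illustration only, not 4chan's implementation: the class name, the 300-second wait, and the 600-second window after which the countdown expires are all assumed values, loosely based on figures mentioned in this thread.

```python
class FirstPostGate:
    """Hypothetical first-post countdown: a new client must wait before posting."""

    def __init__(self, wait_seconds=300, window_seconds=600):
        self.wait_seconds = wait_seconds        # countdown before the first post
        self.window_seconds = window_seconds    # how long the "ready" state lasts
        self._first_seen = {}                   # client_id -> time the countdown began

    def check(self, client_id, now):
        """Return "ok" if the client may post at time `now`, otherwise "wait"."""
        first = self._first_seen.get(client_id)
        expired = (
            first is not None
            and now - first > self.wait_seconds + self.window_seconds
        )
        if first is None or expired:
            # Unknown client, or the countdown lapsed: start over.
            self._first_seen[client_id] = now
            return "wait"
        if now - first < self.wait_seconds:
            return "wait"
        return "ok"
```

Under this model a banned poster who comes back on a fresh address is an unknown `client_id`, so every evasion attempt costs at least `wait_seconds` before the next post.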
        
         | alekratz wrote:
         | one of the biggest problems that 4chan has to combat is spam.
         | unfortunately, at 4chan's scale, hcaptcha and recaptcha are not
         | free. 4chan is not exactly a font of money, either. the only
         | reason they turned to this awful homebrew captcha was because
         | recaptcha stopped being free. is there any better way to do it
         | with a single developer for a website that serves millions of
         | people a day?
        
           | avar wrote:
           | > is there any better way to do it with a single developer for
           | > a website that serves millions of people a day?
           | 
           | No, the other reason they're using this is to make it so
           | annoying that you'll spend $20/yr to buy a 4chan pass to
           | bypass it.
           | 
           | If you're not making your free website annoying to drive
           | revenue, there are obvious ways to make it less annoying.
           | 
           | E.g. keep the annoying captcha, but don't show one again for
           | the lifetime of a cookie, validate users who can make a money
           | transfer of $0.01 etc.
        
             | Anon1096 wrote:
             | > keep the annoying captcha, but don't show one again for
             | the lifetime of a cookie
             | 
             | This is already being done, there's a cookie and heuristics
             | in place that will give you an easier captcha or
             | occasionally skip it entirely. But 4chan really does have a
             | couple (and I truly mean a small amount of super super
             | dedicated users) of bad actors who constantly spam and try
             | to work around any roadblocks given to annoy the rest of
             | the userbase. You cannot give them a reliable way to spam
             | no matter what. That's why there's now many country and
             | region blocks in addition to your standard VPN/DC IP range
             | blocks. Plus the Cloudflare check added a couple years ago.
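
A "don't show the captcha again for the lifetime of a cookie" scheme is commonly built on a signed, expiring token. The sketch below uses Python's standard `hmac` module. The token layout, the `SECRET` placeholder, and the function names are assumptions made for illustration; nothing here describes how 4chan's cookie or heuristics actually work.

```python
import hashlib
import hmac

SECRET = b"replace-with-a-real-secret"  # hypothetical server-side key

def issue_token(client_id, now, lifetime_seconds, secret=SECRET):
    """Return "client|expiry|tag", where tag is an HMAC over the first two fields."""
    expires = int(now) + lifetime_seconds
    payload = f"{client_id}|{expires}"
    tag = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{tag}"

def token_is_valid(token, client_id, now, secret=SECRET):
    """Accept the token only if it is untampered, unexpired, and for this client."""
    try:
        token_client, expires_text, tag = token.rsplit("|", 2)
        expires = int(expires_text)
    except ValueError:
        return False  # wrong number of fields or a non-numeric expiry
    payload = f"{token_client}|{expires_text}"
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many characters matched.
    untampered = hmac.compare_digest(tag.encode(), expected.encode())
    return untampered and token_client == client_id and now < expires
```

A server would call `issue_token` after a solved captcha and `token_is_valid` on later posts. As the comment above points out, a token like this cannot be the only defence, because a dedicated spammer can solve one captcha and then reuse the token until it expires.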
        
               | bsagdiyev wrote:
               | Is the anontalk guy still up to his shenanigans? It's
               | admittedly been a very long time since I've used 4chan.
        
             | alekratz wrote:
             | >No, the other reason they're using this is to make it so
             | annoying that you'll spend $20/yr to buy a 4chan pass to
             | bypass it.
             | 
             | I think this is a really cynical outlook, especially for a
             | website that is not run as a modern tech-centric company.
             | 4chan's roots are in that of the Old Internet, where it is
             | a creative and messy and interesting place to be. why would
             | they be banking solely on using a terrible captcha as a
             | method to drive user subscriptions, when they have the
             | option to run circus-tent ads? if making money was their
             | sole purpose, why would they not kick the problematic and
             | porn boards to the curb and ban the use of slurs to make
             | room for more friendly advertisers? there are so many other
             | avenues to increase profitability that most websites have
             | taken which 4chan has staunchly refused to follow. why
             | would they choose only the 4chan pass and ads as their only
             | opportunity at making money?
        
               | iterance wrote:
               | Doing so would destroy the culture of 4chan.
               | 
               | Companies centered around communities don't generally
               | have leeway to shape their communities into a profitable
               | form by directly altering the fabric of the community.
               | Time and again it has been shown that forcing changes to
               | the identity of a space leads to communities' rapid
               | demise. In rare circumstances and with a skilled hand a
               | community can be guided here and there in even some
               | significant ways, but 4chan probably does not have that
               | option: they'd need a massive shift to pull off what you
               | describe.
               | 
               | Instead profit must generally be built around what is
               | there. But whether or not such communities exist to make
               | profit, they surely must be _profitable_, or they will
               | not survive. They must, some time or another, be free of
               | deficit. This is not a matter of capitalist greed for
               | most communities, but an attempt to find a path towards
               | stability.
        
               | TZubiri wrote:
               | Cynical yes, incorrect no.
               | 
               | The link between spam protection and payment is well
               | documented and as old as the internet.
               | 
               | Consider that bitcoin and PoW originated as a currency to
               | stop email spam.
               | 
               | I do agree that the incentive is probably not to make
               | money, but to deter spam. That said, given how many times
               | the company has been sold, I wouldn't disregard that
               | theory.
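
For readers unfamiliar with proof-of-work as an anti-spam measure, here is a minimal hashcash-style sketch: the sender must find a nonce whose hash has a given number of leading zero bits, which is expensive to produce and cheap to verify. The challenge format, the use of SHA-256, and the difficulty value are assumptions for illustration rather than a faithful implementation of Hashcash or Bitcoin.

```python
import hashlib
from itertools import count

def leading_zero_bits(digest):
    """Count the zero bits at the start of a bytes digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()   # zeros before the first set bit
        break
    return bits

def _digest(challenge, nonce):
    return hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()

def solve(challenge, difficulty_bits):
    """Search nonces until the hash meets the difficulty; cost doubles per extra bit."""
    for nonce in count():
        if leading_zero_bits(_digest(challenge, nonce)) >= difficulty_bits:
            return nonce

def verify(challenge, nonce, difficulty_bits):
    """Checking a claimed solution takes a single hash."""
    return leading_zero_bits(_digest(challenge, nonce)) >= difficulty_bits
```

The asymmetry is the point: `solve("some-challenge", 20)` needs about a million hashes on average, while `verify` needs one, so each message carries a small computational "postage" that adds up quickly for bulk senders.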
        
           | joe-collins wrote:
           | Not the rampant racism or sexism or simple misanthropy or
           | outright calls to violence or overflowing hostility.
           | 
           | It's the spam that tops the problem list.
        
             | ChadNauseam wrote:
             | GP said "one of the biggest problems", not "the biggest
             | problem"
        
             | Etherlord87 wrote:
             | It's a bit embarrassing I even have to explain this, but
             | yes, because racism or sexism are very important parts of
             | 4chan's appeal: it's a place with freedom of speech. Let's
             | be real the standards of discussion are low, but people can
             | discuss stuff freely, which they wouldn't be able to if
             | everything was buried under some GPT generated spam.
        
               | codexon wrote:
               | A lot of people think 4chan is one of the last bastions
               | of free speech on the internet because they see a lot of
               | racism that would normally be banned anywhere else.
               | 
               | But if you post something against the alt-right that pisses
               | them off too much and gets a lot of replies, it'll be
               | deleted within minutes, or you'll even get banned for being
               | "off topic".
               | 
               | 4chan is not free speech, it is just a haven for the alt-
               | right.
        
               | TZubiri wrote:
               | They do have rules and the site is quite moderated.
               | 
               | I do think though that any such site or platform will
               | have the issue of judges inflecting their bias in their
               | application of the rules.
               | 
               | So I wouldn't say that it is a unique phenomenon.
               | 
               | That said, of course there is a semantic as well as
               | technical identity to 4chan. And they are quite
               | connected, rather than isolated.
               | 
               | 4chan, apart from its lax rules on what we now call hate
               | speech, has developed a community where insults are now
               | part of its culture. The fact that the site is anonymous
               | greatly influences that animosity.
               | 
               | I like to think of 4chan not as a place where horrible
               | people go, but where people go to be horrible. Of course
               | you have the dedicated users, neets or schizos or
               | chronically online, but again that's a property of every
               | site, and not necessarily a majority.
               | 
               | So if you read /pol/ or /b/ like articles of an
               | organization with an editorial line, sure you will see
               | nazis and a deranged group of people.
               | 
               | If you however see it like bathroom wall writings, you
               | will see a bit of everyone.
        
               | codexon wrote:
               | There were no rules broken.
               | 
               | The political bias of the moderators on the website has
               | been documented by others.
               | 
               | https://www.vice.com/en/article/the-man-who-helped-
               | turn-4cha...
        
             | sky2224 wrote:
             | The thing is, addressing the spam and also allowing users
             | to have a low friction experience would be the first step
             | to addressing the concerns you mentioned (without
             | compromising the purpose of the site: anonymous and totally
             | free speech).
             | 
             | There aren't many places for the people that share the
             | views you mentioned to go other than sites like 4chan, so
             | even though there's an awful captcha, they're going to be
             | quite dedicated as they don't have many mainstream options
             | elsewhere.
             | 
             | I believe if users were able to have a frictionless
             | experience, then it'd reduce the chances of someone
             | throwing their hands up in the air and saying, "this isn't
             | worth it". I've actually attempted to reply to threads to
             | challenge the views of others, but once I'm hit with the
             | 300-1000 second wait time to post, I just close the tab and
             | move on.
        
             | krmboya wrote:
             | > Not the rampant racism or sexism or simple misanthropy or
             | outright calls to violence or overflowing hostility.
             | 
             | Isn't that more easily solved by just not visiting the site
             | in the first place?
        
               | tovej wrote:
               | This problem is a societal one, it mostly harms you
               | indirectly by creating spaces for hateful ideas to
               | spread, 4chan's harm is through the capacity to organize
               | and strengthen hateful and harmful political movements.
               | More socially conscious people not visiting the site only
               | serves to create a stronger echo chamber.
        
               | ranger_danger wrote:
               | This is how oppression starts. First it's "let's only get
               | rid of the most offensive content", then "let's suppress
               | opinions we don't like".
        
               | matheusmoreira wrote:
               | The fact you think some ideas are "harmful" is exactly
               | why humanity needs sites like these. We don't trust
               | people like you to determine which ideas are "harmful"
               | and which aren't, which ideas are worth spreading and
               | which aren't. We want to see for ourselves, thank you
               | very much.
               | 
               | We are especially interested in the ideas that people
               | deem offensive enough to suppress. Are they actually
               | wrong or are they just socially unacceptable? Whatever
               | the truth is, it can't be learned from a place that
               | suppresses discussion of it. Declaring the matter as
               | settled and suppressing any opposing viewpoint is the
               | very definition of an echo chamber.
        
             | skotobaza wrote:
             | That's the price you pay for ability to freely and
             | anonymously voice different opinions. And even then 4chan
             | is considered "soft", because mods still delete some
             | egregiously "incorrect" opinions.
        
         | 123yawaworht456 wrote:
         | recaptcha is terrible if you are cursed with an ISP that Google
         | deems icky for some indiscernible reason. at the time, I was
         | getting slowly fading bullshit that invariably gaslit me with
         | "try again" several times. when they've switched to custom
         | captcha, I actually started posting again instead of just
         | lurking.
         | 
         | yeah, the recent 5-15 minute countdown before your first post
         | is a bizarre thing, but I assume the volume of spam and ban-
         | evading schizos they're dealing with is ungodly. a single
         | dedicated shithead can shit up a general or a slow board
         | indefinitely by just resetting their router or switching
         | airplane mode on/off for a few minutes when they get banned.
         | 
         | >but now 4chan is one of the most user-hostile social media
         | platforms around.
         | 
         | virtually every single big platform requires your phone number.
        
         | shortrounddev2 wrote:
         | They had a gigantic spam problem, captcha saved the site
        
           | paulpauper wrote:
           | then how do Reddit and Twitter work without such an
           | obnoxious captcha? I find it hard to believe those sites get
           | less spam. Or any other community.
        
             | mikeyouse wrote:
             | You need accounts with unique emails to post everywhere
             | else, and those sites are massive with hundreds/thousands
             | of devs, some of whom work exclusively on anti-spam. If you
             | make a site immune to advertising revenue and any other
             | source of profit, you're going to struggle to pay for
             | "internet-scale" efforts.
        
             | blackjackfoe wrote:
             | Reddit and Twitter both have huge bot problems. On Reddit
             | it's a bit less obvious due to the upvote/downvote system,
             | and on Twitter it's a bit less obvious because you usually
             | only follow people you want to see. Make a post on Twitter
             | that mentions something like cryptocurrency, and you'll get
             | a dozen bot replies immediately.
        
             | KaoruAoiShiho wrote:
             | They don't surface every post to everyone, unlike 4chan, so
             | spam is much less visible, though it still exists.
        
             | shortrounddev2 wrote:
             | Reddit and Twitter are replete with bots
        
             | RockRobotRock wrote:
             | First, they aren't anonymous. It's a lot more friction when
             | you have to generate an account (which also requires a
             | captcha).
             | 
             | Second, Twitter absolutely does make you perform captchas
             | if they suspect you are a bot. I say this as someone who
             | ran Twitter bots previously.
        
             | anigbrowl wrote:
             | By selling your data to advertisers.
        
             | heavensteeth wrote:
             | Twitter is extremely user hostile. Every time I've made an
             | account it has inevitably asked for an email and a phone
             | number, and at least a few captchas.
        
           | raincole wrote:
           | - Obscure proprietary algorithm decides what you read
           | 
           | - Obscure CAPTCHA and other anti-spam features
           | 
           | - Pay to post
           | 
           | Choose one.
        
             | matheusmoreira wrote:
             | Paying to post sounds like the only good solution. There is
             | no privacy problem if they accept Monero.
        
         | paulpauper wrote:
         | Same here. The captcha is the tip of the iceberg. VPNs,
         | proxies... all blocked. Tons of ghosting and censoring of posts
         | too. Also crawling with feds and people trying to get you to
         | incriminate yourself. I love the option to bypass it with
         | crypto. Yeah, like I am going to give them btc, which will be
         | traced by every agency and coin analysis firm and also get my
         | wallet/exchange account restricted by being linked to 4chan.
         | The owners are more than happy to comply with every 3-letter
         | agency request for info.
        
           | Der_Einzige wrote:
           | _taps the sign_
        
         | scrlk wrote:
         | The addition of the post countdown has had a pretty noticeable
         | effect on posts/day across multiple boards: https://4stats.io/
         | 
         | When an earlier version was trialled on /biz/ (mandatory email
         | verification - https://warosu.org/biz/thread/58388587), it
         | nuked the board and it hasn't recovered.
        
         | jimbob45 wrote:
         | _but now 4chan is one of the most user-hostile social media
         | platforms around_
         | 
         | Stay off /v/, /tv/, /pol/, and /a/ and you'll have a pretty
         | good time.
        
           | yungporko wrote:
           | certainly won't have a good time on /b/ either
        
             | meowface wrote:
             | It's mostly porn nowadays but through some chain of events,
             | /b/ actually is ideologically one of the most normal boards
             | on the site now. Not even kidding. Many - probably most -
             | other boards are majority Trumpists or neo-Nazis but /b/ is
             | roughly at least 50% liberal or libertarian.
             | 
             | So politics threads in /b/ are actually better than in a
             | ton of other boards.
        
       | hobom wrote:
       | Does 4Chan also have the bot BEHAVIOR detection (e.g. unnatural
       | mouse movements) that Google's captcha has?
        
         | kalleboo wrote:
         | Yeah I had been under the impression that the point of captchas
         | like this (and those "slide a puzzle piece" ones) wasn't the
         | solution to the problem so much as checking for human-like
         | mouse movements.
        
         | ipnon wrote:
         | The results here suggest it does not.
        
         | blackjackfoe wrote:
         | It does not, at least not once you pass the Cloudflare
         | Turnstile challenge (which can be done with an API as well.)
        
       | bawolff wrote:
       | There is a reason why people moved away from distorted text-based
       | captchas. We are basically at the point where computers are better
       | at them than humans.
       | 
       | https://www.usenix.org/system/files/conference/woot14/woot14...
       | is a paper on the subject that I think is really interesting.
       | 
       | However, a surprising number of text-based captchas can be solved
       | with a shell script of a few lines: use ImageMagick to convert to
       | greyscale, dilate and undilate (erode), then pass to Tesseract.
       | 
       | However, there are also sites like https://2captcha.net , so
       | really captchas are more like imposing a small minimum amount of
       | effort.
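
The "dilate and undilate" step mentioned above is what image-processing texts call morphological closing: dilation bridges small gaps in the strokes, and the following erosion shrinks them back to roughly their original thickness. A pure-Python sketch on a binary image (1 = ink) is shown here in place of the ImageMagick and Tesseract commands, whose exact options are not given in the comment. The 3x3 square structuring element and the zero padding at the border are assumptions.

```python
def _morph(image, combine):
    """Slide a 3x3 square window over the image; pixels outside it count as 0."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[y + dy][x + dx]
                if 0 <= y + dy < height and 0 <= x + dx < width
                else 0
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            row.append(combine(window))
        out.append(row)
    return out

def dilate(image):
    """A pixel becomes ink if any pixel in its 3x3 neighbourhood is ink."""
    return _morph(image, max)

def erode(image):
    """A pixel stays ink only if its whole 3x3 neighbourhood is ink."""
    return _morph(image, min)

def close_gaps(image):
    """Morphological closing: dilate, then erode."""
    return erode(dilate(image))
```

Because pixels outside the image count as background here, the input should keep at least a one-pixel empty margin around the text. Otherwise the erosion eats into strokes that touch the border.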
        
         | noprocrasted wrote:
         | Just because you can technically crack them doesn't mean
         | they're useless.
         | 
         | There's a significant amount of time, skill and effort that
         | went into the solution from this post, and the end result
         | doesn't generalize well (you'd have to start all over for a
         | different kind of captcha).
         | 
         | The vast majority of spammers would _not_ be able to replicate
         | this; those who do would either make money legitimately, or
         | focus their skills on juicier targets (if you have AI/ML
         | skills and want to do nefarious things there are other options
         | that pay much better than spamming).
         | 
         | Such captchas still work well at raising the cost of successful
         | spamming above the expected payoff from said spam.
        
           | fragmede wrote:
           | > there are other options that pay much better than spamming
           | 
           | Are there? Say you've got a felony record and can't get a
           | legit AI/ML job at eg OpenAI/anywhere. What would you do
           | instead? most of the options I can think of involve getting
           | paid for doing things that are basically spam if you zoom out
           | enough.
        
             | noprocrasted wrote:
             | There's plenty of mischief potential with "deepfakes".
        
             | andrewflnr wrote:
             | How many people are there like that, and how much damage
             | are they collectively likely to do? If you're a random
             | spammer, how hard will it be to hire that person? Again,
             | not aiming for impossibility, just reducing the damage.
        
               | TravisPeacock wrote:
               | I've been working for myself for over a decade doing
               | random projects for clients while also doing my own
               | thing. My resume looks awful and the job market is trash.
               | I'd be willing to take a job as a jr developer and work my
               | way up (or a sys admin).
               | 
               | I used to run one of the world's largest ebook piracy
               | websites but want to put that life behind me. Recently
               | work came across my desk to create tens of thousands of
               | accounts on a well respected website so they could more
               | easily scrape it.
               | 
               | I just want a traditional job, but I also want to support
               | my family and $4000 for a month's work
        
               | supriyo-biswas wrote:
               | If I were you, I'd probably try looking at companies
               | working in the web scraping and reverse engineering
               | fields, who might even appreciate the skills even if they
               | were acquired in a, let's just say, "different" way.
        
             | benreesman wrote:
             | I've got no criminal charges of any kind and I'd still want
             | to know about any way to work without getting flagged as a
             | known enemy of the Cartel.
             | 
             | I'm lucky that some people still want chops no matter the
             | thought crime, I'm very grateful such excellent employers
             | exist (love you guys).
             | 
             | But you're never sure you'll line up two such in a row,
             | this isn't the IBM until company casket and company funeral
             | days. Makes life "interesting" even for a risk-taker.
        
           | reaperman wrote:
           | So, I do this type of AI development for solving CAPTCHAs.
           | 
           | I can't get any real jobs that pay me for my more advanced
           | skills. My primary sins were going to a second/third-tier
           | university and some performance concerns in a portion of my
           | previous roles due to divorce and burn-out. I make $80k/year
           | in government IT, and $30-150k/year as the "AI" guy in a
           | small 2-5 person group that offers a CAPTCHA-breaking API.
           | 
           | The spammers aren't the ones replicating this. They just pay
           | B2B rates (combo of SaaS + Consulting, depending on client
           | needs) to help them remove the roadblocks.
        
             | blackjackfoe wrote:
             | Is your company hiring? :)
        
             | benreesman wrote:
             | If there were a totally 100% aboveboard way to do this in a
             | net transfer of utility from Tessier-Ashpool SA to the
             | typical web surfer I would be a superfan.
        
             | HeckFeck wrote:
             | Why do you do this?
             | 
             | While I can appreciate the technical achievement, you know
             | most users of forums and imageboards don't want any AI
             | content at all.
        
               | KomoD wrote:
               | > Why do you do this?
               | 
               | Money, obviously. I'd also do it for $30-150k/year
               | 
               | > you know most users of forums and imageboards don't
               | want any AI content at all.
               | 
               | He's not creating or posting any "AI content"?
        
               | HeckFeck wrote:
               | Okay, but you know his actions are enabling more AI
               | content and spam to proliferate? I hardly think he is
               | making that much money just because legitimate users
               | don't want to fill in a captcha.
        
               | brookst wrote:
               | It's very easy to opine about the ethics others _should_
               | have. Different when it's you and your family and a
               | comparatively easy effort will make a material difference
               | in quality of life. And especially when you know the
               | market need will be met by someone else anyway.
        
               | IWeldMelons wrote:
               | If someone is making a brazen statement of being "a bad
               | guy because 80K is not enough, and could not find
               | anything decent for those extra $30K" what kind of
               | treatment would they expect?
        
               | xp84 wrote:
               | I'd argue that someone cracking CAPTCHAs has a lot less
               | dirty hands than someone who works in an actually scummy
               | industry like US health insurance. Those companies
               | literally kill people by denying them care to pinch
               | pennies. This guy might cause a little more spam on the
               | already useless mess that is YouTube comments. Who cares.
               | I'd take the money, too.
        
               | IWeldMelons wrote:
               | Typical US-centric way of thinking. If your healthcare
               | system is crippled it does not justify making internet
               | mess for everyone else in the world.
        
               | lostlogin wrote:
               | I'd have suggested adtech before health insurers.
        
               | marcosdumay wrote:
               | To be fair, there's a huge number of people around here
               | who work in the universal surveillance industry, and for
               | many of them the alternative is way higher than 80k.
        
               | HeckFeck wrote:
               | So you get a bit richer for less effort... but how do you
               | think moderating legions of spam posts affects the lives
               | of independent website owners, who just want to create
               | communities around the things they love?
               | 
               | Or indeed the users, who have to wade through trash
               | invading their threads?
               | 
               | Or other legitimate users, who now have to answer
               | captchas from CloudFlare just to access their favourite
               | websites?
               | 
               | Ultimately this is a parasitical element, choking the
               | internet. It will kill the things it profits from. Many
               | will give up running these sites, you walk away with your
               | $100k, and no one can ever do it again... you've not
               | created anything of value, but destroyed it.
        
               | Jerrrry wrote:
               | Hard to think anyone who can't solve a captcha is
               | something other than a parasite.
               | 
               | Parasites can solve captchas, people with accessibility
               | issues and the poorest of people are the ones being locked
               | out.
        
               | lostlogin wrote:
               | It isn't just those with accessibility problems that get
               | stuck. Some are stupidly difficult- I've given up on one
               | in recent times.
               | 
               | Maybe my disability is CAPTCHA blindness.
        
               | marcosdumay wrote:
               | Captchas are not only for stopping people with
               | disabilities anymore. They also stop people using non-
               | approved browsers, people trying to stay anonymous,
               | people coming from the wrong geographic areas...
        
               | TZubiri wrote:
               | If the AI has access to a credit card, but Mgulu from
               | Nigeria doesn't, then the system doing the filtering
               | might evolve to filter out the 'undesirable' rather than
               | the non human.
        
               | idiotsecant wrote:
               | Yes, people often do unsavory things for money. Is the
               | point you're making that they _shouldn't do bad things_?
               | Like, what are we even talking about here?
               | 
               | The lesson here is that systems that rely on humans to do
               | the 'moral' thing and fail otherwise are bad systems.
        
               | gosub100 wrote:
               | HN moderates spam without the use of AI-crackable "prove
               | you're a human" bs
        
               | jrflowers wrote:
               | > So you get a bit richer for less effort
               | 
               | If $150k/year is "a bit richer" for you you could simply
               | offer to pay that poster that much in exchange for
               | stopping.
        
               | gopher_space wrote:
               | Let us know who you work for and I'm sure we could find a
               | few stones to throw your way.
        
               | kristopolous wrote:
               | I had no idea I'd see so many people rising in defense of
               | bots and scammers.
        
               | TZubiri wrote:
               | This sounds so out of touch, you are comparing the
               | livelihood of a person with the quality of a meme
               | imageboard.
        
               | rad_gruchalski wrote:
               | So says every drug dealer.
        
               | qqqult wrote:
               | I would go a step further. Solving captchas is strictly
               | worse than selling drugs.
               | 
               | All captcha solvers must be imprisoned for life, or worse
               | forbidden to post on hn
        
               | gosub100 wrote:
               | Everyone's got their price. I would certainly do dirty
               | deeds myself for the right amount.
        
               | BolexNOLA wrote:
               | People who do jobs like that don't really care about the
               | impact
        
               | Der_Einzige wrote:
               | Being able to spam and mind control 4chan would be
               | amazing just for personal reasons, let alone how juicy
               | that is to governments around the world!
        
               | lukas099 wrote:
               | It's working for the Russians.
        
               | noprocrasted wrote:
               | The percentage of captchas used to deter spam is probably
               | a minority these days. A lot of captchas nowadays are
               | used to prevent adversarial interoperability or the free
               | flow of information.
               | 
               | If you want to spam, you don't actually need to break
               | many captchas. Just make your spam/scam/misinformation
               | "engaging" enough and the social media platforms will
               | host and promote your spam _for free_ and won't even ask
               | a captcha.
        
             | ryandrake wrote:
             | Despite the spamming angle, I think CAPTCHA-breaking is, on
             | the balance, noble and honorable work. These things are
             | user-hostile blights on the web, and any effort towards
             | making them disappear as useless is worthwhile. Sites
             | worried about spam should invest more in automated spam
             | classification/elimination instead of punishing real users
             | with CAPTCHA-solving. Not that I can offer a solution--if I
             | could, I'd be a millionaire.
        
               | persnickety wrote:
               | Who do you think spam classification false positives are
               | going to be punishing if not real users? At least with a
               | captcha, you have some idea that you were rejected before
               | you put in the effort to write your comment.
        
             | jostinian wrote:
             | I am a nafri with a PhD and engineering experience (with
             | europeans), I can't make good living going the traditional
             | way either with remote jobs being impossible and no
             | luck landing a visa.. I have built custom solutions for big
             | name EU companies to keep an eye on the competition through
             | scraping. captcha solving cloudflare bypass is a great part
             | of that. Getting back at companies making the UX bad with
             | captcha does feel good also.
        
               | ValentinA23 wrote:
               | >I am a nafri
               | 
               | surprised_hitler.gif
        
             | ape4 wrote:
             | $30-150k/year is a big range
        
             | TZubiri wrote:
             | Ahh the good ole dilemma of selling your soul, you study
             | what you love only to destroy it for profit. Like an
             | entomologist hired by a pesticide company.
             | 
             | I get it man, gotta make the bucks helping spammers
             | advertise their shitty products, even if they destroy the
             | internet.
        
               | noprocrasted wrote:
               | What about the spammers that _already_ destroyed the
               | internet by steering it entirely towards advertising &
               | surveillance capitalism? It's like the pot calling the
               | kettle black.
               | 
               | We're all complicit in the enshittification of the
               | internet and technology in general, just that we delude
               | ourselves into believing we're on the "good" side because
               | we call it "advertising" or "marketing" or "analytics"
               | instead of spam, more spam and spyware.
               | 
               | The end result is exactly the same however.
        
           | delfinom wrote:
           | >the vast majority of spammers would not be able to replicate
           | this;
           | 
           | Eh? They just need to buy their software from someone that
           | can. I would say many of the malware and spamware isn't
           | created by every individual deploying it, but instead vendors
           | that got good at it and decide to make revenue by licensing
           | out their software to other bad actors.
        
           | hamilyon2 wrote:
           | Captchas are now useful to distinguish well-intentioned bots
           | (they stop whenever they see captcha) from malicious ones,
           | which solve them, but still behave a lot like bots.
           | 
           | Well-intentioned bots are first-class citizens
        
             | brookst wrote:
             | Wouldn't a well-intentioned bot follow robots.txt anyway?
        
             | lostlogin wrote:
             | Do you complete the circle and do the good bot bad bot
             | classification with a mod bot?
        
           | atomicnumber3 wrote:
           | The watershed of "good enough at programming to just get a
           | real job" vs "can code enough to be really annoying to
           | businesses, but not enough to hack it as a dev" is a lot more
           | on the annoying side than you'd think.
           | 
           | I say this with the chagrin of someone who works on a cool
           | software product that is also coincidentally really well-
           | shaped to make people want to abuse it.
        
           | TZubiri wrote:
           | Interesting, subtle difference but I always thought of
           | captchas as having computational difficulty, but that's
           | clearly not the point as you say. The cost is not compute but
           | developer time.
           | 
           | If you manage to crack it at 1mhz per captcha or 1ghz or
           | 1000ghz, it makes no difference, as the bottleneck is the
           | network identifier (ip address/block)
           | 
           | While still a type of PoW, these economics are different than
           | offline mechanisms like password hashing or crypto. Where a
           | 1ghz cost is still significantly different than 1mhz.
        
         | brian-armstrong wrote:
         | Makes me wonder what comes next. Could we create a forum where
         | every member must do a 15 minute video interview with a
         | moderator? I know this "doesn't scale" but I think it could
         | make for a funny gimmick.
        
           | jabroni_salad wrote:
           | private torrent trackers are/were doing that. It was really
           | just to make sure you understood how p2p culture works and
           | what the expectations are, and really easy to pass if you
           | just followed a guide. However, I did see many people fail
           | their interview.
        
             | jmb99 wrote:
             | Were there ever video interviews? Admittedly I wasn't really
             | paying attention but back when I was getting into what it
             | was only IRC, and these days it still seems to be IRC
             | anywhere that does interviews (otherwise class-restricted
             | forum invites).
        
               | jabroni_salad wrote:
               | I don't recall ever seeing that. I don't think anyone doing
               | piracy wants to be photographed or videoed lol. I did get
               | in mumble with some community members but it was just a
               | hangout.
        
             | drexlspivey wrote:
             | The famous RED tracker has a full on technical interview
             | asking about:
             | 
             | * Audio Formats
             | 
             | * Transcoding
             | 
             | * Spectral analysis
             | 
             | and more.
             | 
             | This is the interview prep website:
             | https://interviewfor.red/en/index.html
        
           | bobsmooth wrote:
           | A small signup fee is much easier.
        
             | grishka wrote:
             | But it excludes people who don't have easy access to
             | international banking.
        
           | ggu7hgfk8j wrote:
           | We are increasingly moving to ID checks. Australia law just
           | now. For all its faults it solves spam as a side effect.
        
             | ranger_danger wrote:
             | There are lots of random ID documents available on dark
             | networks however.
        
             | qqqult wrote:
             | It also makes it 100x more likely for your IDs to leak
             | online as KYC companies are valuable targets that get
             | hacked every month
        
           | matchamatcha wrote:
           | When I was a teenager, I stumbled upon a music forum that
           | required phone interviews for signing up. They had other
           | interesting sign up rules, like you could not have _silly_
           | user names (judged by the admin). I guess it served as an
           | effective filter for their member base..
        
         | RobotToaster wrote:
         | > so really captchas are more like putting a small min amount
         | of effort.
         | 
         | At that point a proof of work captcha (mCaptcha.org is one, but
         | there are others), is probably the best option. Especially with
         | how any reasonably effective traditional captcha is an
         | accessibility nightmare.
        
           | cubefox wrote:
           | It's completely unclear what a "proof of work" captcha is
           | supposed to be.
        
             | porridgeraisin wrote:
             | Brave search uses it. From my limited understanding, it
             | sends a time-consuming javascript function and its input to
             | your browser, and has your browser calculate the output and
             | send it back. The server matches your output with the
             | expected output. I assume the server would pre-compute in
             | some way? On the spectrum, it leans more towards being a
             | spam-alleviating thing rather than a human-distinguishing
             | thing.
        
               | porridgeraisin wrote:
               | > pre-compute
               | 
               | Or it could be a SAT or something that's easy to verify
               | and hard to solve.
        
               | shreyshnaccount wrote:
               | I'd think it's some kind of proof of sequential work,
               | basically an un-parallelizable calculation that is
               | guaranteed to take a certain number of steps, and making
               | solving thousands of them much harder and hopefully not
               | worth it
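
One way to picture the "proof of sequential work" idea in the comment above is an iterated hash chain: each step needs the previous step's output, so a single challenge cannot be split across cores. This is only a minimal sketch under assumptions. The seed, the step count, and the function names are hypothetical, and nothing here is taken from any particular captcha product.

```python
import hashlib
import hmac


def hash_chain(seed: bytes, steps: int) -> bytes:
    """Apply SHA-256 `steps` times; every step depends on the one before it."""
    value = seed
    for _ in range(steps):
        value = hashlib.sha256(value).digest()
    return value


def check_answer(expected: bytes, answer: bytes) -> bool:
    """Compare the client's answer to the expected value in constant time."""
    return hmac.compare_digest(expected, answer)


# Hypothetical flow: the server precomputes the expected value for a seed,
# the client repeats the same chain and sends the result back.
seed = b"hypothetical-per-request-seed"
expected = hash_chain(seed, 100_000)
answer = hash_chain(seed, 100_000)
print(check_answer(expected, answer))
```

The weakness of this naive version is that the server has to do the same work as the client (or precompute it, as guessed upthread). Verifiable delay functions are designed to make the check much cheaper than the computation, but they are beyond this sketch.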
        
             | jamesnorden wrote:
             | It's CPU intensive JS code that must run to get an output
             | that must match something server-side, the idea is that it
             | makes attacks/spam not economically viable to run.
        
               | hombre_fatal wrote:
               | The problem is that it doesn't do anything. Maybe you
               | slightly slow down a volumetric spam attack, but you're
               | just putting a sleep() before letting spam through which
               | might be the worst solution. As for economic viability,
               | it's still just a sleep(). Even if it somehow did cost
               | extra money to use more of the CPU, botnets don't even
               | use their own hardware.
               | 
               | And if you make the PoW so hard that it takes very very
               | long to solve then you basically made a captcha that bots
               | have no problem doing (it's just time) and humans don't
               | want to do at all especially on their phone.
        
             | marcosdumay wrote:
             | Almost always it's some variation of "give me a string with
             | the SHA256 hash starting with 0.a471"
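
The "hash starting with a given prefix" scheme described above can be sketched in the hashcash style: the client searches for a nonce, and the server checks it with a single hash. This is an illustration under assumptions, not the code of mCaptcha, Brave, or any other service. The challenge format, the function names, and the difficulty value are all hypothetical.

```python
import hashlib
import secrets


def make_challenge() -> str:
    """Server side: issue a random challenge string (hypothetical format)."""
    return secrets.token_hex(16)


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Server side: a single SHA-256 call checks the client's answer."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty_bits


def solve(challenge: str, difficulty_bits: int) -> int:
    """Client side: brute-force a nonce, about 2**difficulty_bits hashes on average."""
    nonce = 0
    while not verify(challenge, nonce, difficulty_bits):
        nonce += 1
    return nonce


challenge = make_challenge()
nonce = solve(challenge, 12)  # 12 bits keeps the demo fast
print(verify(challenge, nonce, 12))
```

Each extra difficulty bit doubles the expected client work while the server's check stays at one hash, which is the asymmetry the scheme depends on.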
        
         | 3abiton wrote:
         | I think captchas are just another line of defense to make it
         | harder for actors abusing the system. It's not a solution, just
         | a little (getting outdated) fortification.
        
         | poincaredisk wrote:
         | Small? From your own link, recaptcha v3 takes 10-15s and costs
         | $1.3 for 1000 captchas. This is actually huge, and prohibitively
         | expensive for many things where you would want to use it (like
         | scraping a large website).
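
To put the figures quoted above ($1.3 per 1000 solves, 10-15 s per solve) in perspective, here is a small back-of-the-envelope calculation. The one-million-page crawl and the "one captcha per page" rate are assumptions made purely for illustration, and the function names are hypothetical.

```python
PRICE_PER_1000_USD = 1.3  # price quoted in the comment above


def solve_cost_usd(solves: int) -> float:
    """Total solver fee for a given number of captchas."""
    return solves * PRICE_PER_1000_USD / 1000


def sequential_hours(solves: int, seconds_per_solve: float) -> float:
    """Wall-clock hours if the solves are done one after another."""
    return solves * seconds_per_solve / 3600


# Assumed workload: 1,000,000 pages, one captcha per page.
pages = 1_000_000
print(f"fee: ${solve_cost_usd(pages):,.2f}")
print(f"time at 10 s each: {sequential_hours(pages, 10):,.1f} hours")
print(f"time at 15 s each: {sequential_hours(pages, 15):,.1f} hours")
```

Under those assumptions the fee comes to about $1,300, and the solves add up to thousands of hours of latency unless they run in parallel.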
        
         | nyclounge wrote:
         | Wow Funcaptcha costs the most and it is open source.
        
       | mieko wrote:
       | If you're into this, here's my 2014 breakdown of the Silk Road
       | CAPTCHA: https://github.com/mieko/sr-captcha
        
       | ranger_danger wrote:
       | For those that don't know, the JKCS extension has been doing this
       | for years already:
       | 
       | https://addons.mozilla.org/en-US/firefox/addon/jkcs/
       | 
       | https://chromewebstore.google.com/detail/joshi-koukousei-cap...
       | 
       | Userscript version:
       | https://github.com/drunohazarb/4chan-captcha-solver
        
         | blackjackfoe wrote:
         | I really hope my post didn't come off as if I was trying to
         | make it sound like this was a new idea. Regardless, this is
         | good information, because it counters the posts of the form
         | "great, now that you made this, you're going to make it
         | harder."
        
           | ranger_danger wrote:
           | I didn't look at it that way, just maybe that you (and/or
           | others) might not have been aware of its existence since I
           | didn't see it mentioned anywhere.
        
       | tomcam wrote:
       | If there's one place on the web I would apply anonymity with
       | great diligence, it would be posting any article that might put
       | me at odds with the good people of 4Chan.
       | 
       | mostly kidding! mostly
        
         | blackjackfoe wrote:
         | The 4Chan userbase hates the CAPTCHA as much as I do :)
        
           | snvzz wrote:
           | This, but unironically.
        
       | asynchronous wrote:
       | [meta] what blog site is this? Is it a joint among authors? I
       | can't find more information on their GitHub. Looks neat.
        
         | nullpt_rs wrote:
         | I (veritas) run the blog but accept contributions from anyone.
         | The blog itself is open source :-)
         | https://github.com/nullpt-rs/blog
        
       | matrix87 wrote:
       | the blacked out minimalist aesthetic on this site looks really
       | cool
        
         | bhasi wrote:
         | I really like it too. I'm always excited to see the themes of
         | personal and other tech blogs I come across here.
        
       | tomxor wrote:
       | Bet it can't break reCAPTCHA on a VPN.
       | 
       | [edit]
       | 
       | More specifically I mean when they insidiously give you infinite
       | tests even though it's impossible to pass because the IP has been
       | blacklisted... There's a special place in hell for the
       | anti-humans who made that decision, and yes it involves captcha.
        
         | blackjackfoe wrote:
         | I would also be inclined to believe that my project to solve
         | the proprietary 4Chan text CAPTCHA cannot solve an unrelated
         | image CAPTCHA. I'd bet a lot of money on it, in fact!
        
       | kattagarian wrote:
       | I remember trying to use 4chan once and i couldn't even pass
       | through the captcha.
        
         | morkalork wrote:
         | I remember using it before it had a captcha
        
           | HaZeust wrote:
           | There was a chaotic neutral time in my life where I used it
           | daily for an extended period of time; and then found myself
           | out of that rut and would only go back to see unhinged takes
           | on a particular current event that I was interested in seeing
           | the hivemind's thoughts on. Each and every time I went back,
           | and tried to contribute to a thread, the Captchas and the
           | CloudFlare checks were increasingly intrusive.
           | 
           | During this election, I completely gave up even trying to
           | participate and just lurked.
        
           | __turbobrew__ wrote:
           | I tried to post and it gave me a 900 second cooldown, not
           | even on vpn. I too remember the good old days when there was
           | no captcha.
        
           | not_your_vase wrote:
           | ^       ^ ^
        
       | somat wrote:
       | I wonder if it would be better to pretend to have a captcha but
       | really you are analysing the user timing and actions. Honestly I
       | half suspect this is already going on.
       | 
       | If you wanted to go full meta "never go full meta" you would
       | train an AI to figure out if the agent on the other side was human
       | or not. that is, invent the reverse turing test. it's a human if
       | the ai is unable to differentiate its responses from normal
       | human responses. as opposed to marketing human responses.
       | 
       | Well now I have to go have a lay down, I feel a little ill from
       | even thinking on the subject.
        
         | kccqzy wrote:
         | That's what reCAPTCHA does.
        
         | wraptile wrote:
         | That's kinda what every major captcha distributor does already!
         | 
         | Even before captcha is being served your TLS is first
         | fingerprinted, then your IP, then your HTTP2, then your
         | request, then your javascript environment (including font and
         | image rendering capabilities) and browser itself. These are
         | used to calculate a trust score which determines whether
         | captcha will be served at all. Only then it makes sense to
         | analyze captcha's input but by that time you caught 90% of bots
         | either way.
         | 
         | The amount your browser can tell about you to any server
         | without your awareness is insane to the point where every
         | single one of us probably has a more unique digital fingerprint
         | than our very own physical fingerprint!
        
           | zoltrix303 wrote:
           | Would it be possible to serve a fake fingerprint that appears
           | legitimate? Or even better mimic the finger print of real
           | users who've visited a site you own for example?
        
             | nullpt_rs wrote:
             | yep, but it can get tricky.
             | 
             | some projects worth checking out:
             | https://github.com/refraction-networking/utls
             | https://github.com/berstend/puppeteer-extra
        
             | barbolo wrote:
             | https://github.com/lwthiker/curl-impersonate
        
           | encom wrote:
           | This is how ClownFlare and its ilk, make life hell on the
           | internet, when you use a "weird" browser on a "weird" OS.
        
             | jeroenhd wrote:
             | My experience is that IP reputation does a lot more for
             | Cloudflare than browsers ever did. I tried to see if they'd
             | block me for using Ladybird and Servo, two unfinished
             | browsers (Ladybird used to even have its own TLS stack),
             | but I passed just fine. Public WiFi in restaurants and
             | shared train WiFi often gets me jumping through hoops even
             | in normal Firefox, though.
             | 
             | I can't imagine what the internet must be like if you're
             | still on CG-NAT, sharing an IP address with bots and
             | spammers and people using those "free VPN" extensions
             | donating their bandwidth to botnets.
        
           | PUSH_AX wrote:
           | In that case why do I ever receive a captcha?
        
             | Pikamander2 wrote:
             | It adds another layer of analysis. For example:
             | 
             | If the user solves the CAPTCHA in 0.0001 seconds, they're
             | definitely a bot.
             | 
             | If the user keeps solving every CAPTCHA in exactly 2.0000
             | seconds, each time makes it increasingly likely that
             | they're a bot.
             | 
             | If the user sets the CAPTCHA entry's input.value property
             | directly instead of firing individual key press events with
             | keycodes, they're probably either a bot, copy-pasting the
             | solution, or using some kind of non-standard keyboard
             | (maybe accessibility software?).
             | 
             | Basically, even if the CAPTCHA service already has a decent
             | idea of whether the user is a bot, forcing them to solve a
             | CAPTCHA gives the service more data to work with and
             | increases the barrier of entry for bot makers.
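
The three signals listed above (implausibly fast solves, suspiciously regular solve times, and input set without key events) could be combined roughly as follows. This is a sketch under assumptions: the thresholds and names are invented for illustration, and no real CAPTCHA service is known to use these exact rules.

```python
from statistics import pstdev


def bot_signals(solve_times_s, used_key_events,
                min_human_s=0.5, min_jitter_s=0.05):
    """Return the names of heuristics tripped by one user's solve history.

    min_human_s and min_jitter_s are hypothetical thresholds.
    """
    signals = []
    # A solve faster than any human could read and type.
    if any(t < min_human_s for t in solve_times_s):
        signals.append("too_fast")
    # Several solves with almost no variation in timing.
    if len(solve_times_s) >= 3 and pstdev(solve_times_s) < min_jitter_s:
        signals.append("too_regular")
    # Value was set directly rather than typed; could also be paste
    # or accessibility software, so treat it as a weak signal only.
    if not used_key_events:
        signals.append("no_key_events")
    return signals


print(bot_signals([0.0001], used_key_events=True))
print(bot_signals([2.0, 2.0, 2.0, 2.0], used_key_events=True))
print(bot_signals([3.1, 5.4, 2.2], used_key_events=False))
```

As the comment notes, none of these is proof on its own. In practice they would feed a score alongside other evidence rather than trigger a block directly.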
        
             | sdk16420 wrote:
             | I found several websites switched to 'press here until the
             | timer runs out', probably they are doing the checks while
             | the user is holding their mouse pressed, it would be
             | trivial to bypass the long press by itself with automated
             | mouse clickers.
        
           | gosub100 wrote:
           | Re: your last paragraph, https://coveryourtracks.eff.org/
           | 
           | EFF have been running this for years. Gives an estimate about
           | how many unique traits your browser has. Even things like
           | screen resolution are measured.
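
A common way to quantify "how unique" a trait is uses bits of information: a value shared by one in N browsers carries log2(N) bits. The sketch below adds up a few made-up traits. The one-in-N figures are hypothetical, and summing them assumes the traits are independent, which real traits are not, so the total is an upper bound at best.

```python
import math


def surprisal_bits(one_in_n: float) -> float:
    """Bits of identifying information for a value seen in 1 of N browsers."""
    return math.log2(one_in_n)


def combined_bits(traits: dict) -> float:
    """Sum of per-trait bits, assuming (unrealistically) independent traits."""
    return sum(surprisal_bits(n) for n in traits.values())


# Hypothetical rarity figures, chosen only to make the arithmetic easy.
traits = {
    "screen_resolution": 16,  # 1 in 16 browsers
    "timezone": 8,            # 1 in 8 browsers
    "installed_fonts": 1024,  # 1 in 1024 browsers
}
bits = combined_bits(traits)
print(f"{bits:.0f} bits, roughly 1 in {2 ** bits:,.0f} browsers")
```

With these made-up numbers the three traits add up to 17 bits, or about one browser in 131,072. That is how a handful of individually unremarkable values can end up singling out one user.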
        
       | NoMoreNicksLeft wrote:
       | I suspect really strongly that the available characters in the
       | 4chan captcha were chosen to be able to spell out the most
       | racist/nazi/extreme slurs and slogans imaginable. For instance,
       | not all numerals are ever used, but 1, 4, and 8 are. K is often
       | there, and whatever the algo is, pseudorandom or not, it often
       | doubles/triples characters. I've personally seen "kkk" twice over
       | the years. Mind you, it does _seem_ random. But even randomly,
       | these must happen often enough to set that crowd off, they make a
       | game of posting a screenshot of the "good ones".
        
         | blackjackfoe wrote:
         | All the worst slurs I can think of in my limited vocabulary
         | can't even be spelled with the characters available. I suspect
         | the opposite - they might have been chosen to _avoid_ spelling
         | things like that.
        
           | NoMoreNicksLeft wrote:
           | You either know some radioactively hot slurs, or you've just
           | not hung out there enough. Only the "i" is missing, and a
           | week doesn't go by that someone doesn't post it with the 1
           | instead. Granted, I think that one's a repost (never bothered
           | to try to check).
        
         | BriggyDwiggs42 wrote:
         | Oh no you're probably on the money
        
         | Der_Einzige wrote:
         | 4chan was gaming the previous captchas for awhile to label some
         | of the data with racial slurs, as they had discovered the
         | threshold that you're allowed to be wrong by, and were
         | aggressively abusing it.
        
       | unit149 wrote:
       | Parsing the visualization data, within a JSON script tasked with
       | parsing it is a complex endeavor when the site requires verifying
       | email.
       | 
       | If the JSON file is corrupt, it shows the following if tt1 and cd
       | do not align.
       | 
       | > "error": "You have to wait a while before doing this again"
        
       | Dachande663 wrote:
       | Semi-related but I needed a CAPTCHA on my site[0] mainly to block
       | comment form spam and settled on repurposing a fun method I'd
       | seen before. Is definitely not foolproof (or hard at all), but I
       | really liked making it.
       | 
       | [0] https://www.hybridlogic.co.uk/contact
        
         | winrid wrote:
         | It says I've been blocked when I try to view that. Not on a
         | VPN.
        
           | EasyMark wrote:
           | Are you in a safari browser?
        
             | winrid wrote:
             | Chrome android
        
           | Dachande663 wrote:
           | The site runs off of a tiny little server at home so I've got
           | some very aggressive firewall rules. Anything from the usual
           | bad countries, certain signatures etc are blocked. Reduced
           | traffic to 1% of previous load.
        
             | efilife wrote:
             | What are the bad countries? Russia and china?
        
         | chamomeal wrote:
         | No way, that is a cool fucking captcha!!
        
         | vunderba wrote:
         | Reminds me of the Doom captcha.
         | 
         | https://vivirenremoto.github.io/doomcaptcha/
        
           | Dachande663 wrote:
           | 99% certain this is where I copied the idea from.
        
       | benreesman wrote:
       | In my opinion the granddaddy of all 4chan CAPTCHA busts is still
       | Yannic Kilcher's GPT-J tune on "Raiders of the Lost Kek" set,
       | and might be the coolest thing an LLM has ever done on video:
       | https://youtu.be/efPrtcLdcdM?si=errY0PrEhnX9ylDw
        
         | chiph wrote:
         | Nearly a full minute of disclaimers and warnings about 4chan.
         | That's got to be a record.
        
       | smithcoin wrote:
       | I'll never forget spending the evening of the 2016 election on
       | /pol/
        
         | BrandonY wrote:
         | What happened?
        
           | poincaredisk wrote:
           | A lot of memes and shitposting, I assume. /pol/ was always
           | political, pro-trump, and according to some was even
           | important enough to influence elections. I find that claim
           | dubious, but it's true that many pro-trump memes (and memes
           | in general) were created on 4chan.
        
             | tovej wrote:
             | /pol/ is 100% a big factor in the rising popularity of MAGA
             | and far-right nationalist sentiment among young men
        
               | Der_Einzige wrote:
               | I've never seen such a small group of people have such a
               | big impact on world affairs.
               | 
               | 4chan pol has straight up mainstreamed most incel talking
               | points to young boys all across the world.
        
               | s777 wrote:
               | I know people personally who recently graduated high
               | school and went down the 4chan rabbithole because they
               | wanted to be "edgy", then they got comfortable with the
               | extremely racist attitudes they were promoting
        
               | TZubiri wrote:
               | To what extent is it a factor as in the cause, and to
               | what extent is it just an organic manifestation of the
               | desires of the people?
               | 
               | You can apply this to most social media, but in the
               | spectrum of wikipedia (the people control the content) to
               | netflix(the private owners control the content), I'd
               | think 4chan would be closer to wikipedia.
        
         | trallnag wrote:
         | Made a profit of 40 bucks betting 10 bucks on Trump that
         | evening / night
        
       | Pikamander2 wrote:
       | > The official TensorFlow-to-TFJS model converter doesn't work on
       | Python 3.12. This doesn't seem to really be documented.
       | 
       | > TensorFlow.js doesn't support Keras 3.
       | 
       | I tried getting into some casual machine learning stuff a few
       | years ago and more or less gave up because of stuff like this. It
       | was staggering how many recent tutorials were already outdated,
       | how many random pitfalls there were, and how many "getting
       | started" guides assumed you were already an expert.
        
         | sigmoid10 wrote:
         | As someone who has been working in ML for years, I can only
         | recommend staying away from anything recent. Grab an old
         | Bayesian statistics textbook and learn the fundamentals, then
         | progress to learning the major frameworks like PyTorch. Try to
         | write every part of a CNN, RNN and Transformer architecture and
         | training pipeline yourself the first time (including data
         | loaders, but maybe leave out CUDA matrix kernels). Stay the
         | hell away from wrappers for other people's wrappers like
         | Langchain. Their documentation is often not just outdated, but
         | flat out wrong regarding the fundamentals. Huggingface is great
         | if you know the basics and thus how to fix things if their
         | standard wrappers break.
        
           | rohansuri wrote:
           | Any book you would recommend?
        
             | sigmoid10 wrote:
             | You can try Theodoridis if you can find a first or second
             | edition. It is old enough to not be diluted by the recent
             | craze but still recent enough to cover all the necessary
             | fundamentals. There is also a new edition coming out soon,
             | but that seems to have been heavily tainted by the ChatGPT
             | hype.
        
       | chistev wrote:
       | Man, is there anything computers won't be able to break!
       | 
       | crazy
        
       | brodo wrote:
       | I'm asking myself if a post titled "Breaking the Stormfront
       | CAPTCHA" would lead to the same discussion. Maybe I should spend
       | less time on this website.
        
       | nfRfqX5n wrote:
       | Hi veritas
        
       | Yeul wrote:
       | I understand why Cloudflare has to exist. But it's beyond annoying
       | that it forces you into using an unmodified Chrome sans VPN.
        
       | thrance wrote:
       | 4Chan is probably one of the only social platforms where genuine
       | users and Russian bots share the same views, why even bother with
       | CAPTCHAs?
        
       | cubefox wrote:
       | Not a word on how describing and releasing this code is obviously
       | unethical!? Captchas have a legitimate use to keep bots out.
        
       | Alifatisk wrote:
       | If there is one blog I've fallen in love with, it's nullpt.rs.
       | Still waiting for part 2 of Reverse Engineering Tiktok's VM
       | Obfuscation
        
       | 2Gkashmiri wrote:
       | Hey dude. Any idea if 1000 labelled images are good enough for
       | training and how much time it would take to train on an Nvidia A40
       | like on https://www.runpod.io/pricing ?
        
       | axpy906 wrote:
       | It's nice to see this posted and interesting that it's in
       | TensorFlow. I wonder for how many years the CAPTCHA was already
       | broken but just not posted about publicly.
        
       | b8 wrote:
       | Glad to see Blackjack and Jordin. We used to hack on Minecraft
       | together. nullpt.rs and secret.club are full of former video game
       | hackers :)
        
       | m3kw9 wrote:
       | Very tasteful title animation I must say. It's fast enough, you
       | feel it, and not distracting, gives a vibe even from glancing
        
       | mgaunard wrote:
       | I remember when they introduced their new captcha; it was so
       | tedious to solve it I stopped interacting there entirely.
        
       ___________________________________________________________________
       (page generated 2024-11-30 23:00 UTC)