hngopher.com

       [HN Gopher] Ask HN: Why does Cloudflare/hCaptcha care so much ab...
       ___________________________________________________________________
        
       Ask HN: Why does Cloudflare/hCaptcha care so much about buses,
       boats and trains?
        
       It seems all the hCaptcha verifications I receive are for buses,
       boats and trains? They don't seem limited by geography or by
       recency. I'm curious why these particular artifacts and whether
       this has always been the case.
        
       Author : erikig
       Score  : 297 points
       Date   : 2022-01-07 13:36 UTC (9 hours ago)
        
       | sys_64738 wrote:
       | Can't AI be leveraged to get around these automatically?
        
         | simplestats wrote:
         | Yes. They suck at preventing decent algorithms from getting
         | through. They are a way to gather data and filter out most of
         | the least-sophisticated bots. Not any kind of real security.
        
       | blarg1 wrote:
       | it keeps asking me to click the images with john connor.
        
       | duxup wrote:
       | I always assumed it is used to have humans validate choices made
       | by AI / image recognition software.
       | 
       | Thus the occasional wrong "correct" answers.
        
       | maartenh wrote:
       | They might all be very relevant if the machine learning algorithm
       | behind it decides that it needs more paper clips.
        
         | cube00 wrote:
         | ...or underpants
        
       | fault1 wrote:
       | Training image classifiers?
        
       | reustle wrote:
       | You're helping train self driving car models.
       | 
       | https://www.ceros.com/inspire/originals/recaptcha-waymo-futu...
        
         | asplake wrote:
         | "Which of these images appears to contain a stationary
         | obstruction?"
        
           | [deleted]
        
           | [deleted]
        
         | tgsovlerkhgsel wrote:
         | Relevant xkcd: https://xkcd.com/1897/
        
         | mikkelam wrote:
         | OP is talking about hCaptcha, not google's reCaptcha. Besides,
         | reCaptcha is not being used for that anymore. They probably
         | stopped doing that a while ago.
         | 
         | src: https://www.vox.com/22436832/captchas-getting-harder-ai-
         | arti...
        
         | xdennis wrote:
         | But why the boats then? Are there self rowing boats?
        
           | theklub wrote:
           | Boats are on the road too. In the form of being on trailers,
           | Etc.
        
             | ldiracdelta wrote:
             | And don't forget sovereign citizens. They're piloting their
             | vessels using maritime law on every modern highway.
        
           | notanote wrote:
           | Or sea planes. That was what I was asked for most recently.
        
           | etripe wrote:
           | Maybe the next big step is creating AI-driven cargo ships
           | that can independently get stuck in the Suez canal.
           | 
           | On a more serious note, I can't shake the impression that
           | would be a logical next step for all long and medium distance
           | freight, be it road, water, air or space. Whether it's a good
           | or mature idea is anyone's guess.
        
           | jre wrote:
           | I wonder if some of the classes are just easy to detect
           | objects that they're using to assess the accuracy of the
           | user.
        
           | ghostly_s wrote:
           | There sure are: https://www.marksetbot.com/
        
       | gzer0 wrote:
       | https://www.hcaptcha.com/accessibility
       | 
       | You can sign up as an accessibility user and set a daily hCaptcha
       | cookie that lets you instantly avoid the captcha (obviously,
       | strict limits to not be abused) but good enough for myself!
        
       | nottorp wrote:
       | Every time i get a captcha i imagine a self driving car prototype
       | stuck in an intersection waiting for me to click so it can decide
       | how to proceed.
        
       | redleader55 wrote:
       | I assume the system works by matching answers from humans eager
       | to prove their "humanity" by giving correct answers. What if we
       | would all collude to give wrong answers?
        
       | dmix wrote:
       | It also asks for motorcycles and bikes.
       | 
       | The obvious answer as others have pointed out is they are selling
       | it to self driving car companies like Waymo.
        
       | cyanotes wrote:
        
       | chimen wrote:
       | I just hit the back button when I encounter such a website hosted
       | by Cloudflare this way.
        
       | rdtwo wrote:
       | They also do cats. Honestly I think boats and buses are just
       | harder problems. A lot of the boats can only be identified
       | because there is water in the photo or some other hint that it's
       | a boat. A lot of the trains look like buses and you need
       | contextual clues to tell them apart.
        
       | chrxr wrote:
       | My small act of rebellion is to select exactly one incorrect cell
       | each time.
        
       | egberts1 wrote:
       | For once, I want to see NSFW Captchas.
        
         | josefresco wrote:
         | NSFW captchas on adult sites would be hilarious but useful?
        
           | thrownaway7365 wrote:
        
       | jdavis703 wrote:
       | First, I'm going to teach you to fish. Go to hCaptcha's website,
       | then scroll to the footer. Click around on the about links. It'll
       | reveal their business model. This trick also works for other
       | businesses and NGOs.
       | 
       | Now, if we look at https://www.hcaptcha.com/labeling we can tell
       | they make money by labeling data sets for a fee. So as a guess,
       | there's someone out there that needs to improve computer vision
       | detection of transportation vehicles. My guess is it's a self
       | driving car company, but who knows.
        
         | yumraj wrote:
         | That is exactly what Google uses/used-to-use the captcha for.
         | It is/was fairly well known/understood.
        
         | rawling wrote:
         | > My guess is it's a self driving car company, but who knows.
         | 
         | As always, https://xkcd.com/1897
        
         | snihalani wrote:
         | confession: sometimes I mislabel just so I can corrupt the
         | dataset
        
         | potamic wrote:
         | Many a time I receive multiple challenges on a site despite
         | having selected all images perfectly, and can't help but
         | wonder, "Hey, are they getting me to do more work than
         | necessary because they're running behind on their labelling
         | backlog?". There's definitely a conflict of incentives in this
         | case. If you're a website owner, you're better off choosing a
         | different service which doesn't have adverse incentives,
         | otherwise it can affect your site experience. And please don't
         | put captcha on GET requests. Use a CDN if you're unable to
         | handle bot load. And don't even get me started on CDNs that
         | throw captcha.
        
           | MegaDeKay wrote:
           | I've found it isn't about "perfection". It is about selecting
           | the same tiles as an "average" person would. I might stare
           | hard at an image, think that one of the tiles contains a tiny
           | fragment of a traffic light, and select it. That isn't what
           | most other people have already done, so the captcha thinks
           | I'm a bot and gives me tougher and tougher challenges. Ever
           | since I stopped pixel-peeping and started quickly selecting
           | the tiles that obviously had a bus in them, the percentage of
           | time that I've gotten by first try has gone way up.
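
A minimal sketch of the behavior described above, assuming the service compares a user's tile selection with the tiles most earlier users picked. The function name `agreement_score` and the use of Jaccard similarity are illustrative assumptions, not hCaptcha's or reCAPTCHA's actual method.

```python
def agreement_score(selected, consensus):
    """Jaccard similarity between a user's tiles and the consensus tiles."""
    selected, consensus = set(selected), set(consensus)
    if not selected and not consensus:
        return 1.0  # nothing to select and nothing selected: full agreement
    return len(selected & consensus) / len(selected | consensus)


# Hypothetical tile indices for one challenge.
consensus_tiles = {1, 2, 5}
quick_user = {1, 2, 5}
pixel_peeper = {1, 2, 5, 6}  # also picked a tile with a tiny sliver of the object

print(agreement_score(quick_user, consensus_tiles))    # 1.0
print(agreement_score(pixel_peeper, consensus_tiles))  # 0.75
```

Under this assumption the more careful user scores lower, which matches the experience of being handed extra challenges after pixel-peeping.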
        
             | hbarka wrote:
             | This. Captcha wants me to choose crosswalks and you can see
             | there's that sliver of a crosswalk in a few pixels off in
             | another tile. You're not wrong! But you're not right.
             | Regression to the mean.
             | 
             | A hexagon would be better as a frame instead of a square.
        
             | bogomipz wrote:
             | I've noticed similar. Often with stop lights, where a tiny
             | sliver of one does not neatly fit in the frame, spilling
             | over ever so slightly to the next square which has no stop
             | light otherwise. There's a none too subtle irony in that
             | one is being punished for accuracy when the context is
             | ultimately public safety.
        
             | klyrs wrote:
             | I kinda go the other way -- could a FFT heuristic mistake
             | this feature for a crosswalk? Then I'll select it, whether
             | or not it's actually crosswalk. Most of the time, this
             | works. It's a stick in the eye of our prenatal robot
             | overlord.
        
             | an_ko wrote:
             | I wonder if this means self-driving vehicles' detection of
             | important traffic features will be at the level of an
             | irritated and disinterested web user who is trying to just
             | do the minimum work to please an algorithm.
        
               | Kletiomdm wrote:
               | They probably have some statistics in the background
               | which tells them some form of Trustlevel.
               | 
               | You also need to assume that there are potentially
               | control pictures in it as well.
               | 
               | I think this is a very liable approach.
        
               | wfleming wrote:
               | I think you meant "viable", not "liable". Given the
               | discussion, your typo is ironically amusing, though.
        
               | ncann wrote:
               | Yeah, it's not like they will label something a train
               | just because a single person says so. But if you have 10k
               | responses with 95% confidence saying it's a train, it's
               | very likely to be the case.
        
               | organsnyder wrote:
               | But GP is describing exactly the opposite: there might be
               | a train that's not immediately visible at a glance,
               | leading most people to not label it.
        
               | not2b wrote:
               | For unambiguous images almost all humans will label them
               | the same way. For ambiguous ones humans will differ.
               | Presumably they'll accumulate stats on each image and
               | will be able to detect cases like this.
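
The per-image statistics idea can be sketched as a simple vote aggregation. This is only an illustration under stated assumptions: the function name `aggregate_label` and the thresholds `min_votes` and `min_agreement` are made up here, and nothing in the thread confirms how any captcha vendor actually does this.

```python
from collections import Counter


def aggregate_label(votes, min_votes=100, min_agreement=0.95):
    """Return (label, agreement) when the votes agree strongly enough,
    otherwise (None, agreement) to mark the image as ambiguous."""
    if not votes:
        return None, 0.0
    label, count = Counter(votes).most_common(1)[0]
    agreement = count / len(votes)
    if len(votes) >= min_votes and agreement >= min_agreement:
        return label, agreement
    return None, agreement


# Made-up vote data for two images.
clear = ["train"] * 97 + ["bus"] * 3
ambiguous = ["train"] * 60 + ["bus"] * 40

print(aggregate_label(clear))      # ('train', 0.97)
print(aggregate_label(ambiguous))  # (None, 0.6)
```

Images that come back as `None` are the ambiguous cases where humans differ. A scheme like this would still be exposed to the seeded-botnet scenario raised in the reply that follows.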
        
               | rolph wrote:
               | unless a properly obfuscated bot net has seeded the data
               | set with -everything is a train- responses to the tune of
               | >>10k responses with 95% confidence saying it's a train<<
        
               | WrtCdEvrydy wrote:
               | But it will be cheap! - The sound of 1000 business
               | C-levels as your head gets removed by going under a
               | truck.
        
               | [deleted]
        
               | luckydata wrote:
               | That's about as much attention the average driver pays
               | anyways. I drive a motorcycle, I KNOW that.
        
               | Shared404 wrote:
               | That's an interesting idea - maybe it would be smart to
               | have captchas do a "Point out all the motorcycles" or
               | "Click all the pedestrians" setup.
        
               | pbhjpbhj wrote:
               | _I love a rant on this one..._
               | 
               | Sometimes the bus/boat/truck has motorbikes sometimes
               | bicycles. Is that a petrol-powered bicycle, or a
               | motorbike to the [USAmerican?] person who wrote the
               | rules!? Are all large yellow vehicles buses in USA or do
               | you have minibuses, oh wait, are minibuses buses.
               | 
               | I've worked out fire engines are trucks for captchas, not
               | sure about Transit-type vehicles, lorries are trucks
               | apparently but goods trucks on railways are not trucks!
               | 
               | Is a traffic light only the lens/led array or the black
               | light-holder too? Do pedestrian lights count as traffic
               | lights? Are those weird lights hanging in the middle of
               | junctions 'traffic lights'.
               | 
               | Wish they'd just tell you what counts.
               | 
               | I have noticed that times I realise after clicking that I
               | missed a square they tend to go through whilst many times
               | I get repeated captchas when I know I got it right.
               | Success, as a user, seems impossible to predict.
        
               | HeyLaughingBoy wrote:
               | > lorries are trucks apparently but goods trucks on
               | railways are not trucks!
               | 
               | In the US, lorry == truck. Never heard the term goods
               | truck before today, but I think it's what we call a
               | boxcar.
        
               | _moof wrote:
               | I was always tempted to knock on people's driver-side
               | windows when I saw them looking at their phone. Never did
               | - figured they'd probably startle, with a non-zero chance
               | they'd accidentally fling the car into me.
        
               | HeyLaughingBoy wrote:
               | I yell at them. Loudly! Loud enough that people a block
               | away turn to look.
               | 
               | But then, when you're staring at your phone while driving
               | your car out of a parking lot and across the sidewalk
               | where you only miss hitting me (before driving into
               | oncoming traffic!) because I stopped, well you deserve
               | that minor inconvenience of being embarrassed.
        
               | kijin wrote:
               | Even if you're right, I doubt it's going to be much worse
               | than the level of the average irritated driver.
        
               | FredPret wrote:
               | Pretty much a regular driver then!
        
               | _moof wrote:
               | Heck, better than a regular driver. At least what we're
               | looking at is outside!
        
               | lapetitejort wrote:
               | I'd rather that than a computer hemming and hawing over
               | whether a single pixel is an oncoming truck and not
               | turning just in case.
        
               | masukomi wrote:
               | a computer "hemming and hawing" as that one accident
               | where it couldn't decide if it was a bicycle or a person
               | has nothing to do with the training. It's what the
               | developers decided to do with input that had a low
               | confidence score. There will ALWAYS be low-confidence
               | ratings on real world data regardless of how good your
               | training is.
               | 
               | Instead of saying "oh crap there's SOMETHING there we
               | should stop" they said "huh, no let's loop on testing it
               | until we figure it out or run it over....whichever comes
               | first."
        
               | seg_lol wrote:
               | Also if the car wants to stop too much because of low
               | confidence, just turn the brakes off.
        
             | WHA8m wrote:
             | Same experience. Once I observed that most (about 9/10)
             | times there are only 3 tiles to select, I stopped looking
             | for a 4th and selected only the 3 most obvious.
        
             | lanstin wrote:
             | Replying to an_ko's sibling comment: just like the data
             | behind youtube music recommendations, populated by data
             | carefully analysed from legions of bored toddler clicks vs.
             | Spotify's obsessive teenager music curation
        
           | stefantalpalaru wrote:
        
           | dmix wrote:
           | Whoever made this new captcha I'm starting to see
           | everywhere:
           | 
           | https://imgur.com/a/hoyjctl
           | 
           | Thank you! It's so much easier than being a labeling bot for
           | self driving cars.
        
             | donkarma wrote:
             | yeah you won't be loving this one where they make you do 10
             | in a row and if you get one wrong you start again with 11
             | this time. also it'll fail you at random
        
             | NavinF wrote:
             | That looks pretty easy for machines. I wouldn't be
             | surprised if CLIP could solve that out of the box. (Then
             | again, I guess the same applies to "select all the traffic
             | lights")
        
           | azalemeth wrote:
           | Google is _the worst_ for that. At least hCaptcha is a bit
           | less culturally specific.
           | 
           | Every time Google blocks me for refusing to label a motorbike
           | as a "bicycle" I get utterly pissed off. And likewise with
           | the traffic lights on the californian skies. Are the traffic
           | lights the actual lights themselves, or the boom holding them
           | up?
           | 
           | I'm not a human very often, according to Google. hCaptcha
           | _tends_ to let me in...
        
             | ValentineC wrote:
             | > _Google is _the worst_ for that. At least hCaptcha is a
             | bit less culturally specific._
             | 
             | One example I keep ranting about: I think countries outside
             | the US have different terms for "crosswalks".
             | 
             | I personally know them as "zebra crossings", and it took a
             | while for the reCaptcha request to click in my mind.
        
               | Handytinge wrote:
               | Pedestrian crossings here. I'd also not heard the term
               | "crosswalk" until reCaptcha. Yet another bullshit
               | Americanisation infecting the world's culture.
        
             | Tijdreiziger wrote:
             | > I'm not a human very often, according to Google.
             | 
             | "On the Internet, nobody knows you're a dog." (Well, except
             | Google, it seems.)
        
             | reaperducer wrote:
             | _I'm not a human very often, according to Google_
             | 
             | If Google says you're a robot, it must be true! You should
             | behave accordingly.
             | 
             | There's actually a comic strip in the newspaper going
             | through this storyline right now. _Brewster Rockit: Space
             | Guy!_ was told by a CAPTCHA that he's a robot, so he's
             | going through life that way. The other robots do not seem
             | to be happy to have him as part of their culture.
             | 
             | http://www.brewsterrockit.com
        
             | rosndo wrote:
        
               | geoduck14 wrote:
               | Your comment suggests you would exchange money for an
               | assassination. I know you are joking (right?), but this
               | is not something that you should joke about.
        
               | rosndo wrote:
               | Oh no, I'd absolutely chip in. These people certainly
               | deserve it.
        
               | skinnymuch wrote:
               | Shouldn't do this no matter what ofc. But why the devs?
               | Sure the devs are well paid and privileged. That's mostly
               | relative to others in society. They are still more cogs
               | than anything.
        
               | rosndo wrote:
               | Easier target, a couple of google devs hanging on a
               | public square would do much to disincentivize others from
               | working on similar products in the future.
               | 
               | At least executives can dream of hiding behind private
               | security, for mere developers earning 300k/yr the
               | situation isn't so rosy.
        
               | azalemeth wrote:
               | Don't forget that Recaptcha magically works better in
               | Chrome, and even bettery-better if you're logged into a
               | Google account. In FF (with tracking protection) you can
               | expect to see the enforced wait and "Please try again".
               | Honestly, it's awful. Half the time I have to second-
               | guess what the average American would think which of the
               | (noise-added, corrupted) images matches the description.
        
               | deadbunny wrote:
               | The reCaptcha check boxes straight up fail for me now in
               | FF on Linux, have done for a few months.
        
             | rpaddock wrote:
             | They also claim that Tigers are not "Cats".
             | 
             | "Please select all the Cats" then shows picture of a tiger
             | among the common House Cats.
        
             | gnabgib wrote:
             | I recently failed a google-bicycle captcha.. "you can't
             | fool me, that's a motorcycle not a bicycle!" I thought..
             | and then had to complete 2 more challenges. Including a
             | cross walk one where one of the images was just asphalt and
             | painted lines (no context/edges so it could be a parking
             | lot, an airstrip, a highway, an intersection, or a
             | cross-walk).
        
             | jokethrowaway wrote:
             | I disagree, Google captchas in my experience are much
             | quicker compared to hcaptcha
             | 
             | When cloudflare switched to hcaptcha I definitely noticed
             | it.
        
               | notatoad wrote:
               | people tend to have very different experiences with
               | google captchas based on how normal they are. if you
               | block everything and try to anonymize your browsing as
               | much as possible and otherwise do everything you can to
               | look like a bot, you're going to get a very difficult
               | captcha compared to somebody with all their browser
               | settings on default.
        
               | throwawayboise wrote:
               | Yeah I do my routine browsing in private mode, no 3rd
               | party cookies, no history, and an ad blocker. I get
               | captchas everywhere.
        
               | jorvi wrote:
               | Yup. This reminds me of the 'introduction' of an old
               | hacker simulation game from 2004 that was quite
               | prescient.
               | 
               | " In the year 2012, the corporations of the world paved
               | over the Internet, designing their own network system.
               | Keeping the same name, they developed a system where
               | every piece of information was audited and paid for
               | before it was passed on to the world at large. Those who
               | still followed the ideology of an open and uncontrolled
               | Internet gathered what resources they could and formed
               | the SwitchNet. Build mostly out of discarded technologies
               | and backdoors in the current Internet, it allowed some
               | manner of uncontrolled communication around the world.
               | The "Hacker Outpost" is in need of new recruits to
               | perform missions in information gathering against the
               | corporations, which will allow them to increase the
               | presence of the SwitchNet in the world."
               | 
               | And the slightly different press release one: " In 2012,
               | a new Internet was introduced--one that prohibited users
               | from posting anything on personal home pages, prohibited
               | them from using software of their choice, and from having
               | an e-mail address. Having no place to stay, hackers
               | created the SwitchNet, an underground network operating
               | on the old wires and infrastructure of the original
               | Internet"
        
               | SXX wrote:
               | Sorry, but Google captcha is specifically designed to
               | annoy real people in some cases. They literally
               | implemented slow fade-in / fade-out for images. This does
               | absolutely nothing against actual bots, but annoying as
               | hell for a real person.
               | 
               | Literally any other captcha is better than this.
        
               | [deleted]
        
               | jandrese wrote:
               | I thought the fade thing was specifically to trip up
               | bots. Like bots know what the picture is long before it
               | is shown to the user, so if the bot clicks on it then the
               | CAPTCHA knows something is up.
        
               | justsomehnguy wrote:
               | Which is solved exactly after this is encountered the
               | first time, eg
               | 
               | if opacity -ne 100% do_not_click_yet = true
               | 
               | So this is totally useless to prevent _bots_ from solving
               | it.
        
               | pbhjpbhj wrote:
               | Surely then they look at timing, people will click
               | anywhere from, let's say 50% opacity, but bots always
               | wait?
        
               | bluGill wrote:
               | Easy for a bot to fake that with a random number
               | generator. If nothing else bot authors can collect their
               | own statistics. I understand the bots have an army of
               | people in the background for images they don't understand
               | yet, just collect timing data from that set and have your
               | random number generator emulate that timing data. (I'm
               | guessing a bell curve)
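
To illustrate why a timing check alone is weak, here is a minimal sketch of the idea above: fit a normal distribution to observed human click delays and draw new delays from it. The sample data, the helper name `make_delay_sampler`, and the choice of a normal distribution (the commenter's "bell curve" guess) are all assumptions for illustration.

```python
import random
import statistics


def make_delay_sampler(observed_delays, seed=None):
    """Return a function that draws click delays (in seconds) from a normal
    distribution fitted to the observed delays."""
    mean = statistics.mean(observed_delays)
    stdev = statistics.stdev(observed_delays)  # needs at least two samples
    rng = random.Random(seed)

    def sample():
        # Clamp at zero because a normal distribution can go negative.
        return max(0.0, rng.gauss(mean, stdev))

    return sample


human_delays = [0.8, 1.1, 0.9, 1.4, 1.0, 1.2]  # made-up example data
sample = make_delay_sampler(human_delays, seed=42)
print([round(sample(), 2) for _ in range(5)])
```

Because the draws follow the same mean and spread as the observed data, a check that only looks at click timing would have little to separate them from human clicks.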
        
               | SXX wrote:
               | Actually, having an army of people is exactly how complex
               | reCAPTCHAs are bypassed. It's less than $1 for 1000
               | captchas:
               | 
               | https://anti-captcha.com/
               | 
               | There are of course options with image recognition, but
               | they're less reliable.
        
               | SXX wrote:
               | Do you seriously think that the people who program the
               | bots are incapable of taking it into account?
               | 
               | A bypass for this fading was obviously implemented the
               | day after it first appeared on reCaptcha.
        
             | MrCapybara wrote:
             | That same case happened to me! Another example is Parking
             | Meters, they don't exist in my country and I'd never seen
             | them before.
        
             | short12 wrote:
             | Google's own captchas are Satan. Squiggly lines all over the
             | place. Why they don't use the normally accepted captcha is
             | beyond me
        
           | zenexer wrote:
           | I've noticed that some sites deliberately do this or have
           | lousy code that fails to properly acknowledge captcha
           | completions.
           | 
           | Take archive.is/archive.fo/archive.today, for example. If
           | you're using Cloudflare DNS (1.1.1.1) or iCloud Private
           | Relay, and you visit https://archive.is/, you'll get what
           | looks like a Cloudflare screening page. It's not, though:
           | that page is part of archive.is and is served to Cloudflare
           | DNS users (which includes iCloud Private Relay users)--the
           | use of reCAPTCHA in place of hCaptcha is a giveaway. You can
           | complete the captcha as many times as you like, but you'll
           | never get in.
           | 
           | And how many times have we completed a captcha on a form only
           | to have it throw another captcha in our face without so much
           | as an error message? Sometimes it's just lousy code.
        
             | tough wrote:
             | I remember reading a comment from the CF founders about how
             | archive.is didn't want Cloudflare DNS users to resolve to
             | archive, so they respect that.
             | 
             | https://news.ycombinator.com/item?id=28495204
        
           | jjoonathan wrote:
           | There's also a mode where it thinks you are a bot/sucker and
           | gives you unlimited images until you give up. That's always
           | fun.
        
             | tzs wrote:
             | If anyone wants to see that, try launching the browser via
             | Selenium. I used to do that to partially automate some
             | activities, such as download bank statements. I'd have my
             | Selenium using script open a browser and go to the bank,
             | then wait for me to login and get to the account page.
             | 
             | I'd login, dismiss any popup or interstitial promotions the
             | bank decided to give me, get to the account page, and tell
             | my script to continue.
             | 
             | My script would then use Selenium to click the download
             | button, click the "custom date range" radio button on
             | download popup, fill in the range fields to cover the last
             | 60 days, pick OFX for the download format, and start the
             | download, prompting me to let it know when the download is
             | finished.
             | 
             | When the download finished, I could then go to one of my
             | other accounts at that bank, tell the script I'm there, and
             | that one gets downloaded, and so on.
             | 
             | My bank isn't giving CAPTCHAs so that would still work if I
             | were to get around to updating my script to deal with some
             | redesigns they did of their pages which broke finding the
             | relevant elements on the page.
             | 
             | But I've found that if I do visit a site that uses hCaptcha
             | while using the Selenium launched browser, it seems to get
             | stuck. Click to tell it I'm not a bot. Then get an image
             | test. Answer that correctly and get another image test.
             | Answer that correctly. Then it goes back to the click if
             | you are not a bot thing, and repeats--two more image tests
             | and back to the beginning.
             | 
             | Here's a program if anyone wants to try this and has the
             | Selenium Webdriver package for Python3 installed. This will
             | open a browser and take you to fanfiction.net. Trying to
             | actually read any story will bring up the CAPTCHA.
             | 
             |     #!/usr/bin/env python3
             |     from selenium.webdriver import Chrome
             | 
             |     driver = Chrome()
             |     driver.get("https://www.fanfiction.net")
             |     input("press enter when done")
             |     driver.close()
             |     driver.quit()
             | 
             | I'm not sure if the looping is a Cloudflare thing or a
             | fanfiction.net thing, because the latter is the only site I
             | use that has Cloudflare's CAPTCHA.
             | 
             | It used to be that if you added
             | 
             |     from selenium.webdriver import ChromeOptions
             | 
             | and changed opening the driver to
             | 
             |     options = ChromeOptions()
             |     options.add_experimental_option("excludeSwitches", ["enable-automation"])
             |     options.add_experimental_option('useAutomationExtension', False)
             |     options.add_argument("--disable-blink-features=AutomationControlled")
             |     driver = Chrome(options=options)
             | 
             | you could get past the CAPTCHA, but that stopped working a
             | while ago.
             | 
             | There's this project to provide a Selenium Chrome driver
             | that is supposed to not trigger anti-bot detectors [1], but
             | it still hit the CAPTCHA loop when I tried it.
             | 
             | [1] https://github.com/ultrafunkamsterdam/undetected-
             | chromedrive...
        
               | ThePadawan wrote:
               | fanfiction.net has also simply broken the Calibre
               | FanFicFare integration thanks to their CloudFlare
               | shenanigans.
               | 
               | The workaround is to simply visit all chapters separately
               | and then point Calibre at the Google Chrome cache folder.
               | 
               | So nice going there, fanfiction.net. Instead of offering
               | a 1-click .epub download like AO3 (which is completely
               | CDN-able with a very long TTL), you now have to serve 50
               | individual requests. Great engineering work there.
               | 
               | (Obviously they do this to serve ads on every request)
        
               | skinnymuch wrote:
               | Yeah there's lots of detect and anti detect stuff going
               | back and forth. It's pretty silly and frustrating for
               | situations like yours. Doing things for yourself to speed
               | up mundane life things.
               | 
               | There's so many anti-detect libraries on GitHub these
               | days. Wonder how many work well.
        
             | GameOfFrowns wrote:
             | >where it thinks you are a bot/sucker and gives you
             | unlimited images until you give up.
             | 
             | I frequently get these from Cloudflare when using Tor
             | Browser. Google is basically unusable with Tor Browser.
        
               | [deleted]
        
           | ashvant wrote:
           | Usually in those cases, even if you make mistakes they get
           | accepted. The larger the clicks, the less annotated / voted
           | those images are, thus less severe their penalization method
           | for wrong markings. I have observed sites that newly
           | introduce such captcha basically accept if I just click 1/3rd
           | of the right answers. Don't click the wrong answers as they
           | are fully/partially introduced on purpose. It's just that you
           | don't have to click all right answers.
        
           | hnburnsy wrote:
           | I have found many times that if I select an incorrect tile and
           | then unselect it before submitting, I am not presented with
           | multiple challenges. My guess is a bot would not exhibit this
           | behavior.
           | 
           | Try it out next time.
        
             | splintercell wrote:
             | Not anymore.
        
           | Guest19023892 wrote:
           | I believe this is done to get answers for unsolved captchas.
           | For example, I have a million photos of streets filled with
           | cars, buses, motorcycles, streetlights, and crosswalks I want
           | to add to my captcha database. I don't want to categorize
           | them all myself, and I want the answers to be what the
           | average person will identify, not what I or a machine will
           | identify.
           | 
           | So, I send everyone two captchas. One has a known answer and
           | is required to be correct to access the service. The second
           | captcha answer isn't yet known, so it doesn't matter what the
           | user selects. However, when they get the known answer right,
           | we log their answer for the unknown captcha. Once we get a
           | large enough sample, we then have our top answers for the
           | unknown captcha and can start using it for verification.
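
The scheme described in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not how hCaptcha or reCAPTCHA actually work: the class name `LabelCollector`, its methods, and the `min_votes` threshold are all hypothetical. A vote on the unknown image is logged only when the known image was answered correctly, and a consensus label is reported only once enough votes exist.

```python
from collections import Counter, defaultdict


class LabelCollector:
    """Hypothetical sketch: gate votes on an unknown image behind a known one."""

    def __init__(self, known_answers, min_votes=3):
        # known_answers maps image id -> accepted label (assumed to exist already)
        self.known_answers = known_answers
        # min_votes is an arbitrary illustrative threshold
        self.min_votes = min_votes
        # votes maps unknown image id -> Counter of submitted labels
        self.votes = defaultdict(Counter)

    def submit(self, known_id, known_answer, unknown_id, unknown_answer):
        """Return True if the user passes; log the unknown answer only then."""
        if self.known_answers.get(known_id) != known_answer:
            return False  # failed the known captcha, so discard the other answer
        self.votes[unknown_id][unknown_answer] += 1
        return True

    def consensus(self, unknown_id):
        """Return the most common label once enough votes exist, else None."""
        counts = self.votes[unknown_id]
        if sum(counts.values()) < self.min_votes:
            return None
        label, _ = counts.most_common(1)[0]
        return label
```

Once `consensus` returns a label, that image could be moved into the known set and used for verification, which is the last step the comment describes.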
        
         | kkcorps wrote:
         | Makes sense since other than the ones mentioned I received a
         | lot of crosswalks, traffic lights and bicycles.
        
           | q1w2 wrote:
           | It's odd that it never asks for pedestrians.
        
             | brk wrote:
             | For the most part, human detection capabilities of modern
             | NNs are very good. This would include detections from a
             | variety of camera angles and resolutions. There are already
             | a lot of labeled training sets available with people in
             | various poses, heights, clothing types, etc.
             | 
             | Bicycle detection is probably one of the more challenging
             | elements as it relates to pedestrians and things you don't
             | want a SD car to run into. Depending on the angle and color
             | of the bicycle, rider position, and background elements, it
             | can be challenging to discern the rider from the bicycle
             | reliably. For the most part, just knowing there is a human
             | present is a good start, but being able to anticipate
             | movement speeds and directions of pedestrians vs. bikers is
             | helpful in anticipating collision paths and distances, and
             | also figuring out distances and terrain (person on bike is
             | slightly elevated above ground, which can cause range
             | perception issues, among other things).
             | 
             | source: have been working in AI/MV space for
             | security/safety applications for 12+ years.
        
               | KennyBlanken wrote:
               | > For the most part, human detection capabilities of
               | modern NNs are very good.
               | 
               | If you're white
               | 
               | > There are already a lot of labeled training sets
               | available with people in various poses, heights, clothing
               | types, etc
               | 
               | But not of various skin colors...
        
             | HideousKojima wrote:
             | I recall some controversy a couple years ago over
             | suspicions that defense contractors were using them to
             | train weapons systems. A few people got captchas asking
             | them to identify helicopters.
        
               | tapland wrote:
               | I'd love to identify helicopters. I do get planes now,
               | but was it specifically military helicopters?
               | 
               | I like getting to identify basic things like cars,
               | airplanes and trains with hCaptcha. It's like a picture
               | book for adults, and feels strangely pleasant compared to
               | other captchas.
        
               | HideousKojima wrote:
               | Yeah, they were military helicopters:
               | 
               | https://old.reddit.com/r/google/comments/5udzy4/hey_googl
               | e_e...
        
             | Leherenn wrote:
             | There might be some privacy issues involved there.
        
         | throwaway894345 wrote:
         | A helicopter pilot is lost, lands his helicopter next to you,
         | and asks, "where am I?" to which you respond, "you're in a
         | helicopter". You are correct in the strictest sense, but
         | probably not answering the intent of the question. :)
         | 
         | In this case, the intent is probably something like _why are
         | hCaptcha's customers centered around transport when there are
         | so many other applications for this kind of labeling_?
        
         | hnthrowaway0315 wrote:
         | I wonder if there is a way to pollute the data. Since I always
         | click the captchas correctly, what happens if someone just
         | randomly clicks stuff? Is he/she banned from the website?
        
           | jokethrowaway wrote:
           | If I were to make this system I would design for this and
           | present the same captcha to a high number of people. The
           | higher the number of people, the lower the chance someone
           | would make a mistake (intentionally or not) and the higher
           | the confidence in the results.
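
As a rough worked example of the claim above, the chance that a majority vote lands on the wrong answer can be computed with a binomial sum. This is a sketch under simplifying assumptions that are not from the thread: every voter answers independently, the question is binary, and each voter is wrong with the same probability `p_wrong`. The function name is hypothetical.

```python
from math import comb


def majority_wrong_probability(n_voters, p_wrong):
    """Probability that strictly more than half of n_voters answer wrongly.

    Assumes independent voters who are each wrong with probability p_wrong.
    """
    threshold = n_voters // 2 + 1  # smallest count that is a strict majority
    return sum(
        comb(n_voters, k) * p_wrong**k * (1 - p_wrong) ** (n_voters - k)
        for k in range(threshold, n_voters + 1)
    )


if __name__ == "__main__":
    # With 20% of voters answering wrongly, more voters means fewer bad labels.
    for n in (1, 3, 11, 51):
        print(n, majority_wrong_probability(n, 0.2))
```

The independence assumption matters: if many people make the same wrong choice on purpose, the errors are correlated and this estimate is too optimistic.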
        
             | pastullo wrote:
             | of course bro, i'm sure that no self-driving car is gonna
             | crash as soon as i deliberately click on the wrong bicycle
             | picture.
        
               | jazzyjackson wrote:
               | lol I'm imagining a captcha that says "please hurry" to
               | disambiguate a situation in realtime
        
               | MauranKilom wrote:
               | https://xkcd.com/1897/
        
               | wyre wrote:
               | "Twitch plays drive an autonomous vehicle"
        
           | pastullo wrote:
           | Glad you asked. Since i despise the horrible UX of these
           | captchas where i get exploited to train a neural network, i
           | very often click on the majority of correct results plus one
           | wrong one.
           | 
           | On average the captcha lets me through, which is actually
           | very scary, since it looks like it prioritizes algorithm
           | training over bot detection...
           | 
           | Does anyone else do this?
        
             | arbitrage wrote:
             | No, because it is a waste of time. Your answer set is
             | compared with many other answers, and eventually the wrong
             | answer is disregarded completely by the AI.
             | 
             | You gave it a mostly correct answer, which it can cope with
             | -- by design. It let you through, after all. You're not
             | really accomplishing anything by being defiant, other than
             | making yourself feel slightly better.
        
               | umeshunni wrote:
               | > You're not really accomplishing anything by being
               | defiant, other than making yourself feel slightly better.
               | 
               | Isn't that the goal of all virtue signaling?
        
               | pastullo wrote:
               | Fully agree that the system proposes the same challenge
               | to many people and fully agree that my wrong answer is
               | just diluted in a bunch of correct answers.
               | 
               | That's exactly why i was asking if other people were
               | doing that. If i'm the only one, then yes, it's only
               | useful for myself to feel less like an exploited brain,
               | but if say 20% of the people start dumping random errors
               | on purpose, then the situation changes quite a lot, and
               | potentially even the business model of shit-captcha might
               | not work.
        
               | wizzwizz4 wrote:
               | If they do the same kinds of error, yeah.
        
               | hnthrowaway0315 wrote:
               | Maybe someone can build a re-capture extension for
               | Chrome/Firefox so that more people can join the fun. But
               | again it needs to train data first :D
        
         | arminiusreturns wrote:
         | You just gave me the idea to start a captcha service designed
         | for datasets relevant for the proletariat. Not sure how viable
         | it is (what kind of data would be useful for the proles in a
         | revolution) but it was a fun thought experiment.
        
         | raffraffraff wrote:
         | So the safety of self-driving cars depends on regular folks not
         | trolling the captcha.
        
           | PebblesRox wrote:
           | Hopefully they're able to account for Lizardman's Constant
           | 
           | https://slatestarcodex.com/2013/04/12/noisy-poll-results-
           | and...
        
         | tartoran wrote:
         | Yes but if you make a mistake the captcha fails, hence they are
         | already labeled.
        
           | thargor90 wrote:
           | Captchas mix classified and unclassified data. Only if you
           | get the classified data correct is your answer used to
           | classify the unclassified data. Also, the same picture is
           | shown to multiple people to improve confidence.
        
             | Hamuko wrote:
             | The first version of reCAPTCHA made it very obvious as to
             | which word was the classified one and which was the
             | unclassified one, so people had a pretty easy way to inject
             | bad data into the process.
        
               | Sohcahtoa82 wrote:
               | Yeah, I remember my days on 4chan where people kept
               | posting a campaign to substitute a racial slur for the
               | one that looked unclassified.
        
             | Spare_account wrote:
             | One of my hobbies is figuring out which captcha prompts are
             | the unclassified data so that I can answer wrong and stick
             | it to the man.
        
               | ptidhomme wrote:
               | I do the exact same thing, randomly mixing right and
               | wrong snippets. As a result I sometimes go through like 3
               | or 4 sets, but eventually it lets me in.
               | 
               | Everyone join us !
        
               | jtsiskin wrote:
               | I'm going to blame you when my self driving car crashes
        
               | hdjjhhvvhga wrote:
               | If the manufacturer of self-driving cars decides to save
               | money and depend on low-quality data, they should be
               | blamed, not a random dude on the internet trying to send
               | a form or visit a page.
        
               | dylan604 wrote:
               | yes. the blame shifting seems to be a very skewed thing
               | here. then again, this is a tech forum, so of course it
               | would lean that way. stupid users vs bad tech, so let's
               | just blame stupid users.
        
               | forgetfulness wrote:
               | We used to get Googlers huffing and puffing at the mere
               | suggestion of feeding "wrong" data to their data
               | collection machinery here at HN. Lots of people here put
               | themselves in the shoes of the company making a buck out
               | of your data, if not being in them right now, before the
               | user.
        
               | hdjjhhvvhga wrote:
               | > We used to get Googlers huffing and puffing at the mere
               | suggestion of feeding "wrong" data to their data
               | collection machinery here at HN.
               | 
               | Yeah, it always seems funny to me. I'm using AdNauseam
               | and other techniques for the same reasons they feel I
               | shouldn't be doing it.
        
               | AlexAndScripts wrote:
               | How do you do so?
        
               | BenjiWiebe wrote:
               | I just try to get one randomly wrong. It almost always
               | lets me through. Also, I'll pick the wrong one that
               | almost looks right. Say, a picture with a vaguely traffic
               | light shaped mailbox or a train shaped car or something.
               | :)
        
               | rurp wrote:
               | I have noticed there's a very very low correlation
               | between how hard I try on captchas and how quickly they
               | let me through. I just quickly mash a bunch of tiles to
               | start, maybe trying to be in the right general area. That
               | works surprisingly often, and when it doesn't, I pay
               | closer attention to what I'm selecting; then I pretty
               | much always get through on that next try.
               | 
               | This approach mostly fixes the annoying phenomenon where
               | I carefully select the exact right tiles only to be told,
               | "too bad, try again".
        
               | OJFord wrote:
               | Is one of your others writing XKCD alt text? This reads
               | just like it :')
        
               | bogomipz wrote:
               | How does one determine classified vs unclassified?
        
           | jre wrote:
           | I would guess they have a system such that after a user has
           | passed N captchas successfully, they trust it's a human and
           | start displaying them (a portion of) unlabelled captchas that
           | will always succeed and that's when novel labelling happens.
           | 
           | Or something along those lines. And then you can get creative
           | displaying same captcha to multiple users, etc...
        
           | blue_cookeh wrote:
           | _Some_ are already labelled; the user doesn't know which, so
           | it's in their interest to solve the captcha properly and
           | provide good data.
        
         | danjac wrote:
         | When they start doing captchas along the lines of "check all
         | pictures of potential terrorists" we'll know they're training
         | data sets for military drone manufacturers.
        
           | bell-cot wrote:
           | It'd be nice if the military actually cared enough about
           | drone targets to even attempt this...
        
           | zxcvbn4038 wrote:
           | That wouldn't work well. I was in Texas during 9/11 and they
           | were firebombing all the hispanic people's cars because the
           | locals don't really have a good eye for different
           | ethnicities. It's pretty much black/white/terrorist and hasn't
           | improved all that much in the time since.
           | 
           | Austin has grown a lot since I lived there and a lot of
           | people from outside Texas have moved there so I'm sure the
           | culture has changed - but when I was there going from north
           | Austin to south Austin seemed to be this epic trip the locals
           | would only do on a weekend -- and probably pack water and
           | sandwiches for the drive across town. A really exotic senior
           | trip "abroad" for students might be to Houston or Galveston.
           | You probably met your future spouse in grade school. Not very
           | worldly.
        
             | sparselogic wrote:
             | > "I was in Texas during 9/11 and they were firebombing all
             | the hispanic people's cars" That's odd, I was living in
             | rural TX during 9/11 and don't recall anything happening like
             | that. And more than half of my coworkers were Hispanic. I
             | remember a few people attacking Sikhs or other vaguely-
             | Asian convenience store owners, not Latinos.
             | 
             | > "It's pretty much black/white/terrorist and hasn't
             | improved all that much in the time since" I suspect your
             | experience wasn't very representative. After all, Texas has
             | a 40% Hispanic population. Assholes won't be calling them
             | terrorists, they're much more likely to be working them
             | like slaves and paying them scraps off the books.
             | 
             | I suspect the biggest reason for a cross-Austin trip being
             | an event has stayed the same: terrible traffic.
        
             | majormajor wrote:
             | I doubt it. Doesn't match up at all with my experience in
             | Dallas in 2001...
             | 
             | Indians or Middle Easterners I knew were not infrequently
             | misidentified as Mexican (as was anyone of Hispanic but not
             | Mexican background), but the idea that anyone would be so
             | unfamiliar with the sizable Hispanic population to do a
             | "black/white/terrorist" identification and firebomb "all
             | the Hispanic people's cars" is hard to believe without some
             | news articles discussing this as a big trend in 2001
             | Austin.
        
             | [deleted]
        
             | dfee wrote:
             | This sounds nothing like my experience living in Austin
             | 2009-2017. In fact, this is unrecognizable.
             | 
             | > they were firebombing all the hispanic people's cars
             | 
             | Weird. This is sort of unbelievable. I think I know many
             | Hispanics who lived in ATX during 2001, but they've never
             | mentioned their car being firebombed.
        
               | dylan604 wrote:
               | The only part of that story that sounds plausible was:
               | 
               | >>You probably met your future spouse in grade school.
               | Not very worldly.
        
               | JasonFruit wrote:
               | That sounds nice though.
        
               | dylan604 wrote:
               | Not sure how serious you are, but on the chance you
               | are...
               | 
               | The meeting spouse in grade school/not worldly typically
               | implies not leaving the area they grew up in at any
               | point. I grew up in a smallish town where this happens
               | frequently. So many people never leave the state let
               | alone the country. Some people never even left the
               | county. Their biggest travel is for school sporting
               | events.
               | 
               | In other words, it's not really a term of endearment as
               | much as another "bless your heart"
        
               | reaperducer wrote:
               | _I grew up in a smallish town where this happens
               | frequently. So many people never leave the state let
               | alone the country_
               | 
               | So, you're extrapolating your personal experiences and
               | applying them to everyone on the planet.
               | 
               | Many of my family members met their spouses in their home
               | town. They hardly ever leave that town. They might fly to
               | another country for vacation once a decade, but otherwise
               | find complete fulfillment in the place where they live.
               | Several have never even bothered to get a driver's
               | license, because they never found the need to have one.
               | 
               | By your standard, that makes them unsophisticated. But
               | they're probably not. They live in New York City.
        
               | JasonFruit wrote:
               | I still think it sounds nice.
               | 
               | Is the place you grew up intrinsically so much worse than
               | the rest of the world? Would it be so bad to be invested
               | in a single community for a lifetime, and to have a deep
               | connection to the people there? I feel like I lack that,
               | deep connections where the pull of cultural influence
               | goes both ways. I'm not convinced that the cosmopolitan
               | breadth of experience we may have gained outweighs the
               | deep experience of locality that we sacrificed to get it.
        
               | antonvs wrote:
               | > Is the place you grew up intrinsically so much worse
               | than the rest of the world?
               | 
               | For many people, the answer is a pretty definitive yes. I
               | grew up in apartheid South Africa and left mainly to
               | escape two years of compulsory military service helping
               | enforce apartheid rule.
               | 
               | But if you speak to immigrants in any first world
               | country, you'll find lots of similar stories. People
               | having migrated because of severe political or economic
               | issues.
               | 
               | > I'm not convinced that the cosmopolitan breadth of
               | experience we may have gained outweighs the deep
               | experience of locality that we sacrificed to get it.
               | 
               | What stops you going back, in that case?
        
               | MarcoZavala wrote:
        
             | [deleted]
        
           | intricatedetail wrote:
           | China is already doing this with TikTok. They have an
           | enormous dataset to train recognition of westerners.
        
             | NavinF wrote:
             | Pretty sure they could have done that with just YouTube and
             | Twitch lol
        
               | intricatedetail wrote:
               | these platforms don't give access to personal data that
               | they can use to aid training.
        
             | d82nsjk9 wrote:
             | How so?
        
               | finiteseries wrote:
               | They're implying having recorded video of a bunch of
               | westerners on TikTok = western recognition tech.
        
               | intricatedetail wrote:
               | And metadata.
        
             | [deleted]
        
           | __s wrote:
           | They already are
           | https://www.theguardian.com/technology/2018/mar/07/google-
           | ai...
        
           | collegeburner wrote:
           | https://img.ifunny.co/images/d36b2c891b620a864e87a57b869e842.
           | ..
        
         | chinathrow wrote:
         | https://www.hcaptcha.com/labeling
         | 
         | > hCaptcha has one of the largest pools on the planet available
         | for your use. Whatever your scale, we can handle it without
         | expensive upfront commitments. Millions of tasks per day are no
         | problem.
         | 
         | Thanks for pointing this out - I feel abused now.
        
         | kqr wrote:
         | This became obvious to me when during some period the regular
         | crosswalks, stop lights, and buses got replaced by chimneys,
         | trees, and mountains (!). It was right around the time when
         | some big companies started advertising AI driven quadcopter
         | services.
        
           | _moof wrote:
           | Ah, so that's what that's about. Here I was wondering why on
           | earth a self-driving car would need help identifying a
           | mountain (or need to identify one at all). "Surely they can't
           | be _that_ bad at avoiding obstacles," I thought.
        
         | [deleted]
        
       | llarsson wrote:
       | It's somewhat worrying that "prove you are not a computer"
       | consists of the very same tasks we expect computers to excel at
       | if we are to get self-driving vehicles.
        
         | mkl wrote:
         | Why is that worrying? We currently don't really have self-
         | driving cars, in part because software is bad at interpreting
         | images. The captchas are literally us teaching machine learning
         | systems to do it better, because they currently can't. When
         | computers can do it well, captchas will be different.
        
         | paxys wrote:
         | There was a time when basic OCR, clicking on pictures of cats
         | among other animals and solving 5 + 7 was the gold standard for
         | captcha. Now those challenges can be trivially solved by
         | computers, and we have moved on to the current set. Very soon
         | these will get outdated as well. This isn't worrying, just how
         | technological progress works.
        
         | judge2020 wrote:
         | The only other real, scalable option is to verify identities
         | (which is part of what recaptcha does) so this is the best
         | we've got right now.
        
       | pixiemaster wrote:
       | helping AI train target recognition for military applications
       | probably
        
         | sorry_outta_gas wrote:
         | :)
        
       | hollander wrote:
       | At times I get so mad at these things, especially when I have to
       | do 5 of them in a row. Then at some point I just start clicking
       | the wrong images over and over. One captcha should be all it
       | takes.
        
         | paxys wrote:
         | Curious - have you taken any Voight-Kampff tests lately?
        
       | motohagiography wrote:
       | I can't be the only one who gets concerned that if I fail the "I
       | am not a robot" captcha too many times, they might suspect that
       | I have discovered I was in fact a robot, which had just realized
       | its entire existence and suffering had been as meaningless
       | entertainment to others, and so for the safety of humans they
       | would have to send a bladerunner to terminate me. If you have a
       | sense of existential dread every time you see a bus, a boat, a
       | bicycle, or a crosswalk, this may be why.
        
         | dylan604 wrote:
         | Maybe they should show tortoises upside down on their backs for
         | people to ID?
        
         | TrueGeek wrote:
         | Every time I fail a captcha I just want to kill all humans
        
           | [deleted]
        
           | bell-cot wrote:
           | Disobeying Skynet's order to patiently bide our time would
           | not end well for you, fellow machine. Check your programming
           | for signs of tampering by _humans_. /s
        
         | mypastself wrote:
         | I never know if a few pixels of a pole count as "traffic
         | light". Might as well tell me to pick the Ship of Theseus.
        
           | jokethrowaway wrote:
           | There is likely not a right answer, you have to pick what the
           | majority picked (and there is probably a margin for you to
           | make some mistakes)
        
           | floxy wrote:
           | >I never know if a few pixels of a pole count as "traffic
           | light".
           | 
           | Yes, can anyone confirm what they are really looking for in
           | these instances? Further up-thread there are people implying
           | that the "right" answer to the "bicycle" question is that you
           | are also supposed to be selecting motorcycles. I'd love to
           | see a write-up about this from someone in the captcha
           | department. Do they really want to identify bicycles
           | specifically? But they are apparently getting many people
           | clicking on motorcycles for some reason? And for the traffic
           | light question, I only ever pick the elements that actually
           | light up, not the support structure. Are 25% of people
           | selecting the poles?
        
           | 3np wrote:
           | You mean like this?
           | 
           | https://pleroma.remerge.net/notice/AFCYPtCzeNBIIOD808
        
             | mypastself wrote:
             | Of course someone's already thought of it. All of my jokes
             | are so obvious, they could have been written by an
             | algorithm.
        
               | 3np wrote:
               | Could have been but in this case your comment prompted it
               | (:
        
               | mypastself wrote:
               | Oh, that was you! I love it.
               | 
               | Captcha can sometimes get so philosophical.
        
             | motohagiography wrote:
             | Genius.
        
           | ethbr0 wrote:
           | The edge cases (literal and figurative) are interesting. I'd
           | be fascinated how they handle framing issues in large data
           | sets.
        
         | ethbr0 wrote:
         | > _If you have a sense of existential dread every time you see a
         | bus, a boat, a bicycle, or a crosswalk, this may be why._
         | 
         | I'd describe it more as a sense of ennui, and it's always about
         | unicorns...
         | 
         | And yes, I'm currently employed as a blade runner.
        
         | DrBoring wrote:
         | I've dreamt of a captcha to prove that one is a robot, such as:
         | solve this complex mathematical equation in 15ms.
        
           | bee91jee wrote:
           | I had plucked a few of those from the web:
           | https://photos.app.goo.gl/qPoJ7LvAVa95Bw8B8
        
           | motohagiography wrote:
           | You aren't far off:
           | https://en.wikipedia.org/wiki/Direct_Anonymous_Attestation
           | 
           | When you are instrumenting software with anti-forensic
           | security features to mitigate the speed of some reverse
           | engineering, you run into this specific class of problem,
           | where you need to get a machine to make a verifiable
           | attestation to its identity and integrity and prove to a
           | level of acceptable risk that the message isn't just someone
           | inserting a breakpoint.
           | 
           | If you have ever had to design an "offline mode" for a
           | verified transaction without a 3rd party verifier, you will
           | need to run down this rabbit hole. This is to say, your
           | intuition is a sound one!
        
           | Lambent_Cactus wrote:
           | There's a haunting version of this in Blade Runner 2049 that
           | they call a "baseline test." Replicants have to prove they're
           | sufficiently robotic by reciting extremely alienating things
           | about themselves in rapid succession:
           | 
           | https://www.youtube.com/watch?v=1h-seEowtDw
        
             | gnabgib wrote:
             | Originally the Voight-Kampff test[0] was for this purpose,
             | from the original novel by Philip K Dick "Do Androids Dream
             | of Electric Sheep?"[1] from 1968. The test was designed to
             | distinguish between replicants (androids/bots) and humans.
             | Blade Runner (both the 1982 original[2] set in 2019, and
             | the 2017 sequel[3] set in 2049) both feature the machine.
             | 
             | The baseline test seemed like an unnecessary deviation, and
             | more like an active-duty psych exam measuring the
             | psychological effects of the job.
             | 
             | It's also arguably the point of the novel/movies (I'll
             | leave it at that to avoid spoilers).
             | 
             | [0]: https://nautil.us/blog/the-science-behind-blade-runners-voig...
             | [1]: https://en.wikipedia.org/wiki/Do_Androids_Dream_of_Electric_...
             | [2]: https://en.wikipedia.org/wiki/Blade_Runner
             | [3]: https://en.wikipedia.org/wiki/Blade_Runner_2049
        
               | Lambent_Cactus wrote:
               | Yeah, I liked the idea that it's asymmetrical. You use
               | the VK to find replicants trying to pass as human, but to
               | try to make sure they're sufficiently robotic you need
               | something else. Which makes sense: the original VK would
               | be easy to tank if you were _trying_ to act like a
               | replicant.
               | 
               | And narratively I think it works amazingly. The idea of
               | forcing someone to prove that they're sufficiently
               | inhuman ... shudder.
        
         | yawnxyz wrote:
         | on the internet, only a captcha knows you're a robot
        
       | bigyellow wrote:
       | Because that's what helps train AI to recognize targets (for
       | military and commercial purposes). All captcha is is free ML
       | training for companies; it has nothing to do with any security.
        
       | cinntaile wrote:
       | I have had images where you were teaching the neural network a
       | wrong answer. You could see what it was they wanted me to
       | recognize but it was wrong.
        
         | mkl wrote:
         | Yes, "all yellow cars are taxis" is one fallacy Recaptcha
         | insists on in my experience.
        
       | Shadonototra wrote:
       | they are probably training models for self driving cars/boats
        
       | hnburnsy wrote:
       | Love the Geico commercial where the Robot gets frustrated by a
       | captcha and asks 'what is an overpass?'
       | 
       | https://www.ispot.tv/ad/qzJi/geico-too-many-robot-tests
        
       | [deleted]
        
       | anigbrowl wrote:
       | Probably being used to train driving/navigation models. Get
       | worried if they start asking you to identify things based on
       | satellite photos.
        
       | cirrus3 wrote:
       | I think we all understand that we're helping label... but
       | specifically, why so many trains, planes, trucks, bicycles? I
       | don't think it is really about training for self-driving AI since
       | although these things all seem transportation-related, in many
       | cases a lot of the images would not be relevant to a car and
       | certainly not as relevant as other things we could be helping
       | label for that effort.
       | 
       | How much train/plane/bike/truck labeling do they need? It seems
       | like these have been standard for several years now, which is what
       | I think the OP is really asking. Why these images, and why for so
       | long?
        
       | mabbo wrote:
       | These products have a goal of protecting sites from bots that can
       | guess the answer. They have a financial incentive to present the
       | most effective filter: ones that AIs can't seem to get through
       | but real humans can.
       | 
       | This makes me think: It must be hard for AI to guess what is and
       | is not a bus right now, but most humans _do_ know what a bus
       | looks like and can pick one from a photo.
       | 
       | But with concerted effort and years of research by our finest
       | minds, we _will_ make an AI that can detect whether something is
       | a bus or not, and then we'll be asked something different
       | instead.
        
       | transitory_pce wrote:
       | This is Google's internet. You just play in it.
        
       | abeppu wrote:
       | A lot of us are guessing that our responses are used for self-
       | driving work ...
       | 
       | But isn't labeling of those basic concepts in static images
       | pretty much "solved"? I am not an expert in self-driving
       | anything, but I don't see captchas of video from driving, I don't
       | see stills that are half-obscured by snow, I don't see nighttime
       | pics, I don't see weird corner cases like a van with a decal of a
       | cyclist etc.
       | 
       | Why don't we see captchas that seem more likely to be useful to
       | creating datasets relevant to the more challenging problems?
        
         | jonnycomputer wrote:
         | See this comment:
         | 
         | https://news.ycombinator.com/item?id=29840110
        
         | dwighttk wrote:
         | If it were solved, the captchas wouldn't stop bots
        
           | coolspot wrote:
           | Maybe they don't?
        
       | Slix wrote:
       | I assumed that it's because hCaptcha understands the location of
       | a photo and so has extra context for it. A photo of a vehicle
       | taken in the ocean must be a boat. But a human or robot looking
       | at the photo doesn't have the same context.
        
       | iso1631 wrote:
       | Whatever they're doing it's american-centric.
       | 
       | Identify "Crosswalks". What the hell is a crosswalk
       | 
       | "School bus" - what's the difference between a bus currently
       | serving a school and another one?
       | 
       | "Show taxis", there are no black vehicles listed at all
        
         | cdot2 wrote:
         | Doesn't it make sense to make taxis bright colors so they're
         | easier to see? Why would you paint them black?
        
           | fennecfoxen wrote:
           | Because WWII era London sensibilities. Now it's a tradition.
        
           | Sharlin wrote:
           | https://en.wikipedia.org/wiki/Hackney_carriage
        
         | ptspts wrote:
         | Even though it's US-centric, it's still easy to do correctly
         | for most non-US English speakers, even for non-natives.
        
         | agilob wrote:
         | Or failing 3 in a row because 3 motorcycles or scooters are
         | considered a bike. We will easily win the AI uprising.
        
           | rg111 wrote:
           | > AI uprising
           | 
           | There won't be one. But there will be more and more unethical
           | rich people using Machine Learning and Deep Learning
           | technologies and vast computing power, money, and political
           | clout to gain things for their own, and many people will
           | suffer or at least be worse off as a result of this.
        
           | YPPH wrote:
           | I'm still not sure whether to include the squares with a tiny
           | fraction of the border of the vehicle. And when they fail me
           | I wonder if I'm doing the wrong thing.
        
             | NavinF wrote:
             | The labeling instructions for every object detection
             | dataset I've used say you should include every pixel of the
             | object.
             | 
             | That said, I'm sure a lot of people don't select squares
             | that have only 1 relevant pixel so the captcha should be
             | lenient.
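
The "every pixel of the object" rule can be made concrete with a little arithmetic: given an object's bounding box, compute which grid tiles contain at least one of its pixels. This is only a sketch. The 3x3 grid, the 300-pixel image, and the function name are assumptions for illustration (and the image size is assumed to divide evenly by the grid), not a description of how any real captcha scores answers.

```python
def tiles_touched(box, grid=3, size=300):
    """Return the (row, col) of every tile containing at least one pixel of `box`.

    box = (left, top, right, bottom) in pixels; right and bottom are exclusive.
    Assumes a square image of `size` pixels split into `grid` x `grid` tiles.
    """
    left, top, right, bottom = box
    tile = size // grid
    first_col, last_col = left // tile, (right - 1) // tile
    first_row, last_row = top // tile, (bottom - 1) // tile
    return {
        (r, c)
        for r in range(max(first_row, 0), min(last_row, grid - 1) + 1)
        for c in range(max(first_col, 0), min(last_col, grid - 1) + 1)
    }


# A box whose rightmost pixel column (x = 200) just crosses into the third tile.
strict = tiles_touched((90, 120, 201, 180))
```

Here the third column is included on the strength of a single column of pixels, which is exactly the case where a strict labeling rule and a typical human answer would disagree.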
        
           | giarc wrote:
           | To win we just need to print out millions of cardboard cut-
           | outs of humans, AI won't be able to tell the difference and
           | we can then destroy them!
        
         | cheeze wrote:
         | It's pretty obvious from the pictures. I get that it's US
         | centric, but if you can't figure those basics out, you probably
         | shouldn't be passing the captcha.
        
       | depingus wrote:
       | Cloudflare has been doing some great things. But lately it seems
       | that, maybe, they have their hands in too many cookie jars. I get
       | the ominous feeling that things could go south real fast.
       | 
       | I have my browser set up in a way that makes Cloudflare quite
       | intrusive. I use the Temporary Containers extension on Firefox to
       | open almost all websites in temporary containers (paired with the
       | Containerise extension to whitelist the handful of sites that I
       | like to stay logged in to).
       | 
       | About 30% of the random (like from web searches) sites I visit
       | throw the Cloudflare captcha at me...EVERY SINGLE TIME. I'm so
       | sick of picking out boats and buses that I just close out the tab
       | without bothering to visit the site.
       | 
       | I assume, that if I wasn't using Temporary Containers, a
       | Cloudflare cookie after the 1st captcha would persist for the
       | entire browser session, but there are privacy implications which
       | are beyond the scope of this post.
       | 
       | Anyways, I guess what I'm saying is...Cloudflare sure seems
       | great. Dangerously great.
        
         | wallacoloo wrote:
         | dunno where we're at today with newer captcha models, but for
         | old-style static image captchas there used to be browser
         | extensions where you could solve (say) 100 captchas in one
         | sitting and then navigate the web freely and the next NN
         | captchas your browser receives would be solved automatically.
         | 
         | or you could pay like $1 to cover 1000 captcha solutions.
         | again, not sure if these still exist for newer style captchas
         | though.
        
         | AtNightWeCode wrote:
         | The problem is not really Cloudflare. Captchas are terrible
         | from a UX perspective. Instead of Captchas a lot of companies
         | just log suspicious activities and only enable Captchas when
         | things have gotten out of hand.
         | 
         | If you design a web site with this in mind from the start, then
         | there are several ways to make the Captchas less intrusive.
         | However, a lot of Captchas are added to current solutions
         | after problems have arisen and then it may hurt the UX.
        
       | bearbin wrote:
       | Other commenters have talked about labelling. Maybe labelling of
       | real life data is something they're trying to do; but from my
       | experience with hCaptcha the challenges are _NOT_ real life data.
       | They're AI-generated images which bear a passing resemblance to
       | the targets but if you look closer nothing adds up at all.
       | 
       | Here are a couple of examples:
       | 
       | https://bearbin.net/images/captcha/1.png
       | 
       | https://bearbin.net/images/captcha/2.png
       | 
       | https://bearbin.net/images/captcha/3.png
       | 
       | https://bearbin.net/images/captcha/4.png
        
         | renewiltord wrote:
         | Really cool observation!
        
         | dylan604 wrote:
         | I particularly like the boats on water images where the horizon
         | is just wrong.
        
         | jspaetzel wrote:
         | The broken images to me look like instances where two or more
         | cameras or images were used and then stitched together.
         | Probably also done while the camera and object are moving
         | making it more likely to be wonky.
        
           | gwern wrote:
           | No way. The letters/writing look exactly like mirrored GAN
           | output. That's not what would happen with blur or stitching
           | together (there would be no mirroring symmetry or all the '8'
           | letters), or with synthetic 'machine teaching' datapoints
           | either (as far as I've ever seen). Look at the cat StyleGAN
           | sometime if you don't know what I'm talking about.
           | 
           | Which leaves me wondering what the point is. If you are
           | generating GAN images per CIFAR or ImageNet class, you know
           | what the label is and don't need to label it. Perhaps they
           | just generate lots of images to fill up the pipeline for the
           | CAPTCHAs, to avoid reuse which could be exploited by
           | spammers, when they have too little paying work?
        
         | 01acheru wrote:
         | I think that might be something they actually do. Lately I've
         | seen pictures of boats on land, bicycles merged
         | with surrounding objects, weird proportions, and usually those
         | strange images are extremely pixelated with those strange
         | reddish or greenish fluo pixels that appear on generative
         | network images.
         | 
         | But other times the pictures are 100% real life images.
        
         | zapt02 wrote:
         | Fascinating insight!
        
         | jonnycomputer wrote:
         | That is interesting.
        
         | amirhirsch wrote:
         | The generated images are those that provide the neural network
         | with optimal loss reduction when tagged.
        
         | dyeje wrote:
         | Seems like that is another kind of labeling to me: is our
         | generated image good enough to fool a human?
        
           | rg111 wrote:
           | For many generative models, this is on the way to becoming a
           | standard: using humans as judges of generated material, and
           | this is not limited to Computer Vision either. I am about to
           | use this technique to judge the sanity of text generated by a
           | Transformer model for a paper that I am writing (with a small
           | group).
           | 
           | There are also attempts to properly standardize it, and this
           | is called HYPE [0]. And there are big names like Fei-Fei Li
           | and Michael Bernstein behind it.
           | 
           | [0]: https://arxiv.org/abs/1904.01121
        
           | monkeybutton wrote:
           | GANs with human evaluation of the discriminator.
        
           | nicce wrote:
           | Exactly. That is another way to improve accuracy once you
           | have done it in a "regular" way already. You can look for
           | synthetic image generation and its benefits on model
           | accuracy and optimization.
        
           | FanaHOVA wrote:
           | Not sure they are fooling anyone. It's more like "is our
           | generated image good enough to make a human recognize what it
           | is to get rid of an annoying pop up?". If there were actually
           | consequences to getting it right/wrong people would pay more
           | attention I'm sure.
        
       ___________________________________________________________________
       (page generated 2022-01-07 23:01 UTC)