[HN Gopher] Cloudflare's CAPTCHA replacement with FIDO2/WebAuthn...
       ___________________________________________________________________
        
       Cloudflare's CAPTCHA replacement with FIDO2/WebAuthn is a bad idea
        
       Author : herrjemand
       Score  : 191 points
       Date   : 2021-05-14 11:57 UTC (11 hours ago)
        
 (HTM) web link (herrjemand.medium.com)
 (TXT) w3m dump (herrjemand.medium.com)
        
       | grishka wrote:
       | Cloudflare captchas in particular, and any checks and roadblocks
       | to _see something publicly available_ in general, are terrible,
        | period. It doesn't matter which form they take. Every time you
       | see one you feel like a second-class citizen and get reminded
       | that the internet is no longer what it used to be.
       | 
       | I personally simply close the tab when I see a cloudflare "one
       | more step" page.
        
         | fierro wrote:
         | what? Have you ever dealt with a DDoS attack and the
         | consequences on your availability and infra health?
        
           | stingraycharles wrote:
           | Of course from the perspective of the website operator it's
           | great, but from the perspective of the user it's frustrating.
           | 
           | I'm not sure whether this is true, but it seems like with
           | Firefox I get these captchas much more often than with
            | Chrome. Sometimes they're so difficult to solve that it
            | really takes a minute or two, and it's an incredibly
            | disruptive, unfriendly interaction.
           | 
           | Surely there must be a better way to deal with this? Why do I
           | have to keep proving again and again and again to Cloudflare
           | I am, in fact, a person?
        
           | grishka wrote:
            | Are DDoS attacks a common enough occurrence to warrant
            | putting half the internet behind DDoS protection? My
            | impression is that you need to do something really wrong
            | to deserve one.
        
             | tick_tock_tick wrote:
              | Yes, they absolutely are. Hell, just a few random
              | scraping bots getting stuck in a loop or being overly
              | aggressive on your site is enough to double your bill.
              | So yeah, it's 100% required.
        
             | midev wrote:
             | Yes, attacks on the web are very common for any decent
             | sized site. Just because something is publicly available
             | doesn't mean you get unfettered access to do whatever you
             | want.
        
             | livueta wrote:
             | That's unfortunately not true at all in my experience.
              | Maybe if you're an anodyne SaaS, but if you host any user-
             | generated content, especially if it's adjacent to gaming
             | (my personal experience was mostly with gaming-related
             | forums and IRC networks), politics or any other charged
             | topic, expect to get hammered on a pretty frequent basis.
             | IoT botnets are pretty easy to rent at this point, so the
             | attack is accessible to every skid known to mankind.
             | 
             | I actually agree with your overall point as I try to use
             | Tor for a lot of "normal" browsing, but I'm not sure what
             | the correct solution to accommodate both is. It's a hard
             | problem, and having been in that position myself I have a
             | hard time faulting small website operators who have no
             | alternative defenses.
             | 
             | e: just to add to this, I see the existence of ddosing as a
             | significant driver towards centralized monolithic services.
             | If your blog on Palestinian rights or whatever is getting
             | hit, that's an incentive to move it to a platform that
             | takes care of networking for you. It's a little absurd to
             | go all-in on decentralized self-hosting without at least an
             | acknowledgement that with current tech and typical
             | personal-computing budgets, doing so is giving a heckler's
             | veto to literally everyone. Cloudflare isn't the only
             | dimension things can be centralized along.
        
         | midev wrote:
         | This is completely wrong. Site administrators can put any
         | controls they want in place to limit access. I don't know where
         | you get the idea that things on the Internet need to be
         | publicly available or without restriction.
         | 
         | Unless you're an original ARPANET contributor, there have
         | always been attempts to control access and stop attacks. You're
          | making the same mistake every conservative does: longing
          | for a nostalgia that never existed.
        
         | ehutch79 wrote:
         | How do you mitigate ddos attacks and other bad actors hitting a
         | page?
         | 
         | What does your cdn solution look like?
         | 
         | Route optimization from your (single) endpoint to clients
         | literally half a world away?
        
           | grishka wrote:
            | As a user, I simply don't care. I repeatedly get punished for
           | doing nothing wrong. It's almost like airport security.
           | 
           | > What does your cdn solution look like?
           | 
           | > Route optimization from your (single) endpoint to clients
           | literally half a world away?
           | 
           | And as a developer, I don't understand this newfangled
           | obsession over CDNs either. Yes, there will be 200 ms RTT in
           | some cases. So what? Get over it. Optimize your website to
           | load in fewer round-trips. TCP congestion control adapts well
           | enough to any latency. RTT only really matters in gaming and
           | VoIP.
        
             | ehutch79 wrote:
             | I don't think you understand why that captcha is there in
             | the first place then.
             | 
             | Cloudflare prevents a bunch of crap that site operators
             | just don't want to deal with. Especially for smaller sites
             | that are run by one person. Dealing with a wordpress site
             | getting hacked because you missed an update by a day, or a
              | bulletin board getting swarmed with bots, or some asshat
             | ddos'ing your site because you banned them. Suddenly that
             | site just isn't worth running.
             | 
              | Complaining about a thing that prevents that headache
              | because it's a minor inconvenience to you is so self-
              | centered it boggles the mind.
        
               | grishka wrote:
               | Yeah, so centralizing the entire internet around a black
               | box that sees all your traffic in cleartext is clearly
               | the right solution. /s
               | 
               | > Dealing with a wordpress site getting hacked because
               | you missed an update by a day
               | 
               | Maybe don't use something this vulnerable then and rely
               | on a third party to protect you from exploits.
               | 
               | > or a bulletin bored getting swarmed with bots
               | 
               | Maybe require email verification and/or a captcha _when
                | signing up or posting_. Don't punish people for
                | _passive_ actions.
               | 
               | Somehow, there are many forums that aren't behind
               | cloudflare, yet there are no spam bots.
               | 
               | > or some asshat ddos'ing your site because you banned
               | them
               | 
                | Are you sure DDoS is such an everyday occurrence?
               | 
               | I just don't understand. I run a personal website.
               | There's literally nothing to "deal" with. I set it all up
               | once and it works. I only have to pay for the server and
               | for the domains on time.
        
               | ehutch79 wrote:
               | Monocultures are always bad, but I don't see any
               | alternative services with this level of ease of use.
               | 
                | You're definitely overestimating the technical
                | expertise and available time of a lot of small-time
                | admins out there.
               | 
               | You don't see bots and spam on those forums either
               | because they are actually using cloudflare, and you're
               | just not seeing the captcha, or because in the backend
               | they're feeding all their posts through akismet (in plain
               | text). I don't think you're considering how many services
               | see your posts, even when you don't trip a captcha.
               | 
               | email accounts are trivial to sign up for, especially for
               | bots. I always recommend charging $1 (or local
               | equivalent) for an account, that's a lot harder to fake.
               | 
               | My point in all this is that bitching that site is using
               | cloudflare to not have to deal with crap, is a self
               | centered view.
               | 
               | Saying "well it never happened to me, so it must never
               | happen" is similarly self absorbed.
               | 
               | Maybe consider that your experience is not everyones
               | experience
        
               | jjav wrote:
               | > My point in all this is that bitching that site is
               | using cloudflare to not have to deal with crap, is a self
               | centered view.
               | 
               | Who is serving whom here?
               | 
               | If a business thinks it's ok to impose cloudflare
                | inconvenience on me, the customer, for the privilege of
               | giving them my money, who is self centered here?
               | 
               | The simple answer is I'll close the tab and go buy it
               | from a competitor. I'm not playing captcha games to buy
               | something.
        
               | [deleted]
        
               | ehutch79 wrote:
               | Why do I feel like you ask to speak to managers a lot?
        
             | ehutch79 wrote:
             | that 200ms rtt does matter to users. it becomes very
             | noticeable. especially when you're writing an app, not a
             | brochure site. You need to tree shake so you're not serving
             | a huge spa all at once.
             | 
              | I've seen much worse times for users, and a cdn
              | absolutely helps with our staff in Asia dealing with
              | our internal apps.
             | 
             | Of course they're not always tripping up cloudflare and
             | being shown captchas. I almost _never_ see a cloudflare
             | captcha either... huh...
        
               | gsich wrote:
                | What use is an app that requires a full RTT for every
                | button press?
        
               | ehutch79 wrote:
               | None? Why would every button press require a full rtt?
        
               | gsich wrote:
               | >that 200ms rtt does matter to users. it becomes very
               | noticeable. especially when you're writing an app, not a
               | brochure site. You need to tree shake so you're not
               | serving a huge spa all at once.
               | 
               | Because this implies that.
        
               | ehutch79 wrote:
               | No it doesn't. at all.
               | 
               | Page load speeds by themselves can be painful. Open
               | devtools and have your browser throttle to poor 3g
               | speeds.
               | 
               | Try browsing around. even well optimized sites.
               | 
               | Now try uploading a couple dozen files through an api.
               | 
               | This is legit what some users deal with. In New York
               | state even, you don't need to go that far to find poor
               | connectivity.
               | 
               | Even if all your users have awesome home connections,
               | think sales people taveling to a client. or on site
               | inspections of a manufacturer in a warehouse that's
               | mostly metal and has bad wifi.
        
               | gsich wrote:
               | Then what does the 200ms have do with "writing an app"?
        
               | ehutch79 wrote:
                | Apologies, your username looks like the one that
                | tossed that number out as what they assumed was a
                | high number.
               | 
               | 200ms latency isn't that bad, but I'm seeing more
               | 800-2000ms latencies with some users depending on
               | physical location. at some point latency kills usability.
               | Especially when trying to get through a complicated QA or
               | inventory process.
        
               | Dylan16807 wrote:
                | If you're on 3G I would expect sites to load in a
                | similarly bad way with or without an extra 200ms of
                | RTT at most.
        
               | ehutch79 wrote:
               | The throttling in dev tools is meant to represent that
               | latency...
        
               | Dylan16807 wrote:
               | If setting it to 3G is supposed to represent _just_ 200ms
                | latency, that's going to give you a very exaggerated and
               | misleading impression of how bad it is. It's a
               | meaningless test.
               | 
               | I thought you were giving an example of how bad
               | connections can get, and saying that the extra latency
               | would make it worse, but in that situation it's a drop in
               | the bucket.
        
             | aseipp wrote:
             | > Yes, there will be 200 ms RTT in some cases. So what? Get
             | over it.
             | 
             | You're missing a zero in that RTT for users in places like
             | Asia if your server is anywhere in the west. (It's actually
             | somewhat revealing when someone throws out a number like
             | this without any qualification; what exactly made you
             | conclude 200ms is the magic number?)
             | 
             | > Optimize your website to load in fewer round-trips. TCP
             | congestion control adapts well enough to any latency. RTT
             | only really matters in gaming and VoIP.
             | 
             | You don't need a very big imagination to think about cases
             | where RTT will have significant impacts e.g. in the event
             | you need to issue multiple sequential requests that are
             | dependent on one another. These are unavoidable and occur
             | often in more than just websites, but anything that e.g.
             | uses HTTP as an API (a very simple one is something like
             | recursively downloading dependencies.)
             | 
             | This comes across as a classic "I don't actually understand
             | the problem domain very well at all, but get off my lawn"
             | answer to the problem.
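
       The parent's point about dependent requests can be made concrete
       with a toy model; the request count and per-hop timings below are
       illustrative assumptions, not measurements from any real site:

```python
# Toy model: dependent HTTP requests are serialized, so total wait
# scales with the number of round trips, not with bandwidth.

def chain_latency(rtt_s: float, sequential_requests: int,
                  server_time_s: float = 0.05) -> float:
    """Rough lower bound on wall time when each request can only be
    issued after the previous response arrives."""
    return sequential_requests * (rtt_s + server_time_s)

# DNS -> TLS -> HTML -> CSS -> blocking JS is easily 5 dependent trips.
for rtt_ms in (20, 200, 2000):
    total = chain_latency(rtt_ms / 1000, sequential_requests=5)
    print(f"RTT {rtt_ms:>4} ms -> ~{total:.2f} s of pure waiting")
```

       With 5 dependent round trips, a 200 ms RTT already costs over a
       second before anything renders, which is why reducing round trips
       (or terminating them closer to the user) matters.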
        
               | grishka wrote:
               | > You're missing a zero in that RTT for users in places
               | like Asia if your server is anywhere in the west.
               | 
               | Well, it does say 130 ms in here:
               | https://www.quora.com/How-long-would-it-take-for-light-
               | to-fl...
               | 
               | And that's _around_ the planet, to go around and end up
               | at the same spot. In practice, with sanely-configured
               | routes, your packets should never need to traverse more
               | than half that distance. So, divide it by 2, then that
               | cancels out because RTT is a measure of how long it takes
               | for a signal to travel back and forth. You then add some
               | time on top of that to account for buffering and
               | processing in the various equipment along the way.
               | 
               | > These are unavoidable and occur often in more than just
               | websites, but anything that e.g. uses HTTP as an API (a
               | very simple one is something like recursively downloading
               | dependencies.)
               | 
               | If you mean REST API requests, the kind that trigger some
               | code to dynamically generate a response, how would a CDN
               | solution like cloudflare help? The request still needs to
               | get to the server and the response still needs to come
               | back, all the way, because that's where that code runs.
               | CDNs only really work for cacheable static content, don't
               | they? I mean it's in the name.
               | 
               | A blog or a news website certainly doesn't need a CDN.
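
       The speed-of-light arithmetic in the comment above can be checked
       directly; a small sketch, using textbook approximations for
       Earth's circumference and the ~1.5 refractive index of fiber:

```python
# Back-of-envelope RTT floor: half of Earth's circumference out and
# back, in vacuum vs. in optical fiber (refractive index ~1.5).

C_VACUUM_KM_S = 299_792          # speed of light in vacuum, km/s
HALF_CIRCUMFERENCE_KM = 20_000   # ~half of Earth's circumference

def rtt_ms(one_way_km: float, speed_km_s: float) -> float:
    """Round-trip time in milliseconds over a one-way distance."""
    return 2 * one_way_km / speed_km_s * 1000

print(f"vacuum: {rtt_ms(HALF_CIRCUMFERENCE_KM, C_VACUUM_KM_S):.0f} ms")
print(f"fiber:  {rtt_ms(HALF_CIRCUMFERENCE_KM, C_VACUUM_KM_S / 1.5):.0f} ms")
```

       The in-fiber figure (~200 ms to the antipode and back, before any
       routing detours or queuing) is why real-world antipodal pings of
       300+ ms are unsurprising.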
        
               | bayindirh wrote:
               | > Well, it does say 130 ms in here.
               | 
               | If you have a fiber backed, all-switched network with no
               | routing, buffers, congestions, or detours, you may get
                | that value, _if you're lucky_.
               | 
               | Pinging tty.sdf.org which is a direct access shell
               | service in USA from somewhere between Europe and Asia,
               | from an _academic network backbone_ roundtrips in ~190ms.
                | I'm traversing a little less than half the globe
                | over the whole journey. In your terms, it should be
                | around ~60ms,
               | but it's not.
               | 
               | > If you mean REST API requests, the kind that trigger
               | some code to dynamically generate a response, how would a
               | CDN solution like cloudflare help?
               | 
               | By using Cloudflare workers, so your code is also
               | distributed around the globe?
               | 
               | > CDNs only really work for cacheable static content,
               | don't they? I mean it's in the name.
               | 
                | JS files are also static content. Even if you don't
                | use code distribution like Cloudflare Workers, a
                | simple CDN can cache 90% of your site, if not more.
                | CSS, images, JS, HTML, you name it.
               | 
               | > A blog or a news website certainly doesn't need a CDN.
               | 
               | Actually, CDN is the most basic optimization for
               | distributing heavy assets like videos and images, which
               | news websites use way more than text. Why not use a CDN?
        
               | grishka wrote:
               | > Pinging tty.sdf.org which is a direct access shell
               | service in USA from somewhere between Europe and Asia,
               | from an academic network backbone roundtrips in ~190ms.
               | 
               | Actually I get around 200 from Russia which is also
               | "somewhere between Europe and Asia":
               | round-trip min/avg/max/stddev =
               | 196.504/197.581/199.833/1.360 ms
               | 
               | > By using Cloudflare workers, so your code is also
               | distributed around the globe?
               | 
                | Great, let's give that company _even more_ control.
                | That's sure gonna end well.
               | 
               | > CSS, images, JS, HTML, you name it.
               | 
               | It all gets loaded once and then cached in the browser.
               | The initial load takes long regardless of whether there's
               | a CDN. Oh, and many websites also use stuff from like 10
               | different domains, which doesn't help this either.
               | 
               | And, it doesn't matter whether a JS file loads in 50 ms
               | or 300 ms, if it then takes 5 seconds to parse and start
               | running.
               | 
               | > Actually, CDN is the most basic optimization for
               | distributing heavy assets like videos and images, which
               | news websites use way more than text. Why not use a CDN?
               | 
               | So put them on a separate domain and serve that from a
               | CDN if you really care whether that stock photo no one
               | notices loads in 500 ms instead of 2000. That doesn't
               | explain much why anyone would put their main domain
               | behind cloudflare.
        
               | bayindirh wrote:
               | > Great, let's give that company even more control.
               | That's sure gonna end well.
               | 
                | We have enough evil companies who invade our lives
                | through the platforms they develop. Cloudflare is not
                | one of them. Using them is voluntary (by the service
                | providers), and I think they're one of the more
                | useful companies around.
               | 
               | BTW, I'm not a web developer or Cloudflare employee. I
               | have no skin in this stuff, however they build some cool
               | stuff inside the Linux kernel, which is interesting from
               | my PoV.
               | 
               | > It all gets loaded once and then cached in the browser.
               | 
                | Then cleared and/or invalidated by the user or the
                | browser's own logic for a plethora of reasons.
               | 
               | > The initial load takes long regardless of whether
               | there's a CDN.
               | 
               | Actually, no. A reasonably fast internet connection (>12
               | Mbps we can say) can load a lot of things very very fast.
               | The biggest overhead is DNS, even with CDNs. With a good
               | local, network-wide DNSMasq installation, if the server
               | is close, I can load big sites almost instantly.
               | 
               | > And, it doesn't matter whether a JS file loads in 50 ms
               | or 300 ms, if it then takes 5 seconds to parse and start
               | running.
               | 
                | I think 5 seconds is a long time even for the old
                | Netscape Navigator's JS parser. You'd need to run
                | something akin to Skynet to parse a JS file for 5
                | straight seconds. How's that even possible?
               | 
               | > So put them on a separate domain and serve that from a
               | CDN if you really care whether that stock photo no one
               | notices loads in 500 ms instead of 2000.
               | 
                | I don't know about you, but world news generally
                | contains live/new footage or fresh photos from the
                | ground, not stock photos. Also, we humans are visual
                | animals: many people want to see the images first and
                | read the text later.
               | 
               | > That doesn't explain much why anyone would put their
               | main domain behind cloudflare.
               | 
               | Load balancing, DDoS protection, CDN, workers,
               | Bot/Scraping protection, cost reduction, rate limiting,
               | you name it. Even my DSL router implements some of the
               | protections, to my surprise.
               | 
                | The internet is not the same beast it was in the
                | 90s/00s. I miss the simpler times, but alas.
        
               | comex wrote:
               | There's no need to guess based on the speed of light.
               | Test it yourself:
               | 
               | https://www.cloudping.info
               | 
               | For me, the highest was 310ms round trip to Singapore, so
               | higher than your estimate but not too bad.
               | 
               | But this is completely beside the point. As far as I
               | know, if you're using a CDN effectively (i.e. a large
               | proportion of requests are hitting cache), it should be
               | _cheaper_ than having all requests hit your server, not
                | more expensive. So even if you don't "need" a CDN, you
               | might want one. This is orthogonal to the issue of bot
               | traffic, which exists whether or not you use a CDN. If
               | you want to use a CDN but don't mind the costs of bot
               | traffic, you can configure CloudFlare to not show the
               | CAPTCHAs, or use a different CDN.
        
               | grishka wrote:
               | 338 ms to Sydney is my worst. 234 ms to Singapore.
               | 
               | AWS does offer a CDN, right? Somehow they do it without
                | captchas and without me ever noticing. So I'm somewhat
                | justified in cursing at cloudflare because it's the only one
               | actually announcing its presence by actively disrupting
               | your browsing.
        
               | ehutch79 wrote:
                | CloudFront, the AWS CDN, is not equivalent to the
                | part of Cloudflare that shows a captcha. You only see
                | the captcha when a request would have hit your origin
                | server, because Cloudflare is proxying it, not
                | serving cached results.
        
               | joshuamorton wrote:
               | Of course, light doesn't travel at vacuum speed in fiber,
               | it travels a bit slower, but it also bounces around the
               | cable, so ends up traveling a significantly longer
               | distance. Multiply by around 1.5 for more real world
               | numbers (copper is similar), just measuring raw distance.
               | 
               | And switching adds significant delay, especially when you
               | get out to the edge.
               | 
               | > If you mean REST API requests, the kind that trigger
               | some code to dynamically generate a response, how would a
               | CDN solution like cloudflare help? The request still
               | needs to get to the server and the response still needs
               | to come back, all the way, because that's where that code
               | runs. CDNs only really work for cacheable static content,
               | don't they? I mean it's in the name.
               | 
               | For a simple example, imagine assets that are dynamically
               | loaded based on feature detection in JS. All of the
               | assets (js included) can be cached on the cdn.
        
               | robocat wrote:
               | I live in New Zealand, definitely a first world country
               | with first world infrastructure. Ping times to US East or
               | Europe are over 300ms right now from my home[1].
               | 
               | My old business had users in countries around the world,
               | and the assets were _highly_ optimised for speed. However
               | adding CloudFlare (a) significantly sped up our service
               | to clients, especially those in Asian countries, and (b)
               | significantly improved reliability of connections because
               | CloudFlare have their own dedicated network links between
                | countries and/or routes optimised for reliability.
               | 
               | [1] https://www.cloudping.info/
        
             | [deleted]
        
           | jjav wrote:
           | > How do you mitigate ddos attacks and other bad actors
           | hitting a page?
           | 
           | Not sure what "bad actors hitting a page" even means. I host
           | public info so people can see it, be it good or "bad" people.
           | Let them see it.
           | 
           | DDoS is different and can be devastating of course. Also,
           | very rare. In decades hosting content (started my first
           | hosting business in 1994) I've never experienced anything
           | remotely like a DDos. I know it happens, but definitely very
           | rare for most people. Driving tons of legitimate users away
            | with relentless captcha annoyances for the once-in-a-lifetime
           | possibility of a DDoS is not a good tradeoff.
           | 
           | If you're in a business that attracts DDoS like flies then
           | deal with that, otherwise lay off the captchas.
        
             | midev wrote:
             | > Not sure what "bad actors hitting a page" even means
             | 
             | I would refrain from commenting on this topic then.
             | 
             | > I host public info so people can see it, be it good or
             | "bad" people. Let them see it.
             | 
             | If your site ever gets big enough, you'll understand.
        
           | floatboth wrote:
            | For regular _public web pages_, serving the actual fucking
           | page should not be more expensive than serving the captcha
           | page! What the hell is a "bad actor" in relation to GET
            | requests to a _public page_? To a _public page_, all actors
           | should be inherently neutral.
        
             | ehutch79 wrote:
              | There are a lot of potential side effects. Not every
              | GET request retrieves a static asset. A list view with
              | filters, for instance. Either way, you could be dumping
              | arbitrary data along with a request, or just fuzzing
              | parameters. Some pages might be poorly done GraphQL
              | endpoints, and you might find your db tied up. There
              | are MANY ways a legit GET request can cause issues, let
              | alone someone with bad intentions.
        
               | ipaddr wrote:
                | All users are hostile users and all users are
                | preferred users. Define what your system will allow
                | through rate limits and caching. Assume users would
                | destroy your site if given the chance, because they
                | will. If you are exposing private data through
                | GraphQL, configure it properly, drop the private
                | data, or drop GraphQL and use a backend.
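
       The rate-limits-and-caching advice above can be done in a few
       lines without a proxy in front; a minimal token-bucket limiter
       sketch (the class name and numbers are illustrative, not from
       any comment):

```python
# Minimal token-bucket rate limiter: each client gets a bucket that
# refills at a steady rate and allows a bounded burst.
import time

class TokenBucket:
    def __init__(self, rate_per_s: float, burst: int):
        self.rate = rate_per_s        # tokens refilled per second
        self.capacity = burst         # maximum burst size
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate_per_s=10, burst=5)   # one bucket per client
results = [bucket.allow() for _ in range(8)]
print(results)   # first 5 calls pass; immediate follow-ups are throttled
```

       In practice you'd keep one bucket per client key (IP, token,
       session) and return 429 when allow() is False; the same idea is
       what nginx's limit_req and most WAFs implement.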
        
               | ehutch79 wrote:
               | I 100% agree you should be looking at all incoming
               | traffic as hostile or at least potentially hostile. The
               | other comments are contending that there's nothing a
               | hostile user could do to a 'public' page.
               | 
               | One of the things using cloudflare gets you is all those
               | protections without having to know how to do them
               | yourself. Which a lot of the developers don't know how to
               | do.
               | 
               | There's also something to be said for catching a lot of
               | this at the network layer on a cluster of machines that
               | can handle any incoming traffic that your one poor
               | neglected vm can't.
        
               | ipaddr wrote:
               | If you use cloudflare you give up freedom not to force
               | captcha. If you avoid them you can choose where you want
               | to captcha. When cloudflare goes down you go down
               | unnecessarily.
               | 
                | If you're worried about a ddos attack, cloudflare or
                | another provider might be a good choice. But adding ddos
                | support by default seems unnecessary. In my 20 years of
                | running 100s of sites I haven't run into a situation where
                | I needed ddos support. The vast majority will never be the
                | target. Once in a while google or bing will ddos you, but
                | using cloudflare to block that seems like overkill.
        
               | ehutch79 wrote:
               | There are tradeoffs, yes.
               | 
               | We might weight those tradeoffs differently. That's ok.
        
           | D-Nice wrote:
            | Why is each and every Tor (and sometimes VPN) user deemed a
            | DoS attack? It discriminates against users who value privacy
            | by forcing hCaptcha on them by default. Worst of all, it
            | could be a de-anonymization attack as well, which is why I,
            | as a regular Tor user, just exit the page immediately when
            | that happens.
            | 
            | For any of my pages that do happen to use Cloudflare, I am
            | luckily able to disable this discrimination in the control
            | panel, so kudos for that at least, but terrible defaults imo.
        
             | sroussey wrote:
             | From experience, traffic via Tor was always 99%+ fraud.
        
               | tibiapejagala wrote:
               | Well, if you keep throwing impossible captchas at them,
               | no wonder that normal users just close the tab, but bots
               | and fraudsters keep trying.
        
               | jlokier wrote:
               | You can conduct fraud by accessing public, read-only web
               | pages? You can conduct fraud by searching on Google?
               | 
               | Those are the two I find repeatedly blocked when
               | accessing via Tor. The former by Cloudflare, the latter
               | by Google.
               | 
               | I use Tor to lookup phone numbers that have just called
               | me, to decide whether it's a good idea to answer. Since I
               | don't want to be personally associated with such numbers
               | I prefer to search anonymously. But often it's impossible
               | to get a result.
               | 
               | Sometimes even spending 5 minutes solving captchas isn't
               | enough. (I'd only spend that long to see if it's just an
               | outlier. No, it's quite common.)
               | 
               | This creates an immense pressure to tell various services
               | exactly who is phoning me, which is a terrible attitude
               | to privacy.
        
               | ehutch79 wrote:
               | Then don't use sites that are behind cloudflare?
               | 
               | It's not your choice if the site owners/admins use
               | cloudflare. It IS your choice not to use those sites.
        
             | ehutch79 wrote:
             | Because that's a not insignificant portion of traffic they
             | see from tor and vpns?
             | 
             | tor has some absolutely valid and import use cases, but
             | what percent of tor exit traffic is actually someone trying
             | to keep their traffic anonymous from the eyes of an
             | oppressive regime, and what percent are script kiddies, or
             | someone hiding torrenting from their isp?
        
       | AtNightWeCode wrote:
        | There are a lot of services in Sweden that require real
        | authentication using something called BankID, basically a
        | personal digital ID. This is the way to go: 100% securely
        | validated users. If a function were added to make users
        | anonymous to third-party services, it would be great.
       | 
        | I work with Cloudflare sites and it is clear that their current
        | enterprise offerings are hard to tweak to solve attacks without
       | spamming the users with captchas. The captchas are already too
       | complicated for the average user so it is mostly turned off even
       | though it has other consequences. I have to look into this new
       | thing though.
       | 
        | As much as I like the idea of an open Internet to be used by
        | anyone from anywhere, it simply does not work today for a lot of
        | enterprises.
        
         | poisonborz wrote:
          | Ah yeah, instantly zeroing anonymity for most users while it
          | can still be abused by malicious actors: a great
          | worst-of-both-worlds solution.
        
           | AtNightWeCode wrote:
           | Not sure how you came to that conclusion from what I said.
           | What I am saying is that in the long run it will be
           | impossible to rely on companies like Cloudflare for security
           | when it comes to users. Over time all services will for
           | security reasons need to either directly or indirectly
            | authenticate all their users. That does not mean that each
            | user's identity is provided to each consumer.
           | 
           | The open Internet is already dead thanks to Cloudflare,
            | Akamai etc. A lot of European companies use these services
           | to block China, Russia, TOR, VPN-services and so on.
        
       | stickfigure wrote:
       | Reading CF's blog announcement [1], this is really horrifying. It
       | trains users to insert security keys and accept biometric
       | identification requests when visiting random web pages, on random
       | untrusted domains.
       | 
       | This cannot possibly end well.
       | 
       | [1]: https://blog.cloudflare.com/introducing-cryptographic-
       | attest...
        
         | ehutch79 wrote:
          | Isn't part of the point that a phishing site wouldn't get the
          | same response as a legit site, and therefore it'd be useless to
          | do that, so this behavior is OK?
        
           | stickfigure wrote:
           | It's not about phishing, it's about getting the user to
           | blindly accept security checks. If users are trained to
           | insert their usb key / scan their fingerprint whenever they
           | see a cloudflare page, bad actors can present a mockup of
           | this page to exploit that reaction.
           | 
           | Physical keys (and biometrics) work well because they are
           | rarely called for, and the user knows they are doing
           | something security sensitive. "This random page asked me to
           | insert my security key" can't be healthy.
        
             | ehutch79 wrote:
              | But wouldn't the response the mockup gets only work for
              | that page, not something they could pass through?
              | 
              | If any page can request any other page's response, that'd
              | make the whole system pointless.
        
             | tialaramex wrote:
             | No. Ignorance is a big problem. What makes Security Keys
             | work well isn't that they are "rarely called for" but
             | almost the opposite, they're so easy that you can add them
             | with little friction all over the place. Tapping to sign
             | into a remote server over SSH is no problem, it's scarcely
              | more effort than hitting "enter" at the command line.
             | 
             | What the user is doing is _not_ security sensitive. They
             | are, in fact, themselves, and that 's all the Security Key
             | is confirming. "Yup, still me".
             | 
             | One of the easy ways fools trip themselves up here is that
             | they think this is identifying information. But it isn't.
             | "Yup, still me" doesn't identify anyone. The identity was
             | already known to your interlocutor, which is why "Yup,
             | still me" is enough.
             | 
             | And that's what's so clever about the FIDO design. A
             | Security Key has no idea who it "is", it just knows it's
             | still the same as before. If you're already authenticated
             | as Jim Smith, you can enroll a security key "Yup, still me"
             | -> the Relying Party stores the information, and then you
             | can later sign in using it to verify your identity, "I'm
             | Jim Smith". "Is this still you Jim Smith?" "Yup, still me".
             | 
             | So that's why this doesn't help bad guys. "Are you still
             | er... you?" "Yup, still me". Completely useless. Of course
             | you are, that doesn't help them at all.
        
       | ryanlol wrote:
       | This seems to assume that existing captchas are much better than
       | they actually are.
        
       | sdfhbdf wrote:
        | The idea of replacing CAPTCHA with FIDO doesn't seem sound; isn't
        | it trivial to imitate with DevTools in Chrome or some other
        | software?
       | 
       | https://developer.chrome.com/docs/devtools/webauthn/
        
         | BillinghamJ wrote:
         | The attestation process is capable of cryptographically
         | checking the device manufacturer etc
         | 
         | (although practically I'm unsure as to whether that's really a
         | good idea or would work well)
        
         | arsome wrote:
         | I believe the idea here is you need to buy actual FIDO U2F keys
         | and they could then be revoked on a per-key basis if you're
         | caught abusing them as they're signed by a 3rd party so can't
         | just be emulated.
         | 
         | Meaning you need to buy more. Makes it expensive at least.
        
           | Nextgrid wrote:
           | How can you revoke on a per-key basis without at the same
           | time being able to track keys uniquely?
        
             | arsome wrote:
             | I'm not terribly familiar with U2F itself, but I assume the
             | site has a way to identify you're using the right key that
             | can be reused for this purpose?
        
               | tialaramex wrote:
               | When you enroll a token with a site, the token mints a
               | random new key pair and sends the site an ID and the
               | public key, signed with the private key.
               | 
               | The site records the ID and public key.
               | 
               | When you return, to confirm it's really you, the site
               | sends one or more IDs you've enrolled and says, sign this
               | fresh random data with one of the associated private
               | keys.
               | 
               | Your tokens can look at the site and an ID and decide if
               | they made that ID for that site, if they did they sign
               | the message with the private key, proving you are still
               | you. If they didn't make it, they pass and maybe you own
               | a different token that can sign, or maybe you show them a
               | different ID they do recognise.
               | 
               | To reuse this capability for tracking, the site would
               | need to _guess_ who you are first.  "I guess this is
               | arsome, they have this U2F key". But if they can guess
               | who you are, they already don't need such tracking.
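       The enrollment and challenge-response flow described above can be
       sketched in a few lines. This is a toy model under stated
       assumptions: real FIDO tokens mint an ECDSA key pair per site and
       hand the site only the *public* key, so the site can verify
       signatures without being able to forge them; here an HMAC secret
       stands in for the signature scheme, and all names are illustrative.

```python
import os, hmac, hashlib

class Token:
    """Toy security key. Real tokens use asymmetric signatures (ECDSA);
    an HMAC secret stands in here purely for illustration."""
    def __init__(self):
        self.keys = {}  # (site, key_id) -> per-site secret

    def enroll(self, site):
        # Mint a fresh random ID and key for this site only.
        key_id = os.urandom(16).hex()
        secret = os.urandom(32)
        self.keys[(site, key_id)] = secret
        return key_id, secret  # a real token would return a public key

    def sign(self, site, key_id, challenge):
        secret = self.keys.get((site, key_id))
        if secret is None:
            return None  # "not my ID for this site" -- token stays silent
        return hmac.new(secret, challenge, hashlib.sha256).digest()

class Site:
    def __init__(self, name):
        self.name, self.enrolled = name, {}

    def register(self, token):
        key_id, pub = token.enroll(self.name)
        self.enrolled[key_id] = pub  # store ID + verification key

    def authenticate(self, token):
        challenge = os.urandom(32)  # fresh random data every time
        for key_id, pub in self.enrolled.items():
            sig = token.sign(self.name, key_id, challenge)
            expected = hmac.new(pub, challenge, hashlib.sha256).digest()
            if sig and hmac.compare_digest(sig, expected):
                return True  # "yup, still me"
        return False

token, site, other = Token(), Site("example.com"), Site("evil.example")
site.register(token)
print(site.authenticate(token))   # True: same token as at enrollment
print(other.authenticate(token))  # False: token never enrolled there
```

       Note how the token only ever answers "still me" relative to an
       enrollment the site already holds, which is the non-identifying
       property the comment describes.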
        
             | y7 wrote:
             | Yup, you can't. Keys are perfectly trackable by Cloudflare,
             | but they promise they won't do this.
             | 
             | Edit: I was wrong. Cloudflare claims they could track
             | people, but it would require tracking via cookies. [1] The
             | hardware security keys have an "attestation key pair" that
             | is shared among all units in one production batch (which
             | contains at least 100K units). [2]
             | 
             | 1: https://blog.cloudflare.com/introducing-cryptographic-
             | attest...
             | 
             | 2: https://www.w3.org/TR/webauthn-2/#sctn-attestation-
             | privacy
        
               | dboreham wrote:
               | No, they can't do this. It's the U2F key vendor that
               | promises not to release a device-unique key to someone
               | like CF.
        
               | y7 wrote:
               | Thanks, I stand corrected.
        
             | dboreham wrote:
             | You can't. The device attestation protocol specifically
             | excludes the ability to uniquely identify devices (the key
             | has to be re-used in at least 99999 other devices).
        
           | sdfhbdf wrote:
            | But I'm specifically asking about software. I know Touch ID
            | can be used with WebAuthn, and I also see the WebAuthn
            | debugger in Chrome's DevTools. It just seems easy to fool if
            | I regenerate a key on every visit, unless there is an
            | additional step I don't get.
        
             | arsome wrote:
             | You can't generate an attested key with the devtools. It
             | won't be signed by a 3rd party that CloudFlare has
             | approved.
        
               | sdfhbdf wrote:
               | Ok, you're right. I did miss a step!
        
       | ibeckermayer wrote:
        | Could this be solved (in large part) if key makers like Yubico
       | did I.D. verification on purchase? Then, to do the type of
       | "farming" that's mentioned in this article, you'd need to
       | organize a large group of people to all buy the keys rather than
       | just submit a bulk order to Alibaba.
       | 
       | Of course this idea raises privacy and authority concerns,
       | similar to certificate authorities.
        
       | Animats wrote:
       | Once they have your ID info, they'll later change the terms to
       | sell it to advertisers. Are they contractually committing to
       | never doing that? No. So they will.
        
         | CogitoCogito wrote:
         | Even if they are contractually committed not to sell your info,
         | that still might not save you:
         | 
         | "Yesterday, the bankruptcy court approved the sale over the
         | objections of several parties, including the Federal Trade
         | Commission (FTC) and third party manufacturers Apple and AT&T
         | who sold products to the bankrupt retailers.
         | 
         | ...
         | 
         | The FTC's objection was made to the court-appointed consumer
         | privacy ombudsman in the RadioShack bankruptcy. Specifically,
         | the FTC's letter alleged the sale of personal information
         | constitutes a deceptive practice because in its privacy policy,
         | RadioShack promised never to share the customer's personal
         | information with third parties."
         | 
         | https://www.jdsupra.com/legalnews/radioshack-bankruptcy-cour...
         | 
         | In that case the judge allowed the sale of the information in
         | contradiction to its commitments.
        
           | dragonwriter wrote:
           | > In that case the judge allowed the sale of the information
           | in contradiction to its commitments.
           | 
           | Note that bankruptcy _always_ allows things in contradiction
           | to commitments; bankruptcy is all about balancing which
           | commitments will not be fulfilled, and by how much, when a
           | party is no longer capable of fulfilling all of its
           | commitments.
           | 
           | If you don't like _particular_ commitments being voided in
           | bankruptcy, you want legislation specifically protecting them
           | so that there is a clear legal barrier to voiding those
           | specific kinds of obligations.
        
       | ignoramous wrote:
       | The article here ignores the view of the _Web_ that Cloudflare
       | has, which coupled with  "something you have" (the U2F keys)
       | makes for a compelling alternative to CAPTCHAs.
       | 
       | Sure, bots can automate keys, but those keys could also be banned
       | just as well. Cloudflare only needs to know which ones are the
       | good keys and track those forever. This means, for every non-bot
       | out there, the CAPTCHAs are as good as gone.
       | 
       | The genius of Cloudflare here is that they (ab)use WebAuthn,
       | which can also be implemented on Android and iOS natively. Before
        | you know it, Cloudflare has built an identity platform that,
        | while it may not be helpful for KYC, is plenty useful for websites
       | Cloudflare fronts. Imagine never having to bother with user
       | registration and authentication and bots... that's the next
       | extension I see to all of this.
        
         | weird-eye-issue wrote:
          | They already have that, but it's just for internal teams. I
          | used it recently to lock down WordPress installations.
        
         | wfleming wrote:
         | Each key is associated with a batch of devices, though. If you
         | ban a key, you risk banning a bunch of legitimate users.
         | 
         | It's an interesting trade-off. It seems like batch keys for
         | device attestation was designed to help protect individual
         | privacy (good), but if you can't ban a key without potentially
         | a lot of splash damage when you detect a bad actor, that seems
         | like a very limiting choice.
        
           | tialaramex wrote:
           | The intent of attestation is that a business could decide,
           | OK, we think FooCorp are doing a proper job and we trust
           | their FIDO tokens, but we don't like all these dozens of
           | cheap alternatives. So for our corporate site we'll require
           | FooCorp tokens, and we'll just issue every employee a FooCorp
           | token on our dime.
           | 
           |  _Maybe_ it could make sense for a bank to do this, sending
           | account holders a special custom Security Key with the bank
           | 's branding on it. I personally think that's stupid, but I
           | can imagine it appealing to bank executives and it's not so
           | stupid as to be worse than SMS or TOTP 2FA that banks do
           | today.
           | 
           | But it clearly isn't relevant for no-cost services like
           | Facebook or Gmail, and so sure enough you can just tell them
           | you don't want to give them attestation and they work anyway
           | (I don't know if either of them ask, I just reflexively deny
           | attestation if it's requested).
           | 
           | It isn't intended to be useful for trying to do stuff like
           | Cloudflare are attempting here. Which doesn't mean Cloudflare
           | can't succeed in their goals, but in the FIDO threat models a
           | "bad actor" would be a whole _vendor_ , maybe some outfit is
           | using fixed long term secret keys inside their Security Key
           | products and they just sell the NSA a list of those keys -
           | you might decide to just refuse all the products from this
           | vendor. Whereas for Cloudflare the "bad actor" they're
           | worried about just buys a half dozen of whatever was cheapest
           | from eBay and then plugs them into a Raspberry Pi.
           | 
           | Or, do they? That's the gamble I think Cloudflare is taking.
           | Maybe the value of defeating this intervention is so low that
           | bad guys will not, in fact, build a Raspberry Pi Security Key
           | clicker proxy to make their thing work.
        
           | ignoramous wrote:
           | > _Each key is associated with a batch of devices, though. If
           | you ban a key, you risk banning a bunch of legitimate users._
           | 
           | You're right. I meant Cloudflare could ban the generated
           | public-key and not the device's public-key itself. Besides,
           | they could also mark the batch as being taken over by bots
           | and increase the level on challenges issued to the batch.
           | Note though, a single secure module can only generate / store
            | so many public-keys. For instance, _YubiKey 5_ supports up
           | to 25 keys, though those could be reset to generate a newer
           | set of 25, but repeat registration of a number of keys from a
           | single batch is bound to trigger some statistical anomalies.
           | 
           | From Cloudflare's blog about _Cryptographic attestation of
           | personhood_ https://archive.is/4EbER
           | 
           | > For our challenge, we leverage the WebAuthn registration
           | process. It has been designed to perform multiple
           | authentications, which we do not have a use for. Therefore,
           | we do assign the same constant value to the required username
           | field. It protects users from deanonymization.
           | 
           | Currently, the user-name field is constant for all users. I
            | wanted to point out that they could amend the
           | registration ceremony to register any user in particular.
        
             | TimWolla wrote:
             | > For instance, Yubi Key 5 supports up to 25 keys
             | 
             | This is for resident keys. A YubiKey 5 supports an infinite
             | number of non-resident WebAuthn keys, because the returned
             | key handle will simply be the private key encrypted with a
             | master key stored on the YubiKey. For authentication the
             | service will send the stored key handle back to the YubiKey
             | which then can decrypt it and use the decrypted private key
             | to sign the challenge.
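       The key-handle trick described above can be sketched as follows.
       This is a simplification under stated assumptions: a real YubiKey
       does the wrapping inside its secure element with proper
       authenticated encryption, whereas this toy uses a SHA-256
       keystream plus an HMAC tag purely for illustration.

```python
import os, hmac, hashlib

MASTER_KEY = os.urandom(32)  # device master key; never leaves the token

def _keystream(nonce, n):
    # Toy keystream; a real device uses vetted authenticated encryption.
    return hashlib.sha256(MASTER_KEY + nonce).digest()[:n]

def make_key_handle(site_private_key):
    """Wrap a freshly minted per-site private key. The *site* stores the
    resulting blob; the device itself stores nothing (non-resident key)."""
    nonce = os.urandom(16)
    ks = _keystream(nonce, len(site_private_key))
    ct = bytes(a ^ b for a, b in zip(site_private_key, ks))
    tag = hmac.new(MASTER_KEY, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag

def unwrap_key_handle(handle):
    """Recover the private key, or None if this device didn't mint it."""
    nonce, ct, tag = handle[:16], handle[16:-32], handle[-32:]
    expected = hmac.new(MASTER_KEY, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # foreign or corrupted handle: refuse to sign
    return bytes(a ^ b for a, b in zip(ct, _keystream(nonce, len(ct))))

priv = os.urandom(32)
handle = make_key_handle(priv)  # sent to and stored by the service
print(unwrap_key_handle(handle) == priv)  # True: device recovers its key
```

       Because the per-site key lives inside the handle, the device's own
       storage never fills up, which is why non-resident keys are
       effectively unlimited.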
        
               | [deleted]
        
               | ignoramous wrote:
               | TIL.
               | 
               | Envelope encryption. Neat. Can WebAuthn keys be (made) a
               | resident key? If so, is that preferred instead?
               | 
               | Conversely, what use case is there for resident keys in
               | context of WebAuthn? For example, if there are multiple
               | master keys, can I switch between them per browser /
               | website (assuming the master key itself is a resident key
               | and not burnt into the element)? Thanks.
        
         | robalfonso wrote:
          | I agree. I saw the author mention that for $25k you could have
          | 1000 keys, and I immediately thought that is not nearly enough.
          | Given the sheer volume of traffic they have, they would start
          | putting a picture together of ip/key/sites very quickly. There
          | is nowhere near enough uniqueness.
          | 
          | I also thought the idea of the key exchange being fast was a
          | red herring. That's a bad thing. If I'm them, I'm paying
          | attention to how long it takes a human to touch the button,
          | from prompt to exchange. On my own setup my key is on my
          | laptop, which is in a dock. I must stand up and tap it over my
          | monitor. It's just a few seconds, but it's a) consistent in
          | timing and b) not < 1s. Imagine the aggregate timing data they
          | have.
          | 
          | Overall they make some good points if you are a teeny tiny
          | player and ignore completely the scale Cloudflare is operating
          | at.
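       The timing heuristic sketched in the comment above could be
       prototyped roughly like this. The thresholds and sample delays are
       made-up assumptions for illustration, not anything Cloudflare has
       published.

```python
from statistics import mean, pstdev

def looks_automated(delays_ms):
    """Flag a key that responds to challenges implausibly fast, or with
    machine-like consistency across attempts. Thresholds are guesses."""
    return mean(delays_ms) < 400 or pstdev(delays_ms) < 10

human_delays = [1800, 2100, 1950, 2300]  # seconds-scale and jittery
bot_delays = [120, 118, 121, 119]        # sub-second and near-constant

print(looks_automated(human_delays))  # False
print(looks_automated(bot_delays))    # True
```

       A real deployment would of course combine many such signals rather
       than rely on raw response latency alone.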
        
       | kylehotchkiss wrote:
        | I'd rather take these tradeoffs than do 5 steps of reCAPTCHA
        | because I'm using a VPN for work. As Cloudflare's announcement
        | said, reCAPTCHA is very localized to North America and likely
        | extra complicated for those outside the region.
       | 
       | In theory, couldn't Yubikey begin reducing batch sizes to 1,000
       | and Cloudflare mark specific batch numbers as requiring one extra
       | step to verify? The vast majority of Yubikey sales will be for
       | real people in any case.
        
         | kylehotchkiss wrote:
         | And if Yubikey could reduce batch sizes - could they require
         | bulk non-wholesale orders to retain the same batch ID to reduce
         | likelihood of abuse?
        
           | SXX wrote:
           | This won't work because guess what? Bad actors have money and
           | means to buy as many devices as needed through individuals.
           | 
           | Most of the abuse that CloudFlare protects from is also
            | usually illegal. And those taking the risk of doing something
            | illegal usually do it because it's highly profitable, so
            | making some authentication devices a bit more expensive
            | won't make any difference.
        
         | livre wrote:
         | Wouldn't reducing batch sizes make privacy even more of a
         | problem? Now instead of a 1/100000 chance of the user being the
         | same person on another website there would be a 1/1000 chance.
        
           | makomk wrote:
           | Yeah, except that because the other 999 users probably
           | wouldn't be using Tor to access the same websites, in
           | practice this would be pretty much guaranteed to give a
           | highly accurate, persistent tracking identifier.
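       The intersection effect described in the comment above can be put
       into rough numbers. The Tor-usage rate below is an illustrative
       assumption, not measured data: the point is only that a smaller
       batch multiplied by a rare trait leaves an anonymity set near one.

```python
# A batch ID narrows a visitor to batch_size key owners; intersecting it
# with another observable trait (here, an assumed fraction of users
# browsing over Tor) shrinks that crowd dramatically.
def expected_anonymity_set(batch_size, trait_rate):
    return batch_size * trait_rate

tor_rate = 0.0005  # assumed share of web users on Tor (illustrative)

print(expected_anonymity_set(100_000, tor_rate))  # ~50: still a crowd
print(expected_anonymity_set(1_000, tor_rate))    # ~0.5: near-unique
```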
        
         | md_ wrote:
         | The batch size requirement is imposed by the FIDO spec, to
         | ensure that batch IDs are not so high entropy as to pose a
         | privacy problem.
         | 
         | "In this Full Basic Attestation model, a large number of
         | authenticators must share the same Attestation certificate and
         | Attestation Private Key in order to provide non-linkability
         | (see Protocol Core Design Considerations). Authenticators can
         | only be identified on a production batch level or an AAID level
         | by their Attestation Certificate, and not individually. A large
         | number of authenticators sharing the same Attestation
         | Certificate provides better privacy, but also makes the related
         | private key a more attractive attack target."
        
       | rvz wrote:
        | How does this so-called 'CAPTCHA replacement' idea compare to
        | Sign In With Apple, which also does not use any CAPTCHAs and aims
        | to prevent bot sign-ups?
        
         | md_ wrote:
         | Apple Sign-In is just an OpenID federated login; these don't
         | inherently provide any anti-automation or rate limiting; they
         | just push the problem to the Identity Provider.
         | 
         | IdPs like Apple/Google/Microsoft might do a fine job of
         | limiting you to "one account per $unit-of-hardware"; Apple in
         | particular can do this via iOS attestation. But then you're
         | limited to either their heuristics (in the case of MSFT/Google)
         | or their hardware (in the case of Apple).
         | 
         | Ultimately this is apples-to-oranges, though, since Cloudflare
         | is not offering an IdP product but simply an anti-automation
         | solution. If you use federated auth, you're getting (and giving
         | up) a lot of other stuff beyond just anti-automation.
        
       | arsome wrote:
        | There are always CAPTCHA bypasses if you're willing to pay;
        | there've been sites operating for decades that will take a
        | captcha URL and spit out the appropriate response by just feeding
        | it to humans. This is just a different way to make you pay, and
        | arguably for something of less ill repute: buying more U2F keys
        | once yours get banned.
       | 
       | This provides effective rate limiting and you can still get every
       | key you automate banned very easily.
        
         | [deleted]
        
         | fierro wrote:
         | this is what the blog post fails to address. Is the FIDO2
         | hardware key approach abusable? Yes. More so than regular image
         | based CAPTCHA? No, like you said, there are well established
         | services for "mechanical-turking" away the problem.
        
         | garaetjjte wrote:
         | If I understand it correctly, you cannot ban attestation key
         | without potentially banning lots of legitimate users.
        
         | WORMS_EAT_WORMS wrote:
         | Second this.
         | 
         | For a major sporting event, one of our sites was heavily
         | targeted by "free TV streaming services" self promoting their
         | stuff.
         | 
         | No amount of Google CAPTCHA or Cloudflare could stop it while
         | keeping it online. Never seen anything like it in my life.
        
           | apple4ever wrote:
           | That makes me so frustrated.
           | 
            | I HATE CAPTCHAs with a passion. They are everywhere and
            | constantly slow me down. And as you mentioned, they are
            | likely not helpful in stopping bots.
        
             | temp667 wrote:
             | I almost never see a captcha. On a static fiber IP - 1GB.
             | Use chrome. Not sure if that matters.
        
       | thejosh wrote:
       | Cloudflare is both a great thing and a terrible thing that has
       | happened to the internet in recent years.
       | 
       | Great in that they have a fantastic UI to add your site in,
       | basically shielding the average user from attacks.
       | 
        | Bad in that now only Google, Bing, and maybe other big search
        | engines have the capability to actually crawl the internet.
       | 
       | I don't see us getting a massive innovation in search on the
       | internet now that Google has such a massive foothold, and
       | companies like Cloudflare stop innovation from happening.
        
         | Beaver117 wrote:
         | Maybe I'm cynical, but I don't see any innovation to be done in
         | search. Google results have become much less useful over the
         | past few years. If they cannot solve search with basically
         | unlimited resources, how is a tiny company going to?
         | 
         | 1. Filtering ever increasing trillions of spam/clickbait pages
         | 
         | 2. Figuring out which results are useful information vs
         | corporates trying to sell something.
         | 
          | Those problems are not solvable by a couple of guys in a garage.
        
           | spiderfarmer wrote:
           | Maybe not, but the garage guys should look into becoming the
           | best search engine in a niche and expand from there.
        
           | janeroe wrote:
           | Maybe I'm cynical, but I don't see any innovation to be done
           | in {AREANAME}. {PRODUCTNAME} has become much less useful over
           | the past few years. If {MONOPOLISTNAME} cannot solve
            | {AREANAME} with basically unlimited resources, how is a tiny
            | company going to?
           | 
           | Nice pasta, can't believe someone would use it unironically.
           | 
           | Maybe I'm cynical, but I don't see any progress to be done in
           | government transparency. USA's government has become way more
           | overreaching and much less transparent over the decades. If
           | USA cannot solve this issue, how is a small country going to?
        
           | JeremyBanks wrote:
           | It's not clear to me that Google still gives a fuck about
           | solving search/organizing the world's information and making it
           | useful. The mess they've incentivized the web to become is
           | very profitable for them.
        
           | SXX wrote:
           | > If they cannot solve search with basically unlimited
           | resources, how is a tiny company going to?
           | 
           | Reminder: Google Is An Ad Company. Are you sure they actually
           | want to solve search? Their primary interest is to be that
           | corporation selling you something.
        
         | jgrahamc wrote:
         | _I don't see us getting a massive innovation in search on the
         | internet now that Google has such a massive foothold, and
         | companies like Cloudflare stop innovation from happening._
         | 
         | How are we "stopping search innovation"?
        
           | thejosh wrote:
           | Hi!
           | 
           | Thanks for taking the time to reply.
           | 
           | You mentioned about "legit" crawlers, what defines a "legit"
           | crawler in the eyes of Cloudflare, and what happens when
           | Cloudflare suddenly decides it does not want to honour that
           | "agreement"? What happens if/when Cloudflare is sold, or the
           | contact who greenlit these smaller "legit" crawlers moves on
           | and their successor decides they no longer agree with said
           | website?
           | 
           | Is a price comparison site a "legit" crawler? What defines a
           | "bot" vs a "crawler" in the eyes of Cloudflare?
           | 
           | Would you need to notify your customers that you now also
           | allow additional crawlers access to their sites, or would
           | they need to opt into it via the Cloudflare dashboard? What
           | happens when you have a falling out with said company (it
           | happens, relationships sour) and suddenly we can't make
           | contact, and customers' websites aren't being scraped
           | because we're "bots"?
        
             | dannyw wrote:
             | There are a lot of hypotheticals here. I think you'd
             | convince CloudFlare, and their customers, if you could name
             | names and mention specific examples.
             | 
             | If you are a price comparison site getting blocked by
             | cloudflare, site owners may be losing sales, and that's
             | good feedback.
        
               | jopsen wrote:
               | Or site owners may actively want to block a price
               | comparison site..
               | 
               | Depending on the industry, etc..
        
               | thejosh wrote:
               | Agreed on this point, and most companies who would want
               | to do these sorts of things would restrict it even if
               | Cloudflare didn't exist.
               | 
               | I really can't think of a good solution. But that's the
               | tricky position Cloudflare is in - how does it balance
               | everything.
        
               | hysan wrote:
               | Price comparison example: https://shucks.top/ sometimes
               | gets blocked by cloudflare. Most recent was getting
               | blocked from checking B&H.
        
           | surround wrote:
           | I trust that cloudflare will act responsibly in allowing
           | small search engines through, but I really, _really_ would
           | rather _not_ have to trust cloudflare. I don't believe that
           | any organization can or will always act responsibly, which is
           | why it's concerning that cloudflare controls so much of the
           | internet.
        
             | EvanAnderson wrote:
             | Yes. This. I believe that John Graham-Cumming is genuine in
             | his statements in this thread re: "contact me if you're
             | running afoul of our controls", for example. If he leaves
             | Cloudflare, Cloudflare "turns evil", etc, then that's all
             | out the window.
             | 
             | Individual companies having so much power gives me the
             | willies.
        
           | pdimitar wrote:
           | By being gatekeepers on which website crawling is okay and
           | which is not.
           | 
           | No such filters should exist. Is it really _that_ awfully bad
           | without anything but basic filters (ban an IP for flooding)?
           | Are there like, operations that try and spam every single
           | Cloudflare-hosted website 24/7?
           | 
           | Legitimately curious if your anti-bot measures come from
           | actual bad experience with the internet or are just a
           | liability-limitation move? (Namely to reduce potential suing
           | surface by angry data owners and/or three-letter agencies.)
           | 
           | ---
           | 
           | Basically, if I am experimenting with a basic crawling
           | program and I hit websites A and B 20 times each in a space
           | of one hour, is that really deserving of a captcha or extra
           | auth methods?
           | 
           | Not flaming but I am really curious. Do you have any data and
           | rationale posted somewhere that go into deeper detail about
           | why Cloudflare's bot detection is how it is?
        
             | fivre wrote:
             | Cloudflare's customers request and then enable those
             | features. Cloudflare itself doesn't give a damn about that
             | traffic; they have bandwidth to spare. They will, however,
             | happily sell tools to people that do care.
             | 
             | That isn't to say that the customers are savvy and have a
             | good understanding of different types of automated traffic
             | and which automated traffic is harmful and which is benign.
             | Many have a quite naive understanding that doesn't extend
             | beyond "bots = bad, unless it's Google" and dial protection
             | settings to the max for no good reason.
        
           | o-__-o wrote:
           | How does my startup crawl Cloudflare sites without paying a
           | hefty fee to Cloudflare?
        
             | jgrahamc wrote:
             | https://news.ycombinator.com/item?id=27153635
        
               | [deleted]
        
               | o-__-o wrote:
               | This will scale wonderfully!
        
               | jgrahamc wrote:
               | No, what scales is us making our DDoS and bot detection
               | not disrupt the crawling of legit search engines that
               | respect robots.txt, don't crawl at ridiculous speeds,
               | don't do dumb stuff like pretend they are the Googlebot.
               | We have teams who work on that. You can read more here:
               | https://blog.cloudflare.com/tag/bots/
               | 
               | But let's suppose someone is building a new cool search
               | engine and our ML stuff is blocking them. Then... contact
               | us/me.
        
               | timlardner wrote:
               | That doesn't sound unreasonable. Out of interest, what
               | would you consider a ridiculous speed to be crawling at?
        
               | LinuxBender wrote:
               | I can't speak for Cloudflare, but crawling speed should
               | be dictated by the site owner via the robots.txt crawl-
               | delay. [1] A site owner could also rate-limit
               | unauthenticated requests by IP _via the cloudflare
               | header_ using a 429 _too many requests_ error page.
               | 
               | [1] - https://en.wikipedia.org/wiki/Robots_exclusion_stan
               | dard#Craw...
        
               | o-__-o wrote:
               | This here is the problem. It's a new era: no one wants
               | to be RFC compliant; just go behind a service and the
               | problem is solved.
               | 
               | So, no problem, time to move on; web search is no longer
               | exciting.
        
               | o-__-o wrote:
               | So for my startup to crawl sites I must now adhere to
               | Cloudflare's Requirements of the Web(TM) or reach out to
               | an individual engineer, who may leave at any moment.
               | Gotcha.
               | 
               | (but Google is allowed because Google was first to
               | market)
        
               | midev wrote:
               | Why would you possibly think you can do whatever you want
               | to someone else's site?
               | 
               | Yes, you must adhere to the controls that site
               | administrators put in place, like Cloudflare.... You
               | don't get to blast my site with requests, just because
               | you want to...
        
               | o-__-o wrote:
               | (a) Who said I was blasting your site with requests?
               | Cloudflare stops much more than just blasts
               | 
               | (b) But you're a-ok with Google doing this. Gated
               | communities aren't really good for anybody but I see what
               | you are saying.
        
               | midev wrote:
               | Gated communities are great. They lower the risk of crime
               | significantly: https://www.sciencedaily.com/releases/2013
               | /03/130320115113.h...
               | 
               | The same is true online. Apple's walled garden has kept
               | hundreds of millions of people safe on their devices. It's
               | why iOS malware isn't a thing.
               | 
               | > Cloudflare stops much more than just blasts
               | 
               | Exactly. There's even more benefit to Cloudflare than
               | just DDoS. Captchas for stopping credential stuffing,
               | for example.
        
               | 77pt77 wrote:
               | It seems to be by design.
        
           | navanchauhan wrote:
           | Crawling a Cloudflare-powered website is basically impossible
           | without resorting to bodges to work around the blocking.
           | 
           | How can you expect someone to crawl a bunch of websites if
           | they are actively blocked from accessing them? You might say
           | site owners can whitelist bots in their robots.txt files, but
           | then will the person creating the engine individually ask
           | companies to allow them to crawl?
           | 
           | Also, slightly unrelated but Cloudflare protected websites
           | are almost impossible to access via tor, the captcha never
           | succeeds.
        
             | jgrahamc wrote:
             | If you are building a search engine and getting blocked you
             | can always contact me and I'll make sure that the teams
             | that work on bot detection and DDoS are aware. We would
             | like to know because we should _not_ be blocking a legit
             | crawler like this.
        
               | throwaway3699 wrote:
               | For every developer that sees this message, a few dozen
               | will have given up.
        
               | new_guy wrote:
               | Exactly this. Cloudflare actively blocks legit crawlers.
               | It shouldn't be dependent on seeing some random hn
               | comment from some random at cloudflare to get that fixed.
        
               | PaulHoule wrote:
               | What makes a web crawler "legit?"
               | 
               | When I had a site that had millions of pages, I found
               | that sites like Baidu would crawl my site as often, if
               | not more often than Google.
               | 
               | I already felt the relationship with Google was
               | parasitic, but I looked through my logs and never found a
               | single hit that came from Baidu and many of the other
               | search engines that would overload my site.
               | 
               | I was looking at a substantial part of the site running
               | costs going to supporting web crawlers that were not
               | doing anything (1) to help me, or (2) to help end users
               | (if they don't want to send Chinese users to an English-
               | speaking web site, why crawl the site?)
               | 
               | So like it or not I am inclined to only allow Google and
               | Bing in the robots.txt because Google is the only site
               | that sends a significant amount of traffic and because
               | Bing sends some, and Google needs some competition.
               | 
               | There are web crawler behaviors that are annoying:
               | harvesting email addresses, overloading your site, etc.
               | But how do you know who is doing something wrong with the
                | data and who is just collecting it to do nothing with it?
               | (Probably 95% of web crawling ex. Google.)
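The allow-list PaulHoule describes would look something like the fragment below (a sketch only: under the robots exclusion standard, an empty `Disallow:` means "allow everything" for that agent):

```
# Allow only Googlebot and Bingbot; disallow all other crawlers.
User-agent: Googlebot
Disallow:

User-agent: Bingbot
Disallow:

User-agent: *
Disallow: /
```

Of course, this only restrains crawlers that honor robots.txt at all, which is the "legit" test discussed in the replies.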
        
               | acdha wrote:
               | > So like it or not I am inclined to only allow Google
               | and Bing in the robots.txt because Google is the only
               | site that sends a significant amount of traffic and
               | because Bing sends some, and Google needs some
               | competition.
               | 
               | This sounds like you're onto a reasonable "legit" factor:
               | does the crawler honor robots.txt? Baidu would be legit
               | because they don't lie about their identity and if you
               | put a rule in your robots.txt file they'll honor it.
        
               | oefrha wrote:
               | Say I'm interested in building a small scale domain-
               | specific search engine and only just started development.
               | There's no prototype yet and may never be. In this
               | situation, how do you determine it's a legit crawler?
               | 
               | And what about crawlers with even more limited scopes
               | (targeting only a handful of sites) that they can't
               | possibly be called search engines? Are they ever
               | considered legit?
        
               | jgrahamc wrote:
               | Be a good netizen? Respect robots.txt. Don't lie in your
               | User-Agent. Don't crawl at a ridiculous rate. All those
               | are a good starting point.
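That checklist can be sketched with Python's standard-library robots.txt parser. This is a minimal illustration, not Cloudflare's criteria: the bot name, the 5-second per-host delay, and the helper names are all assumptions:

```python
from urllib import robotparser

# An honest, hypothetical User-Agent; the URL would explain the bot.
USER_AGENT = "ExampleBot/0.1 (+https://example.com/bot-info)"

def allowed(robots_txt: str, url: str, agent: str = USER_AGENT) -> bool:
    """Check a URL against an already-fetched robots.txt body."""
    rp = robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(agent, url)

class HostThrottle:
    """Track how long to sleep before the next request to each host."""
    def __init__(self, min_delay: float = 5.0):
        self.min_delay = min_delay
        self.last: dict[str, float] = {}

    def wait_time(self, host: str, now: float) -> float:
        prev = self.last.get(host)
        self.last[host] = now
        if prev is None:
            return 0.0
        return max(0.0, self.min_delay - (now - prev))
```

A crawler built this way ticks jgrahamc's three boxes — honoring robots.txt, identifying itself truthfully, and pacing itself — though as other commenters note, that alone doesn't guarantee it won't be blocked.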
        
               | joepie91_ wrote:
               | Do you not realize the more fundamental problem with you,
               | as a company, essentially being the one who gatekeeps
               | crawler access to the web?
        
               | dannyw wrote:
               | I use cloudflare out of my free will because there's
               | malicious traffic out there, and I have enough control
               | over everything.
               | 
               | They're only a gatekeeper because sites voluntarily enter
               | into commercial agreements with them. There's no coercion
               | or manipulation like Google AMP.
        
               | midev wrote:
               | Customers pay for this as a feature. Why would they feel
               | it's a fundamental problem? There's nothing that says
               | admins need to let you crawl their site.
        
               | vidarh wrote:
               | If people intend this to happen, sure. But how many
               | people who put their sites behind Cloudflare are aware
               | that this might be a side effect?
        
               | midev wrote:
               | I would wager that most people that purchase Cloudflare
               | are probably aware of the features it offers
        
               | vidarh wrote:
               | Note the distinction. I'd wager that the vast majority of
               | sites behind Cloudflare are not paying customers, and
               | have not paid much attention beyond "hides my server IP
               | slightly and stops DDOS's", without having thought more -
               | or at all - about the wider implications.
        
               | foobiekr wrote:
               | What is a "ridiculous rate"? Where is it documented?
        
               | oefrha wrote:
               | I think the problem is some IPs just straight-up always
               | get CAPTCHAs from Cloudflare even if one is a good
               | netizen, respects robots.txt, doesn't crawl at a
               | ridiculous rate, and doesn't lie in the user agent. One
               | reason is shared
               | IP, which disproportionally affects people from third
               | world countries as their ISPs don't have enough IPv4 for
               | everyone; but it also happened mysteriously to at least
               | one dedicated IP I used in the past. Your confrontational
               | tone is rather unfortunate, and the problem of course is
               | that you don't guarantee anything even if the user has
               | done nothing wrong, as is manifest from the choice of the
               | phrase "starting point".
        
               | ehutch79 wrote:
               | Then the problem has nothing to do with your crawling.
        
               | Xamayon wrote:
               | Even then, I've run into issues with scraping several
               | sites for the reverse image search engine I operate.
               | Luckily, in most cases I have been able to get in touch
               | with the people running those sites to get a rule added
               | for my IPs to allow them through. That's not scalable
               | though, and limits where I can scrape/crawl from. Even
               | something as simple as checking a site for updates every
               | hour or two tends to get blocked after a few times. TBH,
               | one of the only things I have found which helps is lying
               | in the user agent and copying CF cookies. Luckily, I
               | haven't had to play with that for a few months due to
               | whitelisting, so not sure if it would still help. Things
               | change rapidly.
        
               | jmg03 wrote:
               | What's the best way to contact you?
        
               | jgrahamc wrote:
               | jgc @ cloudflare
        
             | freshair wrote:
             | > _Also, slightly unrelated but Cloudflare protected
             | websites are almost impossible to access via tor, the
             | captcha never succeeds._
             | 
             | Yes, I've never understood why it's seemingly so important
             | to CAPTCHA me before serving me less than 100kb of read
             | only plain jane HTML. What sort of "attack" is this
             | stopping? I'm pretty sure the CAPTCHA itself is bigger than
             | half the sites it blocks me from reading.
        
           | SXX wrote:
           | For instance there is no way for distributed search engines
           | to work with CloudFlare. No, "contact me and we'll help" is
           | not always a solution.
        
             | jgrahamc wrote:
             | Please explain the problem (here or via email to me).
        
               | catillac wrote:
               | I'm really sorry, but you appear to be the CTO of
               | Cloudflare, which makes your not knowing the ins and outs
               | of the problem already and basic questioning of it seem
               | like sealioning.[1]
               | 
               | [1] https://en.m.wikipedia.org/wiki/Sealioning
        
               | jgrahamc wrote:
               | I do not understand what the parent means by a
               | "distributed search engine" and I do not know what
               | problem they are facing.
        
               | viraptor wrote:
               | https://yacy.net/ for example. Each interested node does
               | indexing and serving some chunk of the results.
               | 
               | Or in practice - each node quickly runs into a CloudFlare
               | captcha preventing it from indexing content for a few
               | hours/days. Since CF fronts a lot of the useful internet
               | these days, it means it's effectively working against
               | distributed indexing with its current captcha solution.
        
               | jgrahamc wrote:
               | Thanks. I'll bring this to the attention of the bots and
               | DDoS teams.
        
               | jimktrains2 wrote:
               | Yacy is 18 years old and not exactly obscure. If your
               | bots team is unaware of it it's because they've chosen to
               | be ignorant of it.
        
               | joepie91_ wrote:
               | A search engine which is not run centrally by one
               | organization on infrastructure in a known network, but
               | rather something like YaCy where individual users run
               | crawler nodes on networks that vary over time.
               | 
               | Which makes "contact us for an exception" a no-go, as the
               | relevant source IPs will constantly be changing.
        
               | vntok wrote:
               | Yeah, I certainly don't want those crawlers anywhere near
               | my servers. Blocking by default and letting site admins
               | unblock them should they want to seems like the best way.
               | It is also already how it works with Cloudflare.
        
               | SXX wrote:
               | It is great that you care, and I guess others have
               | already provided some examples, but I'll add my own 2c
               | here. The obvious problem is that a centralized service
               | like CloudFlare creates an entry barrier and makes large
               | players in the search and data-mining markets even more
               | entrenched than ever.
               | 
               | Recently your company announced a partnership with the
               | Internet Archive, but if CloudFlare wants to continue
               | playing the role of a benevolent party, everyone should
               | have equal access to this data. Yeah, it means that some
               | bad actors will be able to easily scrape the web too,
               | but...
               | 
               | CloudFlare's service can't prevent scraping anyway.
               | There are shady residential proxy networks, services to
               | bypass captchas, and scraping software like Zennoposter.
               | It's possible to make scraping more expensive, but bad
               | actors don't care because they have money. Unfortunately,
               | enthusiasts, open source projects and small companies
               | don't have enough resources to do the same.
        
               | dannyw wrote:
               | Making scraping harder definitely reduces scraping. Some
               | bad actors will get through, but others will be deterred.
               | 
               | I think you might not understand that it's site owners
               | like me who want to stop scraping. It usually comes from
               | specific bad incidents, like copycat sites stealing our
               | content and work.
               | 
               | Cloudflare wouldn't block scraping if website owners
               | didn't want it. And website owners can easily disable
               | this protection.
        
               | SXX wrote:
               | Scraping protection is not a problem: the defaults that
               | CloudFlare promotes are. Saying that website owners can
               | disable it is akin to saying website owners should go
               | and whitelist Tor nodes. Most website owners don't
               | understand either issue, and they're never gonna opt
               | out.
               | 
               | Also, I'm talking from experience because I've been on
               | both sides of the fence: doing scraping and implementing
               | protection. So yeah, your CloudFlare protection will
               | deter 10% of bad actors, but will also cut off 99% of
               | enthusiast / research efforts or users of niche software
               | or browsers. Still, anyone with a $1000+ budget will
               | scrape whatever they want.
        
             | foobiekr wrote:
             | That response is just a way to move the discussion out of
             | the public domain without actually addressing it. It's a
             | scam.
        
               | SXX wrote:
               | Calling the offer from the CTO of a big company who came
               | to talk with us a "scam" is very counter-productive. Some
               | of CloudFlare's bad sides are certainly by design and
               | cannot be changed, but they can still change their
               | default filtering policies in a way that will greatly
               | help the open web.
        
               | vidarh wrote:
               | jgrahamc is one of the most active HN users. He's in the
               | top 20 on the "leaderboard". He has a good reputation.
               | I'd hesitate to call this offer a scam.
               | 
               | That said, I don't think it's a good situation that this
               | is the solution rather than a proper, documented position
               | that people can work to.
        
             | PaulHoule wrote:
             | I've never been able to "reach a human" at Google, Facebook
             | and other web giants and I'm skeptical that you can at a
             | place like Cloudflare. In fact, I'd be really astonished if
             | it were possible, because otherwise their business isn't
             | scalable.
        
               | alternize wrote:
               | while I share your sentiments regarding some other
               | companies, I was able to get in touch with an actual
               | cloudflare technician (and not some outsourced first
               | level support with standard boilerplate replies) in a
               | timely manner even on their free tier when I ran into a
               | problem with one of their systems. every support case with
               | them has so far been a real pleasure compared to what you
               | experience with other companies. I only hope they will be
               | able to keep this level...
        
               | yazaddaruvala wrote:
               | The grandparent, jgrahamc, you're responding to is the
               | CTO of Cloudflare.
               | 
               | If this doesn't at-least meet your definition of "reach a
               | human at Cloudflare", I'm not sure what will.
        
               | dannyw wrote:
               | Cloudflare support has been exceptional to me as a
               | website owner.
        
               | SXX wrote:
               | I personally love CloudFlare and (just like with e.g.
               | DigitalOcean) I've always found a way to contact a human
               | there. Unfortunately, it doesn't fix the fundamental issue
               | of how they make the internet much more centralized and
               | easy to MITM or censor.
        
               | adspedia wrote:
               | Did you contact anyone at Cloudflare for an issue not
               | explained in the support docs and you got no response?
        
         | ex_amazon_sde wrote:
         | ...also Cloudflare has been a disaster for Tor. It really harms
         | Tor's usability.
        
         | nabla9 wrote:
         | Around 2000 there was a CS professor who claimed that the
         | internet does not scale: either things choke up or you need
         | huge investments into networks.
         | 
         | It turns out that he was right, sort of. The vanilla attach-a-
         | server-to-the-internet, server-to-client IP network is pretty
         | much dead. It has been replaced with CDNs, private delivery
         | networks, cache on top of cache. Cloudflare, Amazon, Google and
         | MS are the connection points for the IP. Their internal network
         | infrastructure transfers most of the data.
        
           | 542458 wrote:
           | Is it pretty much dead? Yeah, if you're moving FAANG level
           | traffic you need something more fancy than LAMP + an internet
           | connection, but I've seen dozens and dozens of sites with a
           | plain old no-cdn, no-pdn, LAMP tech stack. Working with
           | startups might bias your view - lots of companies are running
           | extremely boring setups and they work just great.
        
             | nabla9 wrote:
             | Respectfully, I think you miss the point.
             | 
             | The fact that 98% of traffic goes through this new
             | infrastructure allows some to still plug their servers into
             | the net raw and have their traffic still get through.
        
               | 1_person wrote:
               | The traffic goes through this "new infrastructure" not
               | out of necessity but because it's free and allows the
               | user to pretend like a number of problems don't exist.
               | 
               | Terrestrial optical networks operate far, far below
               | capacity to create artificial scarcity, which is to a
               | certain extent necessary to recoup the capital
               | expenditures in a competitive market, and is to a certain
               | extent an abuse of an under-regulated natural monopoly.
               | 
               | If you could eliminate all adversarial factors and put
               | every data service subscriber's payment for a single
               | month into a pool, and that pool purchased only
               | transceivers, passive optics and switches, and this
               | hardware was distributed to every network operator
               | perfectly fairly based on its contribution to the global
               | maximization of available network capacity, then the
               | delivered capacity to the end user could increase by
               | something like 4.5 orders of magnitude with no
               | substantial change in topology or subscriber or provider
               | costs afterwards using the existing fiber, with a few
               | more orders of magnitude possible with a fatter tree
               | before the backbone costs explode.
               | 
               | With DWDM you can carry 100+ channels of 100Gbps over a
               | single fiber today, with commodity, off the shelf
               | components. Most fiber in the ground today is probably
               | still lit with a single wave of 10G, if it's not just
               | dark.
               | 
               | This distribution model is not even remotely a technical
                | necessity, it's an arbitrary local minimum reached largely
               | by exploitative market distortions and adversarial
               | economics.
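A quick sanity check of the DWDM figures quoted above (illustrative arithmetic only, using just the numbers in the comment):

```python
import math

# One legacy 10 Gbps wave vs. a fully loaded DWDM system of
# 100 channels x 100 Gbps on the same fiber.
single_wave_bps = 10e9
dwdm_bps = 100 * 100e9

gain = dwdm_bps / single_wave_bps
print(gain)              # 1000x more capacity per fiber
print(math.log10(gain))  # i.e. 3 orders of magnitude
```

So DWDM alone accounts for roughly 3 of the claimed 4.5 orders of magnitude; the rest would presumably come from lighting dark fiber and denser topology.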
        
               | starfallg wrote:
                | >The fact that 98% of traffic goes through this new
               | infrastructure
               | 
               | Doesn't mean that 98% of the value of the Internet
               | results from this traffic.
               | 
               | Even if you discount all of the web, there are still lots
               | of applications using the federated model (e.g. SMTP) or
               | peer-to-peer models (e.g. crypto, VoIP) that require
               | end-to-end connectivity.
        
               | nabla9 wrote:
               | You're expanding this discussion in a completely new
               | direction.
               | 
               | My original point is about capacity and whether the old
               | internet could work today. It seems like I'm correct, but
               | I'm not certain; I would like to see other opinions.
               | 
               | Every response so far is "there exists". The real issue is
               | whether the internet could do everything without caching
               | data near the edge.
        
           | johnklos wrote:
           | No. The marketing of all that extra crap has gotten better.
           | 
           | It's just like Windows - just because 95% of the Internet
           | does something one way doesn't mean it doesn't suck, isn't
           | more complicated than it needs to be, and doesn't cost more
           | than it needs to cost.
           | 
           | Anyone with a little bit of bandwidth and a Raspberry Pi can
           | run a web server, even with dynamic content.
        
         | fierro wrote:
         | This feature is enabled at the behest of the site owner. I feel
         | like site owners and operators should decide who gets to visit
         | their site. Am I missing something obvious here?
        
       | clukic wrote:
       | Cloudflare is the professional wall builder you hire to protect
       | your garden.
       | 
       | Tech monopolies have always had a vested interest in locking up
       | user data, dictating the policies, and enforcing their own
       | ownership rights. It used to be that only the largest and most
       | sophisticated companies had the resources to shield that data,
        | but Cloudflare changed all that. Walls are now trivial to set
        | up and virtually unbreachable, and that has forever changed the
        | character of the internet: monopolistic policies are enforced
        | with such technical precision that they're nearly impossible to
        | overcome.
        
         | WORMS_EAT_WORMS wrote:
         | No offense, this framing is so dumb. I hate it.
         | 
         | The 'Internet 3.0' isn't coming because of Cloudflare. It's
         | coming because these monolith big tech companies have an army
         | of engineers who have been centralizing and building it this
         | way for years.
         | 
         | Cloudflare didn't build these walls; it's more of a giant
         | boat now navigating them, because other companies have no
         | choice.
         | 
         | I like to think of them as a giant data ferryman in this
         | regard, versus "a wall builder".
         | 
         | I'm not saying frustrations aren't warranted but -- like come
         | on -- have a little perspective on what's really happening
         | with the Internet and who is actually driving it.
        
           | clukic wrote:
            | Clearly Cloudflare isn't responsible for the data
            | centralization that is corrupting the internet. They are,
            | however, a very sophisticated and efficient enforcer of
            | those policies. They've helped ensure that large portions
            | of the web are no longer crawlable, which serves to
            | consolidate information and power in those tech monopolies.
        
             | WORMS_EAT_WORMS wrote:
              | In other words, SMBs now have access to the same tools the
              | tech monopolies do.
              | 
              | GDPR-like policies will continue to flood in as governments
              | partition their Internets and data, making it harder and
              | harder to run international Internet businesses.
             | 
             | I'm not particularly happy about things either (especially
             | crawling access), but it will be a net positive whenever
             | you can level the playing field with competition.
             | 
             | When the biggest infringers of data are driving the
             | creation of government policies that only they can
             | circumvent and navigate -- that's a serious, serious
             | problem.
        
             | fierro wrote:
              | Why is it assumed the web ought to be crawlable?
        
       | xfer wrote:
       | I mean, there are people making $1 for solving 1000s of
       | reCAPTCHAs. So I'm not sure how a $40 device isn't an
       | improvement, if your goal is to enforce some rate limiting
       | against scripts using these services.
        
       | mwcampbell wrote:
       | There is no perfect solution, but I'm in favor of anything that's
       | a net improvement in accessibility for disabled people, even if
       | it's not ideal in some other way. So I'm disappointed to see this
       | solution being shot down before it even gets deployed on a large
       | scale.
        
         | emteycz wrote:
         | Right before large scale deployment might be the last moment
         | it's possible to prevent the large scale deployment.
         | 
         | Unfortunately corporations are not good at going a step back if
         | the step forward is good for their business.
        
           | mwcampbell wrote:
           | The trouble is that this change could be good not just for
           | Cloudflare's business, but for people. If it turns out that
           | this new CAPTCHA alternative is an improvement for users, but
           | hurts some businesses who have to put up with a new form of
           | abuse, I think that's a net win. Let's not stop it before it
           | has a chance.
        
             | emteycz wrote:
             | I agree, but what if it's not and going a step back is then
             | refused? Is there really no other way of testing than large
             | scale deployment?
        
       | eatbots wrote:
       | Yep. https://www.hcaptcha.com/why-captchas-will-be-with-us-always
       | 
       | (disclosure: work there)
        
       | rudedogg wrote:
       | I'm so fed up with reCAPTCHA. ~90% of the time it doesn't work on
       | desktop Safari (I can see CORS errors in the console), so I have
       | to use a different browser. Even Gumroad won't let me buy things
       | due to this. It really feels like an anti-competitive "bug" (read
       | feature), and is so annoying it's hard to not just give up and
       | use Chrome.
       | 
       | I feel like I'm crazy - no one else complains. I've mentioned
       | @GumRoad on Twitter but heard nothing back.
        
       | ComodoHacker wrote:
       | Is CAPTCHA a necessity only on the ad-sponsored web? Is there
       | any other compelling use case for it?
       | 
       | Can we make CAPTCHA obsolete with a decent micropayments
       | solution, where you pay for every transaction with every
       | website, just like we pay for every drop of water we use?
       | Perhaps ISPs could handle it for us?
        
         | tomjen3 wrote:
         | Captchas, or some other anti-bot software, are still needed
         | whenever we deal with credit cards, because we are still using
         | the obsolete model where, if you get your hands on the
         | numbers, you can charge any amount you want to whomever you
         | want, rather than a model where your card digitally signs the
         | payment request for the given amount and receiver, which would
         | make any theft pointless.
         | 
         | Anti-bot measures are also used to try to prevent password
         | guessing on, e.g., the login page for Gmail.
         | 
         | Finally, sometimes some places offer things like tickets that
         | go very quickly, in which case having a bot reload the page
         | means the tickets are likely to go to somebody owning a bot
         | rather than a fan of the performer.
         | 
         | None of these cases are solved by payments; they are solved by
         | client-side certificates and, in the last case, by requiring
         | the names of the people who are to use the tickets.
        
         | Ayesh wrote:
         | How do you know whom to pay?
         | 
         | Sure, you are paying for every drop of water, but what if you
         | really want to pay for water from a specific region, don't
         | want water from another region, and need to trust that the
         | water company does not keep a cut or rip off either of you?
        
           | infogulch wrote:
            | HTTPS with the origin?
        
         | osmarks wrote:
         | I can't see that being very popular. Even if it doesn't
         | actually cost you much in absolute terms, billing per page will
         | make people a lot more reluctant to explore new content.
        
           | kevincox wrote:
           | This is an often neglected benefit of "Unlimited" plans. It
           | changes the feeling of consuming. You have already paid so
           | you may as well enjoy instead of asking "Do I really want to
           | pay for this?" at every use.
           | 
           | From a technical point of view it is possible. Assuming the
           | payments were mediated by some party, that party could issue
           | statements like "this user has used their monthly allowance
           | but they would pay". Assuming that this provider is widely
           | trusted, websites may treat this as a "real user" and allow
           | the visit. (This is roughly how https://coil.com/ works.) Of
           | course there are negative implications, such as making it
           | very difficult for new providers to get started.
           | 
           | You can also imagine some type of smart contract where the
           | subscription fee is split at the end of the day or month
           | amongst the visited sites. Upon a visit, each site just gets
           | a token for one share. Of course this would need to be very
           | carefully designed to prevent abuse (for example, one
           | malicious client splitting their subscription across
           | millions of pages).
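The end-of-period split described above can be sketched in a few lines. This is a toy illustration only: the function name and the share cap are hypothetical, and a real design would need the careful anti-abuse work the comment mentions.

```python
from collections import Counter

def split_subscription(fee_cents, visit_tokens, max_shares=50):
    """Split one subscriber's monthly fee among the sites they visited.

    visit_tokens: a list of site identifiers, one token per visit,
    mirroring the "token for one share" idea above. max_shares caps how
    many shares a single subscription can mint, as a crude defence
    against one malicious client splitting their fee across millions of
    pages.
    """
    shares = Counter(visit_tokens)
    total = sum(shares.values())
    if total > max_shares:
        # Scale every site's share down so the total stays capped.
        shares = {site: n * max_shares / total for site, n in shares.items()}
        total = max_shares
    # Each site's payout is proportional to its share of the visits.
    return {site: fee_cents * n / total for site, n in shares.items()}
```

For example, a $10.00 fee over visits `["a", "a", "b", "c"]` pays site `a` 500 cents and sites `b` and `c` 250 cents each.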
        
       | lsaferite wrote:
       | Privacy seems like a bad argument considering CF already has the
       | technical ability to easily track you across all of the sites
       | they front if they so desire.
        
         | wbkang wrote:
         | U2F generates a different key for each domain (conceptually),
         | so this should potentially be harder to track.
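The per-domain key idea can be made concrete with a tiny sketch. This is a hypothetical derivation (real U2F authenticators use key wrapping or generate a fresh key pair per registration), shown only to illustrate the unlinkability property: two origins see unrelated keys that cannot be correlated.

```python
import hashlib
import hmac

def per_origin_key(device_secret: bytes, origin: str) -> bytes:
    """Derive a distinct credential key per origin from a device secret.

    Hypothetical scheme: HMAC-SHA256 keyed by the device's master
    secret, over the origin string. Different origins yield
    cryptographically unrelated outputs, so example.com and example.org
    cannot link the same device across sites.
    """
    return hmac.new(device_secret, origin.encode(), hashlib.sha256).digest()
```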
        
       | djoldman wrote:
       | Can a FIDO key be implemented in software? Can you write a
       | program to register a FIDO key as a multi-factor authentication
       | device with a Google account?
       | 
       | Or is there some repository of all allowed devices with
       | identifiers? Intuitively that'd be the only way to prevent
       | infinite virtual devices..
        
         | floatboth wrote:
         | Attestation (what they use) is orthogonal to authentication.
         | Token manufacturers have per-batch keys, with the private key
         | being in the devices of that batch, so sites can verify that
         | your device is from that batch of that vendor. You "can"
         | implement attestation with your own key in software or in
         | whatever, but Cloudflare won't trust your key :D
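A much-simplified sketch of the batch-attestation flow described above. The batch names are made up, and an HMAC stands in for the real ECDSA attestation signature and X.509 certificate chain so the example stays self-contained and runnable with the standard library.

```python
import hashlib
import hmac
import secrets

# Hypothetical per-batch attestation secrets. In real FIDO, each batch
# holds an ECDSA private key and the relying party verifies an X.509
# attestation certificate chain instead of a shared MAC key.
BATCH_KEYS = {"vendor-batch-0042": secrets.token_bytes(32)}
DISTRUSTED_BATCHES = {"vendor-batch-leaked"}  # batches known compromised

def sign_attestation(batch_id, authenticator_data, client_data_hash):
    """Token side: prove membership in a manufacturing batch."""
    msg = authenticator_data + client_data_hash
    return hmac.new(BATCH_KEYS[batch_id], msg, hashlib.sha256).digest()

def verify_attestation(batch_id, authenticator_data, client_data_hash, sig):
    """Relying-party side: accept only tokens from trusted batches.

    Note this proves "some device from batch X", never which device --
    the privacy property being discussed in this thread.
    """
    if batch_id in DISTRUSTED_BATCHES or batch_id not in BATCH_KEYS:
        return False
    msg = authenticator_data + client_data_hash
    expected = hmac.new(BATCH_KEYS[batch_id], msg, hashlib.sha256).digest()
    return hmac.compare_digest(expected, sig)
```

The `DISTRUSTED_BATCHES` check also illustrates the revocation trade-off raised elsewhere in the thread: distrusting one leaked batch key rejects every honest device in that batch.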
        
           | djoldman wrote:
           | So how does cloudflare know to not trust your key? They know
           | some secrets from the token manufacturers and test your
           | response against them?
           | 
           | If so, what if the secrets get out? Then all the keys in
           | those batches are poisoned?
           | 
           | Isn't this just some sort of side channel certificate
           | authority?
        
         | wereHamster wrote:
         | Yes. Example: https://github.com/github/SoftU2F
        
       | sdfhbdf wrote:
       | See also the original announcement:
       | https://news.ycombinator.com/item?id=27141593
        
       | PaulHoule wrote:
       | The build quality of those Yubikeys freaks me out. I wonder how
       | many insertions it takes until something shorts and my
       | motherboard gets damaged.
        
       | swiley wrote:
       | So visiting cloudflare sites with TOR requires you to identify
       | yourself? That's not great.
        
         | ComodoHacker wrote:
         | Can you see the difference between 'identify yourself' and
         | 'prove you're human'?
        
           | swiley wrote:
           | No?
        
         | fuckyouriotshit wrote:
         | Visiting many CloudFlare sites with Tor was impossible the last
         | time I checked because their CAPTCHA is broken and has been for
         | a long time.
         | 
         | I know for a fact that there are staff at CloudFlare who are
         | aware of this problem but nothing has changed, so I guess that
         | they don't care that they are making some sites unavailable to
         | anyone who has to use Tor.
        
         | fuckyouriotshit wrote:
         | Edit: I re-read the section on the U2F batch keys and
         | understand that the design intent is to be unable to track
         | individual tokens across sites (only batches of a size decided
          | by the token manufacturer). It's not completely clear to me if
          | the crypto involved is resistant to an attacker who can collect
          | the handshakes and then later gets access to the key(s) that
          | are meant to be private to the manufacturer(s), but I
          | acknowledge that the intent is decent. My points still stand,
          | however.
         | 
          | This sort of "we can solve that problem; we just need to kill
          | your privacy" thinking seems to be par for the course at
          | SV-style companies.
         | 
         | I really wonder if anyone involved with building these systems
         | has ever seriously thought about what could happen if the data
         | collected (or that could be collected) by these systems was
         | obtained by an adversary.
         | 
          | Not to mention the incredible incentive problems created by
          | designing things in a way that _requires_ individuals to be
          | tracked across the internet.
         | 
          | I know that CloudFlare is just one of many companies moving
          | in this direction, and they're certainly not the worst
          | offenders when it comes to slowly murdering individual
          | privacy (Facebook and Google are obviously far worse), but
          | they have a uniquely powerful position due to the number of
          | sites that use their DDoS protection, and they seem to show a
          | casual disregard for the damage they can do to people's
          | privacy.
        
         | ignoramous wrote:
         | They implemented _Privacy Pass_ for that, which is kind of neat
         | [0] and related to another standard for authn viz. OPAQUE that
         | I really like [1].
         | 
         | [0] https://github.com/privacypass
         | 
         | [1] https://news.ycombinator.com/item?id=25346632
        
       | FriedrichN wrote:
       | I hate this new web where you're automatically assumed to be some
       | malicious actor only because you don't accept cookies and strange
       | third party code and then have to jump through hoops to show that
       | you're not some evil bot. To be honest, if a website immediately
       | throws some Cloudflare anti-DDoS thing in my face I don't even
       | bother anymore.
        
         | apple4ever wrote:
         | I agree. Browsing the web can be super frustrating when there
         | is a CAPTCHA every other page.
        
         | jfengel wrote:
         | We all do. Everybody who's ever worked in security hates that
         | even the tiniest hole in your security will be squeezed
         | through. If you don't distrust every single packet, then sooner
         | or later one of those packets is going to destroy you.
         | 
         | It's basically the same both ways. You don't trust them with
         | your private info. They don't trust you, either. The easiest
         | way is, indeed, to just call the whole thing off.
         | 
         | Everybody would love an alternative that lets more get done
         | with less trust. Sometimes they find them, for limited cases.
         | But nobody's solved it for the general case.
        
       | gsich wrote:
       | What am I missing? I get this device and can crawl all I want?
        
       | [deleted]
        
       | danShumway wrote:
       | Complete aside, but I'm still not certain I understand the
       | technical details of why Cloudflare can't uniquely identify
       | users. I thought I knew how hardware keys worked, but apparently
       | I don't.
       | 
       | If the key being shared is embedded in the device, even in a
       | secure enclave or something, then my understanding was that would
       | open the door for key extraction. If the key is unique per-
       | device, then that's not a problem. But if the key is unique
       | per-10,000 and stored statically on the device, then hacking one
       | device means that key can be released to anyone and the entire
       | pool can be imitated.
       | 
       | So if the above is correct, it can't be that a single private key
       | shared across the entire company is stored on the device because
       | that key would be getting constantly extracted and leaked by some
       | determined hacker somewhere. But if it's a unique key per-device,
       | then... I just can't figure out how validating that key wouldn't
        | require transmitting unique information to _somebody_, whether
       | it's Cloudflare or the device manufacturer.
       | 
       | Where am I going wrong? I feel like I'm misunderstanding
       | something fundamental about how signing works on these devices,
       | but I can't figure out what it is. If I buy a Yubikey, is it
       | connecting to the manufacturer's servers and getting a new key
       | each time it's used? I thought they worked offline.
       | 
       | Or are secure enclaves just much more secure than I think they
       | are? Are we assuming that it's impossible to extract a private
       | key from one of these devices?
        
         | fuckyouriotshit wrote:
         | > Are we assuming that it's impossible to extract a private key
         | from one of these devices?
         | 
          | Nope, it's just "expensive" to extract keys from secure
          | hardware like this. The problem with an approach like this is
          | that when the secret keys are identical for a large number of
          | devices, the cost of revoking a compromised key goes up
          | significantly. For a spammer, that would increase the value
          | of obtaining said key, because of the likelihood that the key
          | would be usable for much longer than if the key were unique
          | to each device (and could be very easily blacklisted).
         | 
         | Techniques to extract data from these sorts of secure devices
         | include various forms of side-channel analysis, decapping and
         | microprobing the IC, using SEM, etc. to physically damage parts
         | of the circuit to try to force it to disclose the key and
         | various forms of power and clock glitching.
         | 
          | Most decent hardware-based cryptosystems are designed to
          | ensure that each device has a unique key, so that the cost of
          | extracting one key (let's say around $100k) is too high for a
          | potential attacker if the key can just be blacklisted; but if
          | the key is expensive/impossible to blacklist, then that cost
          | might be worthwhile to an attacker.
        
         | md_ wrote:
         | https://fidoalliance.org/fido-technotes-the-truth-about-atte...
         | explains this pretty well.
         | 
         | Basically:
         | 
         | * Attestation keys are not unique per authenticator; they're
         | shared among batches of authenticators.
         | 
          | * If you extract the batch's attestation key, you _can_
          | imitate authenticators from that batch. That doesn't mean you
          | can authenticate as a registered authenticator, of course; it
          | just means you can pretend to be a "Yubico XYZ" device.
         | 
         | * Yes, I think Cloudflare is assuming it's hard to extract the
         | attestation key. I think this is basically a safe assumption,
         | but if it isn't, they can always choose to distrust batches
         | known to be compromised.
        
           | danShumway wrote:
           | Thanks, that's really helpful. Followup questions though:
           | 
           | - Does this mean if I buy 2 of these devices at the same
           | time, it's possible for me to get the same attestation keys
           | on both devices? I guess depends on how many batches at a
           | time a company is producing.
           | 
           | - Doesn't this mean that attestation keys will get more
           | unique over time as devices from the pool fall out of
           | circulation and become rarer? Are keys rotated to prevent
           | that (ie, would a manufacturer ever re-release a new pool
           | with the same keys as an old one)?
        
             | gjvr wrote:
             | I understand from [0] that the attestation key is shared
             | across all instances (SNs) of the same _model_ (PN):
             | "...For example, all YubiKey 4 devices would have the same
             | attestation certificate; or all Samsung Galaxy S8's would
              | have the same attestation certificate". So you would not
              | need to buy them at the same time.
             | 
              | But of course, despite this, a unique key is still
              | generated for each identity upon sign-up [0]. I am not
              | sure about (as in 'have no knowledge of') the entropy for
              | these devices.
             | 
             | [0] https://fidoalliance.org/fido-technotes-the-truth-
             | about-atte...
        
             | md_ wrote:
             | - Yes.
             | 
              | - To my (limited) knowledge, yes, you are right that keys
              | will become more identifying over time. That's a very
              | good point. Keys are not rotated, nor (generally) are
              | they rotatable; they are usually read-only. If you are
              | using a very old FIDO device and are worried it is too
              | identifiable now--like, if it's a "rare" or "vintage"
              | device!--then you should buy a new one, I guess?
             | 
             | (I honestly have not thought about your second point
             | before, but I am not really deep in the FIDO stuff. So take
             | my answer with a grain of salt.)
        
               | fuckyouriotshit wrote:
               | Manufacturers don't necessarily have to rotate the keys
               | on older devices; they could rotate the keys on newer
               | devices such that it's difficult to reliably tell what
               | batch/generation a newer device is from, because it could
               | be using a newer or older key.
               | 
               | Such behavior would require some way of revoking old keys
               | from newer devices to prevent a situation where a
               | compromised and blacklisted old key is selected and
               | causes the CAPTCHA to fail, seemingly at random.
        
           | dane-pgp wrote:
           | > they can always choose to distrust batches known to be
           | compromised.
           | 
           | Which effectively means bricking the devices of 9999 innocent
           | users each time.
           | 
           | Why are we creating a world where users will be told they
           | can't visit a website or access their account any more
           | because they didn't spend enough money on a hardware DRM
           | device which tries to hide a key from them?
        
         | ignoramous wrote:
          | > _Where am I going wrong? I feel like I'm misunderstanding
          | something fundamental about how signing works on these
          | devices, but I can't figure out what it is._
         | 
         | Whilst Fast Identity Online (FIDO) is much more than WebAuthn,
         | Cloudflare's proposal here is to use WebAuthn to get rid of
         | CAPTCHAs. The official WebAuthn doc is surprisingly accessible
         | with neat illustrations for key topics:
          | https://w3c.github.io/webauthn/ (see the registration and
          | authentication ceremonies, in particular)
        
       | ve55 wrote:
       | The way I'd put it is that Cloudflare's suggested implementation
       | may have its issues, but the general idea of trying to verify
       | that someone is a human and then providing this verification to
       | services in a way that is 1) anonymous and 2) cross-compatible
       | with other services, is the correct way to go about things (or at
       | least has some very appealing features).
       | 
       | I hope that we have something in the future that does this job
       | very well so that services do not need to verify phone numbers,
       | Google accounts, and even IDs and facial imagery just to allow
       | someone to use them (as this is much easier to do than coming up
       | with new captcha styles that humans can quickly and easily solve,
       | but that basic machine learning and scripting cannot).
       | 
       | Being able to use the Internet with the slightest bit of privacy
       | is already ~impossible for the average user and extremely
       | difficult and tedious for very knowledgeable and experienced
       | ones, so anything that tries to improve the current trend sounds
       | like it's at least attacking a problem worthy of our attention.
        
         | anothergram wrote:
          | Alternatively, if services demand a fee, then there is no
          | need for human verification.
          | 
          | Instead of trying to solve anonymous human verification, we
          | could just as well make micropayments an option.
        
           | bo1024 wrote:
           | I really hate it when I'm trying to spend money at a company
           | and get hit with a captcha box right as I click "checkout". I
           | could see it for selling scarce items like concert tickets,
           | but in general it's very insulting, annoying, and off-putting
           | to me.
        
             | kevincox wrote:
             | Credit card fraud that results in chargebacks is a very
              | significant cost to a lot of online stores. So while it
              | does suck, it isn't the shop that is to blame.
        
           | sroussey wrote:
           | Once logged in perhaps. But credential stuffing is a thing.
        
           | derefr wrote:
           | Sometimes serving 429s/403s to unauthed users is already
           | costing you too much in egress bandwidth bills. That's one of
           | Cloudflare's main propositions: stop that "idiot bot that
           | will never get what it wants, but keeps requesting it anyway"
           | traffic outside your network. (Note: not the same as a DoS!
           | Usually not intentional, and usually not actually bringing
           | your infra down. Just costing you money, while not making you
           | any money.)
        
           | TimothyBJacobs wrote:
           | A small micropayment makes for a great way for bad actors to
           | test stolen credit card numbers.
        
             | eikenberry wrote:
              | You can't do micropayments with credit cards; they have
              | too much $$ overhead. You'd need to pay into a service
              | that would handle the micropayments, and they'd offer
              | minimum packages to buy to mitigate these issues.
        
       | StavrosK wrote:
       | Is this post missing the point, or am I? I thought that the point
       | of requiring attestation is to prove that someone actually did go
       | out and buy a legitimate Yubikey (or whatnot) and ban that key if
       | they're spamming.
       | 
       | With those two considerations, this actually seems like a really
       | good idea to me.
        
         | stickfigure wrote:
         | It doesn't appear to identify any specific key, just that the
         | user has _a_ yubikey. You could only ban a whole key
         | manufacturer (or key batch, however large that is).
        
       | sammy2244 wrote:
       | This article is not very well written... "security keys are
       | quiet fast" [sic]
        
       | djoldman wrote:
       | It seems to me that forcing the user to go through captcha is a
       | big negative user experience.
       | 
       | Google must be docking points from websites that employ captcha
       | then, right?
        
         | kevincox wrote:
         | It doesn't matter. Googlebot is whitelisted.
         | 
         | (Partially sarcasm, I do know that Google does do some anti-
         | cloaking crawling)
        
       | viraptor wrote:
       | Even if we ignore the technical reasons, for me CloudFlare's
       | proposal fails at their "Associate a unique ID to your key"
       | property, where they say CloudFlare could, but won't do it. If
       | they implement this scheme they start normalising this approach.
       | Once it gets to FB's and Google's implementations, their answer
       | will be: we could, but we... look! a squirrel!
        
         | tialaramex wrote:
         | Their document says, correctly, that the means by which they
         | could try to do this would be to shove the arbitrary random ID
         | they get into a cookie.
         | 
         | You may have noticed that both Facebook and Google already use
         | cookies. Did you know Hacker News has a cookie too?
        
       ___________________________________________________________________
       (page generated 2021-05-14 23:01 UTC)