[HN Gopher] Cloudflare's CAPTCHA replacement with FIDO2/WebAuthn...
___________________________________________________________________
Cloudflare's CAPTCHA replacement with FIDO2/WebAuthn is a bad idea
Author : herrjemand
Score : 191 points
Date : 2021-05-14 11:57 UTC (11 hours ago)
(HTM) web link (herrjemand.medium.com)
(TXT) w3m dump (herrjemand.medium.com)
| grishka wrote:
| Cloudflare captchas in particular, and any checks and roadblocks
| to _see something publicly available_ in general, are terrible,
  | period. It doesn't matter which form they take. Every time you
| see one you feel like a second-class citizen and get reminded
| that the internet is no longer what it used to be.
|
| I personally simply close the tab when I see a cloudflare "one
| more step" page.
| fierro wrote:
| what? Have you ever dealt with a DDoS attack and the
| consequences on your availability and infra health?
| stingraycharles wrote:
| Of course from the perspective of the website operator it's
| great, but from the perspective of the user it's frustrating.
|
| I'm not sure whether this is true, but it seems like with
| Firefox I get these captchas much more often than with
| Chrome. Sometimes they're so difficult to solve it really
      | takes a minute or two, and it's an incredibly disruptive,
      | unfriendly interaction.
|
| Surely there must be a better way to deal with this? Why do I
| have to keep proving again and again and again to Cloudflare
| I am, in fact, a person?
| grishka wrote:
| Are ddos attacks a common enough occurrence to warrant
| putting half the internet behind ddos protection? In my
| impression you need to do something really wrong to deserve
| one.
| tick_tock_tick wrote:
          | Yes, they absolutely are. Hell, just a few random scraping
          | bots stuck in a loop or being overly aggressive on your
          | site is enough to double your bill. So yeah, it's 100%
| required.
| midev wrote:
| Yes, attacks on the web are very common for any decent
| sized site. Just because something is publicly available
| doesn't mean you get unfettered access to do whatever you
| want.
| livueta wrote:
| That's unfortunately not true at all in my experience.
| Maybe if you're an anodyne SAAS, but if you host any user-
| generated content, especially if it's adjacent to gaming
| (my personal experience was mostly with gaming-related
| forums and IRC networks), politics or any other charged
| topic, expect to get hammered on a pretty frequent basis.
| IoT botnets are pretty easy to rent at this point, so the
| attack is accessible to every skid known to mankind.
|
| I actually agree with your overall point as I try to use
| Tor for a lot of "normal" browsing, but I'm not sure what
| the correct solution to accommodate both is. It's a hard
| problem, and having been in that position myself I have a
| hard time faulting small website operators who have no
| alternative defenses.
|
| e: just to add to this, I see the existence of ddosing as a
| significant driver towards centralized monolithic services.
| If your blog on Palestinian rights or whatever is getting
| hit, that's an incentive to move it to a platform that
| takes care of networking for you. It's a little absurd to
| go all-in on decentralized self-hosting without at least an
| acknowledgement that with current tech and typical
| personal-computing budgets, doing so is giving a heckler's
| veto to literally everyone. Cloudflare isn't the only
| dimension things can be centralized along.
| midev wrote:
| This is completely wrong. Site administrators can put any
| controls they want in place to limit access. I don't know where
| you get the idea that things on the Internet need to be
| publicly available or without restriction.
|
| Unless you're an original ARPANET contributor, there have
| always been attempts to control access and stop attacks. You're
    | making the same mistake every conservative does: longing for a
| nostalgia that never existed.
| ehutch79 wrote:
| How do you mitigate ddos attacks and other bad actors hitting a
| page?
|
| What does your cdn solution look like?
|
| Route optimization from your (single) endpoint to clients
| literally half a world away?
| grishka wrote:
      | As a user, I simply don't care. I repeatedly get punished for
| doing nothing wrong. It's almost like airport security.
|
| > What does your cdn solution look like?
|
| > Route optimization from your (single) endpoint to clients
| literally half a world away?
|
| And as a developer, I don't understand this newfangled
| obsession over CDNs either. Yes, there will be 200 ms RTT in
| some cases. So what? Get over it. Optimize your website to
| load in fewer round-trips. TCP congestion control adapts well
| enough to any latency. RTT only really matters in gaming and
| VoIP.
| ehutch79 wrote:
| I don't think you understand why that captcha is there in
| the first place then.
|
| Cloudflare prevents a bunch of crap that site operators
| just don't want to deal with. Especially for smaller sites
| that are run by one person. Dealing with a wordpress site
| getting hacked because you missed an update by a day, or a
        | bulletin board getting swarmed with bots, or some asshat
| ddos'ing your site because you banned them. Suddenly that
| site just isn't worth running.
|
        | Complaining about a thing that prevents that headache
        | because it's a minor inconvenience to you is so
        | self-centered it boggles the mind.
| grishka wrote:
| Yeah, so centralizing the entire internet around a black
| box that sees all your traffic in cleartext is clearly
| the right solution. /s
|
| > Dealing with a wordpress site getting hacked because
| you missed an update by a day
|
| Maybe don't use something this vulnerable then and rely
| on a third party to protect you from exploits.
|
          | > or a bulletin board getting swarmed with bots
|
| Maybe require email verification and/or a captcha _when
          | signing up or posting_. Don't punish people for
| _passive_ actions.
|
| Somehow, there are many forums that aren't behind
| cloudflare, yet there are no spam bots.
|
| > or some asshat ddos'ing your site because you banned
| them
|
          | Are you sure DDoS is such an everyday occurrence?
|
| I just don't understand. I run a personal website.
| There's literally nothing to "deal" with. I set it all up
| once and it works. I only have to pay for the server and
| for the domains on time.
| ehutch79 wrote:
| Monocultures are always bad, but I don't see any
| alternative services with this level of ease of use.
|
| You're definitely overestimating the technical
            | expertise/available time of a lot of small-time admins out
| there.
|
| You don't see bots and spam on those forums either
| because they are actually using cloudflare, and you're
| just not seeing the captcha, or because in the backend
| they're feeding all their posts through akismet (in plain
| text). I don't think you're considering how many services
| see your posts, even when you don't trip a captcha.
|
| email accounts are trivial to sign up for, especially for
| bots. I always recommend charging $1 (or local
| equivalent) for an account, that's a lot harder to fake.
|
            | My point in all this is that bitching that a site is
            | using cloudflare to not have to deal with crap is a
            | self-centered view.
|
| Saying "well it never happened to me, so it must never
            | happen" is similarly self-absorbed.
|
            | Maybe consider that your experience is not everyone's
            | experience.
| jjav wrote:
            | > My point in all this is that bitching that a site is
            | using cloudflare to not have to deal with crap is a
            | self-centered view.
|
| Who is serving whom here?
|
| If a business thinks it's ok to impose cloudflare
            | inconvenience on me, the customer, for the privilege of
| giving them my money, who is self centered here?
|
| The simple answer is I'll close the tab and go buy it
| from a competitor. I'm not playing captcha games to buy
| something.
| [deleted]
| ehutch79 wrote:
| Why do I feel like you ask to speak to managers a lot?
| ehutch79 wrote:
| that 200ms rtt does matter to users. it becomes very
| noticeable. especially when you're writing an app, not a
| brochure site. You need to tree shake so you're not serving
| a huge spa all at once.
|
        | I've seen much worse times for users, and a cdn absolutely
        | helps our staff in asia dealing with our internal apps.
|
| Of course they're not always tripping up cloudflare and
| being shown captchas. I almost _never_ see a cloudflare
| captcha either... huh...
| gsich wrote:
          | What use is an app that requires a full RTT for every
          | button press?
| ehutch79 wrote:
| None? Why would every button press require a full rtt?
| gsich wrote:
| >that 200ms rtt does matter to users. it becomes very
| noticeable. especially when you're writing an app, not a
| brochure site. You need to tree shake so you're not
| serving a huge spa all at once.
|
| Because this implies that.
| ehutch79 wrote:
| No it doesn't. at all.
|
| Page load speeds by themselves can be painful. Open
| devtools and have your browser throttle to poor 3g
| speeds.
|
| Try browsing around. even well optimized sites.
|
| Now try uploading a couple dozen files through an api.
|
| This is legit what some users deal with. In New York
| state even, you don't need to go that far to find poor
| connectivity.
|
| Even if all your users have awesome home connections,
                | think sales people traveling to a client, or on-site
| inspections of a manufacturer in a warehouse that's
| mostly metal and has bad wifi.
| gsich wrote:
                  | Then what does the 200ms have to do with "writing an app"?
| ehutch79 wrote:
| Apologies, your username looks like the one that tossed
                    | that number out as what they assumed was a high number.
|
| 200ms latency isn't that bad, but I'm seeing more
| 800-2000ms latencies with some users depending on
| physical location. at some point latency kills usability.
| Especially when trying to get through a complicated QA or
| inventory process.
| Dylan16807 wrote:
| If you're on 3G I would expect sites to load in a
                  | similarly bad way with or without at most an extra
                  | 200ms of RTT.
| ehutch79 wrote:
| The throttling in dev tools is meant to represent that
| latency...
| Dylan16807 wrote:
| If setting it to 3G is supposed to represent _just_ 200ms
                      | latency, that's going to give you a very exaggerated and
| misleading impression of how bad it is. It's a
| meaningless test.
|
| I thought you were giving an example of how bad
| connections can get, and saying that the extra latency
| would make it worse, but in that situation it's a drop in
| the bucket.
| aseipp wrote:
| > Yes, there will be 200 ms RTT in some cases. So what? Get
| over it.
|
| You're missing a zero in that RTT for users in places like
| Asia if your server is anywhere in the west. (It's actually
| somewhat revealing when someone throws out a number like
| this without any qualification; what exactly made you
| conclude 200ms is the magic number?)
|
| > Optimize your website to load in fewer round-trips. TCP
| congestion control adapts well enough to any latency. RTT
| only really matters in gaming and VoIP.
|
| You don't need a very big imagination to think about cases
| where RTT will have significant impacts e.g. in the event
| you need to issue multiple sequential requests that are
        | dependent on one another. These are unavoidable and occur
        | often in more than just websites, in anything that e.g.
        | uses HTTP as an API (a very simple one is something like
| recursively downloading dependencies.)
|
| This comes across as a classic "I don't actually understand
| the problem domain very well at all, but get off my lawn"
| answer to the problem.
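The round-trip cost of dependency chains like the ones described above can be sketched with trivial arithmetic; the numbers below are illustrative assumptions, not measurements:

```python
def total_latency_ms(rtt_ms: float, chain_depth: int) -> float:
    """Each request in a dependency chain (page -> script -> data,
    or package -> transitive dependency) pays a full round trip,
    since each request can only start after the previous response."""
    return rtt_ms * chain_depth

# One request at 200 ms RTT feels tolerable...
assert total_latency_ms(200, 1) == 200
# ...but a 6-deep chain of dependent requests already costs over a
# second, before any server processing or transfer time.
assert total_latency_ms(200, 6) == 1200
```

A CDN that terminates connections close to the user shortens every hop in the chain, which is why the multiplier matters more than any single RTT figure.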
| grishka wrote:
| > You're missing a zero in that RTT for users in places
| like Asia if your server is anywhere in the west.
|
| Well, it does say 130 ms in here:
          | https://www.quora.com/How-long-would-it-take-for-light-to-fl...
|
| And that's _around_ the planet, to go around and end up
| at the same spot. In practice, with sanely-configured
| routes, your packets should never need to traverse more
| than half that distance. So, divide it by 2, then that
| cancels out because RTT is a measure of how long it takes
| for a signal to travel back and forth. You then add some
| time on top of that to account for buffering and
| processing in the various equipment along the way.
|
| > These are unavoidable and occur often in more than just
| websites, but anything that e.g. uses HTTP as an API (a
| very simple one is something like recursively downloading
| dependencies.)
|
| If you mean REST API requests, the kind that trigger some
| code to dynamically generate a response, how would a CDN
| solution like cloudflare help? The request still needs to
| get to the server and the response still needs to come
| back, all the way, because that's where that code runs.
| CDNs only really work for cacheable static content, don't
| they? I mean it's in the name.
|
| A blog or a news website certainly doesn't need a CDN.
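The speed-of-light estimate being debated here can be checked with a back-of-envelope calculation; the 1.5x fiber slowdown is an assumed typical figure (light in fiber travels slower than in vacuum), and real paths add detours, buffering, and processing on top:

```python
EARTH_CIRCUMFERENCE_KM = 40_000
C_VACUUM_KM_PER_S = 300_000
FIBER_FACTOR = 1.5  # assumed typical slowdown of light in fiber vs vacuum

def min_rtt_ms(one_way_km: float, fiber: bool = True) -> float:
    """Physical lower bound on round-trip time for a given path length."""
    speed = C_VACUUM_KM_PER_S / (FIBER_FACTOR if fiber else 1.0)
    return 2 * one_way_km / speed * 1000

antipodal = EARTH_CIRCUMFERENCE_KM / 2  # worst-case one-way distance

assert round(min_rtt_ms(antipodal, fiber=False)) == 133  # the ~130 ms figure
assert round(min_rtt_ms(antipodal, fiber=True)) == 200   # fiber floor
```

So even the theoretical fiber floor for an antipodal round trip is around 200 ms, before routing and queuing are accounted for.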
| bayindirh wrote:
| > Well, it does say 130 ms in here.
|
| If you have a fiber backed, all-switched network with no
            | routing, buffers, congestion, or detours, you may get
            | that value, _if you're lucky_.
|
| Pinging tty.sdf.org which is a direct access shell
| service in USA from somewhere between Europe and Asia,
| from an _academic network backbone_ roundtrips in ~190ms.
            | I'm traversing a little less than half a globe with the
| whole journey. In your terms, it should be around ~60ms,
| but it's not.
|
| > If you mean REST API requests, the kind that trigger
| some code to dynamically generate a response, how would a
| CDN solution like cloudflare help?
|
| By using Cloudflare workers, so your code is also
| distributed around the globe?
|
| > CDNs only really work for cacheable static content,
| don't they? I mean it's in the name.
|
            | JS files are also static content. Even if you don't use
            | code distribution like Cloudflare Workers, a simple CDN
            | can cache 90% of your site if not more. CSS, images,
| JS, HTML, you name it.
|
| > A blog or a news website certainly doesn't need a CDN.
|
| Actually, CDN is the most basic optimization for
| distributing heavy assets like videos and images, which
| news websites use way more than text. Why not use a CDN?
| grishka wrote:
| > Pinging tty.sdf.org which is a direct access shell
| service in USA from somewhere between Europe and Asia,
| from an academic network backbone roundtrips in ~190ms.
|
| Actually I get around 200 from Russia which is also
| "somewhere between Europe and Asia":
| round-trip min/avg/max/stddev =
| 196.504/197.581/199.833/1.360 ms
|
| > By using Cloudflare workers, so your code is also
| distributed around the globe?
|
              | Great, let's give that company _even more_ control.
              | That's sure gonna end well.
|
| > CSS, images, JS, HTML, you name it.
|
| It all gets loaded once and then cached in the browser.
| The initial load takes long regardless of whether there's
| a CDN. Oh, and many websites also use stuff from like 10
| different domains, which doesn't help this either.
|
| And, it doesn't matter whether a JS file loads in 50 ms
| or 300 ms, if it then takes 5 seconds to parse and start
| running.
|
| > Actually, CDN is the most basic optimization for
| distributing heavy assets like videos and images, which
| news websites use way more than text. Why not use a CDN?
|
| So put them on a separate domain and serve that from a
| CDN if you really care whether that stock photo no one
| notices loads in 500 ms instead of 2000. That doesn't
| explain much why anyone would put their main domain
| behind cloudflare.
| bayindirh wrote:
| > Great, let's give that company even more control.
| That's sure gonna end well.
|
| We have enough evil companies who invade our lives
                | through the platforms they develop. Cloudflare is not one of
| them. Using them is voluntary (by the service providers),
| and I think they're one of the more useful companies
| around.
|
| BTW, I'm not a web developer or Cloudflare employee. I
| have no skin in this stuff, however they build some cool
| stuff inside the Linux kernel, which is interesting from
| my PoV.
|
| > It all gets loaded once and then cached in the browser.
|
| Then cleared and/or invalidated by the user or browser's
                | logic itself due to a plethora of reasons.
|
| > The initial load takes long regardless of whether
| there's a CDN.
|
| Actually, no. A reasonably fast internet connection (>12
| Mbps we can say) can load a lot of things very very fast.
| The biggest overhead is DNS, even with CDNs. With a good
| local, network-wide DNSMasq installation, if the server
| is close, I can load big sites almost instantly.
|
| > And, it doesn't matter whether a JS file loads in 50 ms
| or 300 ms, if it then takes 5 seconds to parse and start
| running.
|
                | I think 5 seconds is a long time even for the old Netscape
                | Navigator's JS parser. You need to run something akin to
                | Skynet to parse a JS file for a straight 5 seconds. How's
                | that even possible?
|
| > So put them on a separate domain and serve that from a
| CDN if you really care whether that stock photo no one
| notices loads in 500 ms instead of 2000.
|
                | I don't know about you, but world news generally contains
                | live/new footage or fresh photos from the ground, not
                | stock photos; also, we humans are visual animals.
| Many people want to see the images first, read the text
| later.
|
| > That doesn't explain much why anyone would put their
| main domain behind cloudflare.
|
| Load balancing, DDoS protection, CDN, workers,
| Bot/Scraping protection, cost reduction, rate limiting,
| you name it. Even my DSL router implements some of the
| protections, to my surprise.
|
                | The Internet is not the same beast now compared to the
                | 90s/00s. I miss the simpler times, but alas.
| comex wrote:
| There's no need to guess based on the speed of light.
| Test it yourself:
|
| https://www.cloudping.info
|
| For me, the highest was 310ms round trip to Singapore, so
| higher than your estimate but not too bad.
|
| But this is completely beside the point. As far as I
| know, if you're using a CDN effectively (i.e. a large
| proportion of requests are hitting cache), it should be
| _cheaper_ than having all requests hit your server, not
            | more expensive. So even if you don't "need" a CDN, you
| might want one. This is orthogonal to the issue of bot
| traffic, which exists whether or not you use a CDN. If
| you want to use a CDN but don't mind the costs of bot
| traffic, you can configure CloudFlare to not show the
| CAPTCHAs, or use a different CDN.
| grishka wrote:
| 338 ms to Sydney is my worst. 234 ms to Singapore.
|
| AWS does offer a CDN, right? Somehow they do it without
| captchas and without me ever noticing. So I'm somewhat
              | justified in cursing at cloudflare because it's the only one
| actually announcing its presence by actively disrupting
| your browsing.
| ehutch79 wrote:
| Cloudfront, the aws cdn, is not equivalent to the part
| that's showing a captcha. Only when it would have hit
| your server do you see the captcha, because it's proxying
| it, not serving cached results.
| joshuamorton wrote:
| Of course, light doesn't travel at vacuum speed in fiber,
| it travels a bit slower, but it also bounces around the
| cable, so ends up traveling a significantly longer
| distance. Multiply by around 1.5 for more real world
| numbers (copper is similar), just measuring raw distance.
|
| And switching adds significant delay, especially when you
| get out to the edge.
|
| > If you mean REST API requests, the kind that trigger
| some code to dynamically generate a response, how would a
| CDN solution like cloudflare help? The request still
| needs to get to the server and the response still needs
| to come back, all the way, because that's where that code
| runs. CDNs only really work for cacheable static content,
| don't they? I mean it's in the name.
|
| For a simple example, imagine assets that are dynamically
| loaded based on feature detection in JS. All of the
| assets (js included) can be cached on the cdn.
| robocat wrote:
| I live in New Zealand, definitely a first world country
| with first world infrastructure. Ping times to US East or
| Europe are over 300ms right now from my home[1].
|
| My old business had users in countries around the world,
| and the assets were _highly_ optimised for speed. However
| adding CloudFlare (a) significantly sped up our service
| to clients, especially those in Asian countries, and (b)
| significantly improved reliability of connections because
| CloudFlare have their own dedicated network links between
| countries and /or optimised for reliability.
|
| [1] https://www.cloudping.info/
| [deleted]
| jjav wrote:
| > How do you mitigate ddos attacks and other bad actors
| hitting a page?
|
| Not sure what "bad actors hitting a page" even means. I host
| public info so people can see it, be it good or "bad" people.
| Let them see it.
|
| DDoS is different and can be devastating of course. Also,
| very rare. In decades hosting content (started my first
| hosting business in 1994) I've never experienced anything
      | remotely like a DDoS. I know it happens, but definitely very
| rare for most people. Driving tons of legitimate users away
      | with relentless captcha annoyances for the once-in-a-lifetime
| possibility of a DDoS is not a good tradeoff.
|
| If you're in a business that attracts DDoS like flies then
| deal with that, otherwise lay off the captchas.
| midev wrote:
| > Not sure what "bad actors hitting a page" even means
|
| I would refrain from commenting on this topic then.
|
| > I host public info so people can see it, be it good or
| "bad" people. Let them see it.
|
| If your site ever gets big enough, you'll understand.
| floatboth wrote:
      | For regular _public web pages_, serving the actual fucking
| page should not be more expensive than serving the captcha
| page! What the hell is a "bad actor" in relation to GET
      | requests to a _public page_? To a _public page_, all actors
| should be inherently neutral.
| ehutch79 wrote:
| There's a lot of potential side effects. Not every GET
| request retrieves a static asset. A list view with filters
| for instance. Either way, you could be dumping any
        | arbitrary data along with a request. Or just try fuzzing
        | parameters. Some pages might be poorly done GraphQL
        | endpoints, and you might find your db tied up. There are
        | MANY ways a legit GET request can cause issues, let alone
        | someone with bad intentions.
| ipaddr wrote:
| All users are hostile users and all users are preferred
| users. Define what your system will allow through rate
          | limits and caching. Assume users would destroy your site
          | if given the chance, because they will. If you are
          | exposing private data through GraphQL, configure it, or
          | drop the private data, or drop GraphQL and use a backend.
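Rate limiting of the kind suggested above is commonly implemented as a token bucket; here is a minimal single-client sketch (the rate and capacity are arbitrary example parameters):

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity` requests, refilling at `rate`/second."""

    def __init__(self, rate: float, capacity: float):
        self.rate, self.capacity = rate, capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=1, capacity=3)
results = [bucket.allow() for _ in range(5)]  # 5 back-to-back requests
assert results == [True, True, True, False, False]  # burst of 3, then throttled
```

In practice you would keep one bucket per client key (IP address, account, or API token) and answer rejected requests with HTTP 429.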
| ehutch79 wrote:
| I 100% agree you should be looking at all incoming
| traffic as hostile or at least potentially hostile. The
| other comments are contending that there's nothing a
| hostile user could do to a 'public' page.
|
| One of the things using cloudflare gets you is all those
| protections without having to know how to do them
| yourself. Which a lot of the developers don't know how to
| do.
|
| There's also something to be said for catching a lot of
| this at the network layer on a cluster of machines that
| can handle any incoming traffic that your one poor
| neglected vm can't.
| ipaddr wrote:
              | If you use cloudflare you give up the freedom not to force
              | captchas. If you avoid them you can choose where you want
| to captcha. When cloudflare goes down you go down
| unnecessarily.
|
| If worried about a ddos attack cloudflare or another
| provider might be a good choice. But adding ddos support
              | by default seems unnecessary. In my 20 years of running
| 100s of sites I haven't run into a situation where I need
| ddos support. The vast majority will never be the target.
              | Once in a while google or bing will ddos you, but using
| cloudflare to block that seems like overkill.
| ehutch79 wrote:
| There are tradeoffs, yes.
|
| We might weight those tradeoffs differently. That's ok.
| D-Nice wrote:
      | Why is each and every TOR (and sometimes VPN) user deemed a DoS
      | attack... It discriminates against users who value privacy by
| forcing hCaptcha on them by default. Worst of all... it could
| be a de-anonymization attack as well, hence why I as a
| regular TOR user, just exit the page immediately when that
| happens.
|
| For any of my pages that do happen to use Cloudflare, I am
| luckily able to disable this discrimination in the CP so
| kudos for that at least, but terrible defaults imo.
| sroussey wrote:
| From experience, traffic via Tor was always 99%+ fraud.
| tibiapejagala wrote:
| Well, if you keep throwing impossible captchas at them,
| no wonder that normal users just close the tab, but bots
| and fraudsters keep trying.
| jlokier wrote:
| You can conduct fraud by accessing public, read-only web
| pages? You can conduct fraud by searching on Google?
|
| Those are the two I find repeatedly blocked when
| accessing via Tor. The former by Cloudflare, the latter
| by Google.
|
          | I use Tor to look up phone numbers that have just called
| me, to decide whether it's a good idea to answer. Since I
| don't want to be personally associated with such numbers
| I prefer to search anonymously. But often it's impossible
| to get a result.
|
| Sometimes even spending 5 minutes solving captchas isn't
| enough. (I'd only spend that long to see if it's just an
| outlier. No, it's quite common.)
|
| This creates an immense pressure to tell various services
| exactly who is phoning me, which is a terrible attitude
| to privacy.
| ehutch79 wrote:
| Then don't use sites that are behind cloudflare?
|
| It's not your choice if the site owners/admins use
| cloudflare. It IS your choice not to use those sites.
| ehutch79 wrote:
| Because that's a not insignificant portion of traffic they
| see from tor and vpns?
|
        | tor has some absolutely valid and important use cases, but
| what percent of tor exit traffic is actually someone trying
| to keep their traffic anonymous from the eyes of an
| oppressive regime, and what percent are script kiddies, or
| someone hiding torrenting from their isp?
| AtNightWeCode wrote:
  | There are a lot of services in Sweden that require that you
  | provide real authentication by using something called BankId. Basically
| a personal digital id. This is the way to go. 100% secure
| validated users. If there was a function added to make the users
| anonymous to third party services it would be great.
|
  | I work with Cloudflare sites and it is clear that their current
| enterprise offerings are hard to tweak to solve attacks without
| spamming the users with captchas. The captchas are already too
| complicated for the average user so it is mostly turned off even
| though it has other consequences. I have to look into this new
| thing though.
|
  | As much as I like the idea of an open Internet to be used by
  | anyone from anywhere, it simply does not work today for a lot of
  | enterprises.
| poisonborz wrote:
    | Ah yeah, instantly zeroing anonymity for most users, while it
    | can still be abused by malicious actors; a great
    | worst-of-both-worlds solution.
| AtNightWeCode wrote:
| Not sure how you came to that conclusion from what I said.
| What I am saying is that in the long run it will be
| impossible to rely on companies like Cloudflare for security
| when it comes to users. Over time all services will for
| security reasons need to either directly or indirectly
| authenticate all their users. That does not mean that each
      | user's identity is provided to each consumer.
|
| The open Internet is already dead thanks to Cloudflare,
      | Akamai etc. A lot of European companies use these services
| to block China, Russia, TOR, VPN-services and so on.
| stickfigure wrote:
| Reading CF's blog announcement [1], this is really horrifying. It
| trains users to insert security keys and accept biometric
| identification requests when visiting random web pages, on random
| untrusted domains.
|
| This cannot possibly end well.
|
  | [1]: https://blog.cloudflare.com/introducing-cryptographic-attest...
| ehutch79 wrote:
    | Isn't part of the point that a phishing site wouldn't get the
    | same response as a legit site, and therefore it'd be useless to
    | do that, so this behavior is ok?
| stickfigure wrote:
| It's not about phishing, it's about getting the user to
| blindly accept security checks. If users are trained to
| insert their usb key / scan their fingerprint whenever they
| see a cloudflare page, bad actors can present a mockup of
| this page to exploit that reaction.
|
| Physical keys (and biometrics) work well because they are
| rarely called for, and the user knows they are doing
| something security sensitive. "This random page asked me to
| insert my security key" can't be healthy.
| ehutch79 wrote:
| But wouldn't the response the mockup gets only work for
| that page, not something they could pass through?
|
        | If any page can request any other page's response, that'd
        | make the whole system pointless.
| tialaramex wrote:
| No. Ignorance is a big problem. What makes Security Keys
| work well isn't that they are "rarely called for" but
| almost the opposite, they're so easy that you can add them
| with little friction all over the place. Tapping to sign
| into a remote server over SSH is no problem, it's scarcely
        | more effort than thumping "enter" on the command line is.
|
| What the user is doing is _not_ security sensitive. They
        | are, in fact, themselves, and that's all the Security Key
| is confirming. "Yup, still me".
|
| One of the easy ways fools trip themselves up here is that
| they think this is identifying information. But it isn't.
| "Yup, still me" doesn't identify anyone. The identity was
| already known to your interlocutor, which is why "Yup,
| still me" is enough.
|
| And that's what's so clever about the FIDO design. A
| Security Key has no idea who it "is", it just knows it's
| still the same as before. If you're already authenticated
| as Jim Smith, you can enroll a security key "Yup, still me"
| -> the Relying Party stores the information, and then you
| can later sign in using it to verify your identity, "I'm
| Jim Smith". "Is this still you Jim Smith?" "Yup, still me".
|
| So that's why this doesn't help bad guys. "Are you still
| er... you?" "Yup, still me". Completely useless. Of course
| you are, that doesn't help them at all.
| ryanlol wrote:
| This seems to assume that existing captchas are much better than
| they actually are.
| sdfhbdf wrote:
  | The idea of replacing CAPTCHA with FIDO doesn't seem sound;
  | isn't it trivial to imitate with DevTools in Chrome or some
  | other software?
|
| https://developer.chrome.com/docs/devtools/webauthn/
| BillinghamJ wrote:
| The attestation process is capable of cryptographically
| checking the device manufacturer etc
|
| (although practically I'm unsure as to whether that's really a
| good idea or would work well)
| arsome wrote:
    | I believe the idea here is that you need to buy actual FIDO
    | U2F keys, which could then be revoked on a per-key basis if
    | you're caught abusing them; as they're signed by a 3rd party,
    | they can't just be emulated.
|
| Meaning you need to buy more. Makes it expensive at least.
| Nextgrid wrote:
| How can you revoke on a per-key basis without at the same
| time being able to track keys uniquely?
| arsome wrote:
| I'm not terribly familiar with U2F itself, but I assume the
| site has a way to identify you're using the right key that
| can be reused for this purpose?
| tialaramex wrote:
| When you enroll a token with a site, the token mints a
| random new key pair and sends the site an ID and the
| public key, signed with the private key.
|
| The site records the ID and public key.
|
| When you return, to confirm it's really you, the site
| sends one or more IDs you've enrolled and says, sign this
| fresh random data with one of the associated private
| keys.
|
| Your tokens can look at the site and an ID and decide if
| they made that ID for that site, if they did they sign
| the message with the private key, proving you are still
| you. If they didn't make it, they pass and maybe you own
| a different token that can sign, or maybe you show them a
| different ID they do recognise.
|
| To reuse this capability for tracking, the site would
| need to _guess_ who you are first. "I guess this is
| arsome, they have this U2F key". But if they can guess
| who you are, they already don't need such tracking.
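The enrol-and-assert flow tialaramex describes can be sketched as a toy simulation. This is an editorial illustration only, not real FIDO code: an HMAC keyed by a master secret that never leaves the token stands in for the token's real asymmetric key pair and signature.

```python
import hmac
import hashlib
import os

# Toy model of the WebAuthn enrol/assert flow. Real tokens mint an
# asymmetric key pair and sign with the private key; here an HMAC
# over a token-held master secret plays that role.

class Token:
    def __init__(self):
        self._master = os.urandom(32)  # never leaves the device

    def _derive(self, site, cred_id):
        return hmac.new(self._master, site.encode() + cred_id,
                        hashlib.sha256).digest()

    def enroll(self, site):
        # Mint a fresh credential ID for this site and derive its key.
        cred_id = os.urandom(16)
        return cred_id, self._derive(site, cred_id)  # "public key" analogue

    def sign(self, site, cred_id, challenge):
        # Re-derive the per-credential key. A cred_id this token did not
        # mint derives a key the site has never seen, so verification fails.
        return hmac.new(self._derive(site, cred_id), challenge,
                        hashlib.sha256).digest()

class Site:
    def __init__(self, name):
        self.name = name
        self.creds = {}  # cred_id -> key, recorded at enrollment

    def register(self, token):
        cred_id, key = token.enroll(self.name)
        self.creds[cred_id] = key

    def verify(self, token):
        # Fresh random challenge each time: "is this still you?"
        challenge = os.urandom(32)
        for cred_id, key in self.creds.items():
            expected = hmac.new(key, challenge, hashlib.sha256).digest()
            sig = token.sign(self.name, cred_id, challenge)
            if hmac.compare_digest(sig, expected):
                return True
        return False

site = Site("example.com")
alice, stranger = Token(), Token()
site.register(alice)
print(site.verify(alice))     # True: "yup, still me"
print(site.verify(stranger))  # False: a different token can't sign
```

Note how this mirrors the tracking point above: the site can only ask "is this the token I enrolled?" for credential IDs it already holds, so it must already guess who you are before the key tells it anything.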
| y7 wrote:
| Yup, you can't. Keys are perfectly trackable by Cloudflare,
| but they promise they won't do this.
|
| Edit: I was wrong. Cloudflare claims they could track
| people, but it would require tracking via cookies. [1] The
| hardware security keys have an "attestation key pair" that
| is shared among all units in one production batch (which
| contains at least 100K units). [2]
|
| 1: https://blog.cloudflare.com/introducing-cryptographic-
| attest...
|
| 2: https://www.w3.org/TR/webauthn-2/#sctn-attestation-
| privacy
| dboreham wrote:
| No, they can't do this. It's the U2F key vendor that
| promises not to release a device-unique key to someone
| like CF.
| y7 wrote:
| Thanks, I stand corrected.
| dboreham wrote:
| You can't. The device attestation protocol specifically
| excludes the ability to uniquely identify devices (the key
| has to be re-used in at least 99999 other devices).
| sdfhbdf wrote:
| But I'm specifically asking about software. I know Touch ID
| can be used with WebAuthn, and I also see the WebAuthn
| debugger in Chrome's DevTools. It just seems easy to fool if
| I regenerate a key on every visit, unless there is an
| additional step I don't get.
| arsome wrote:
| You can't generate an attested key with the devtools. It
| won't be signed by a 3rd party that CloudFlare has
| approved.
| sdfhbdf wrote:
| Ok, you're right. I did miss a step!
| ibeckermayer wrote:
| Could this be solved (in large part) if key makers like YubiKey
| did I.D. verification on purchase? Then, to do the type of
| "farming" that's mentioned in this article, you'd need to
| organize a large group of people to all buy the keys rather than
| just submit a bulk order to Alibaba.
|
| Of course this idea raises privacy and authority concerns,
| similar to certificate authorities.
| Animats wrote:
| Once they have your ID info, they'll later change the terms to
| sell it to advertisers. Are they contractually committing to
| never doing that? No. So they will.
| CogitoCogito wrote:
| Even if they are contractually committed not to sell your info,
| that still might not save you:
|
| "Yesterday, the bankruptcy court approved the sale over the
| objections of several parties, including the Federal Trade
| Commission (FTC) and third party manufacturers Apple and AT&T
| who sold products to the bankrupt retailers.
|
| ...
|
| The FTC's objection was made to the court-appointed consumer
| privacy ombudsman in the RadioShack bankruptcy. Specifically,
| the FTC's letter alleged the sale of personal information
| constitutes a deceptive practice because in its privacy policy,
| RadioShack promised never to share the customer's personal
| information with third parties."
|
| https://www.jdsupra.com/legalnews/radioshack-bankruptcy-cour...
|
| In that case the judge allowed the sale of the information in
| contradiction to its commitments.
| dragonwriter wrote:
| > In that case the judge allowed the sale of the information
| in contradiction to its commitments.
|
| Note that bankruptcy _always_ allows things in contradiction
| to commitments; bankruptcy is all about balancing which
| commitments will not be fulfilled, and by how much, when a
| party is no longer capable of fulfilling all of its
| commitments.
|
| If you don't like _particular_ commitments being voided in
| bankruptcy, you want legislation specifically protecting them
| so that there is a clear legal barrier to voiding those
| specific kinds of obligations.
| ignoramous wrote:
| The article here ignores the view of the _Web_ that Cloudflare
| has, which coupled with "something you have" (the U2F keys)
| makes for a compelling alternative to CAPTCHAs.
|
| Sure, bots can automate keys, but those keys could also be banned
| just as well. Cloudflare only needs to know which ones are the
| good keys and track those forever. This means, for every non-bot
| out there, the CAPTCHAs are as good as gone.
|
| The genius of Cloudflare here is that they (ab)use WebAuthn,
| which can also be implemented on Android and iOS natively. Before
| you know it, Cloudflare has built an identity platform that,
| while it may not be helpful for KYC, is plenty useful for websites
| Cloudflare fronts. Imagine never having to bother with user
| registration and authentication and bots... that's the next
| extension I see to all of this.
| weird-eye-issue wrote:
| They already have that but it's just for internal teams. I used
| it recently to lockdown Wordpress installations
| wfleming wrote:
| Each key is associated with a batch of devices, though. If you
| ban a key, you risk banning a bunch of legitimate users.
|
| It's an interesting trade-off. It seems like batch keys for
| device attestation were designed to help protect individual
| privacy (good), but if you can't ban a key without potentially
| causing a lot of splash damage when you detect a bad actor,
| that seems like a very limiting choice.
| tialaramex wrote:
| The intent of attestation is that a business could decide,
| OK, we think FooCorp are doing a proper job and we trust
| their FIDO tokens, but we don't like all these dozens of
| cheap alternatives. So for our corporate site we'll require
| FooCorp tokens, and we'll just issue every employee a FooCorp
| token on our dime.
|
| _Maybe_ it could make sense for a bank to do this, sending
| account holders a special custom Security Key with the bank
| 's branding on it. I personally think that's stupid, but I
| can imagine it appealing to bank executives and it's not so
| stupid as to be worse than SMS or TOTP 2FA that banks do
| today.
|
| But it clearly isn't relevant for no-cost services like
| Facebook or Gmail, and so sure enough you can just tell them
| you don't want to give them attestation and they work anyway
| (I don't know if either of them ask, I just reflexively deny
| attestation if it's requested).
|
| It isn't intended to be useful for trying to do stuff like
| Cloudflare are attempting here. Which doesn't mean Cloudflare
| can't succeed in their goals, but in the FIDO threat models a
| "bad actor" would be a whole _vendor_ , maybe some outfit is
| using fixed long term secret keys inside their Security Key
| products and they just sell the NSA a list of those keys -
| you might decide to just refuse all the products from this
| vendor. Whereas for Cloudflare the "bad actor" they're
| worried about just buys a half dozen of whatever was cheapest
| from eBay and then plugs them into a Raspberry Pi.
|
| Or, do they? That's the gamble I think Cloudflare is taking.
| Maybe the value of defeating this intervention is so low that
| bad guys will not, in fact, build a Raspberry Pi Security Key
| clicker proxy to make their thing work.
| ignoramous wrote:
| > _Each key is associated with a batch of devices, though. If
| you ban a key, you risk banning a bunch of legitimate users._
|
| You're right. I meant Cloudflare could ban the generated
| public-key and not the device's public-key itself. Besides,
| they could also mark the batch as being taken over by bots
| and increase the level on challenges issued to the batch.
| Note, though, that a single secure module can only generate /
| store so many public-keys. For instance, _Yubi Key 5_ supports
| up to 25 keys. Those could be reset to generate a newer set of
| 25, but repeated registration of a number of keys from a
| single batch is bound to trigger some statistical anomalies.
|
| From Cloudflare's blog about _Cryptographic attestation of
| personhood_ https://archive.is/4EbER
|
| > For our challenge, we leverage the WebAuthn registration
| process. It has been designed to perform multiple
| authentications, which we do not have a use for. Therefore,
| we do assign the same constant value to the required username
| field. It protects users from deanonymization.
|
| Currently, the user-name field is constant for all users. I
| wanted to point out that they could amend the registration
| ceremony to register any user in particular.
| TimWolla wrote:
| > For instance, Yubi Key 5 supports up to 25 keys
|
| This is for resident keys. A YubiKey 5 supports an infinite
| number of non-resident WebAuthn keys, because the returned
| key handle will simply be the private key encrypted with a
| master key stored on the YubiKey. For authentication the
| service will send the stored key handle back to the YubiKey
| which then can decrypt it and use the decrypted private key
| to sign the challenge.
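The key-handle trick TimWolla describes can be sketched in a few lines. This is an illustrative toy, not the actual YubiKey construction: XOR with an HMAC keystream stands in for a real authenticated cipher and must not be used as real crypto.

```python
import hmac
import hashlib
import os

# Toy sketch of non-resident credentials: the token stores only one
# master key and hands the site an encrypted ("wrapped") copy of each
# per-site private key, so storage never runs out.

MASTER = os.urandom(32)  # lives only inside the token

def _stream(nonce, n):
    # Deterministic keystream derived from the master key and a nonce.
    out, counter = b"", 0
    while len(out) < n:
        out += hmac.new(MASTER, nonce + counter.to_bytes(4, "big"),
                        hashlib.sha256).digest()
        counter += 1
    return out[:n]

def wrap(private_key):
    # Token -> site: an opaque handle the site stores on our behalf.
    nonce = os.urandom(16)
    ct = bytes(a ^ b for a, b in zip(private_key,
                                     _stream(nonce, len(private_key))))
    tag = hmac.new(MASTER, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag

def unwrap(handle):
    # Site -> token at sign time: recover the private key, or reject
    # handles this token never made.
    nonce, ct, tag = handle[:16], handle[16:-32], handle[-32:]
    if not hmac.compare_digest(tag, hmac.new(MASTER, nonce + ct,
                                             hashlib.sha256).digest()):
        raise ValueError("handle was not made by this token")
    return bytes(a ^ b for a, b in zip(ct, _stream(nonce, len(ct))))

priv = os.urandom(32)          # a fresh per-site private key
handle = wrap(priv)            # site stores this; token stores nothing new
assert unwrap(handle) == priv  # unlimited credentials, constant storage
```

The design point is that the token's only persistent state is the master key, which is why the number of non-resident credentials is effectively unbounded.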
| [deleted]
| ignoramous wrote:
| TIL.
|
| Envelope encryption. Neat. Can WebAuthn keys be (made) a
| resident key? If so, is that preferred instead?
|
| Conversely, what use case is there for resident keys in
| context of WebAuthn? For example, if there are multiple
| master keys, can I switch between them per browser /
| website (assuming the master key itself is a resident key
| and not burnt into the element)? Thanks.
| robalfonso wrote:
| I agree. I saw the author mention that for $25k you could have
| 1000 keys. I immediately thought that is not nearly enough.
| Given the sheer volume they handle, they would start putting a
| picture together very quickly of ip/key/sites. There is
| nowhere near enough uniqueness.
|
| I also thought the idea of the key exchange being fast was a
| red herring. That's a bad thing. If I'm them I'm paying
| attention to how long from prompt to exchange it takes a human
| to touch the button. On my own setup my key is on my laptop,
| which is in a dock. I must stand up and tap it over my monitor.
| It's just a few seconds but it's a) consistent in timing b) not
| < 1s. Imagine the aggregate timing data they have.
|
| Overall they make some good points if you are a teeny tiny
| player and completely ignore the scale Cloudflare is operating
| at.
| kylehotchkiss wrote:
| I'd rather take these tradeoffs than doing 5 steps of Recaptcha
| because I'm using a VPN to work, which as Cloudflares
| announcement said, is very localized to North America and likely
| extra complicated for those outside the region.
|
| In theory, couldn't Yubikey begin reducing batch sizes to 1,000
| and Cloudflare mark specific batch numbers as requiring one extra
| step to verify? The vast majority of Yubikey sales will be for
| real people in any case.
| kylehotchkiss wrote:
| And if Yubikey could reduce batch sizes - could they require
| bulk non-wholesale orders to retain the same batch ID to reduce
| likelihood of abuse?
| SXX wrote:
| This won't work because guess what? Bad actors have money and
| means to buy as many devices as needed through individuals.
|
| Most of the abuse that CloudFlare protects from is also
| usually illegal. And those taking the risk of doing something
| illegal usually do it because it's highly profitable, so
| making some authentication devices a bit more expensive
| won't make any difference.
| livre wrote:
| Wouldn't reducing batch sizes make privacy even more of a
| problem? Now instead of a 1/100000 chance of the user being the
| same person on another website there would be a 1/1000 chance.
| makomk wrote:
| Yeah, except that because the other 999 users probably
| wouldn't be using Tor to access the same websites, in
| practice this would be pretty much guaranteed to give a
| highly accurate, persistent tracking identifier.
| md_ wrote:
| The batch size requirement is imposed by the FIDO spec, to
| ensure that batch IDs are not so high entropy as to pose a
| privacy problem.
|
| "In this Full Basic Attestation model, a large number of
| authenticators must share the same Attestation certificate and
| Attestation Private Key in order to provide non-linkability
| (see Protocol Core Design Considerations). Authenticators can
| only be identified on a production batch level or an AAID level
| by their Attestation Certificate, and not individually. A large
| number of authenticators sharing the same Attestation
| Certificate provides better privacy, but also makes the related
| private key a more attractive attack target."
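The batch-level property the quoted spec describes can be simulated in a few lines (again an editorial toy, with HMAC standing in for the real attestation signature): every unit in a batch carries the same attestation key, so a verifier can place a device in a batch but never distinguish units within it, and "banning the key" bans the whole batch.

```python
import hmac
import hashlib
import os

# Toy model of Full Basic Attestation: many authenticators share one
# attestation key per production batch. Batch names and sizes here are
# made up for illustration.

BATCH_KEYS = {f"batch-{i}": os.urandom(32) for i in range(3)}

def make_device(batch_id):
    # Every unit ships with its batch's shared attestation key.
    return {"batch": batch_id, "att_key": BATCH_KEYS[batch_id]}

def attest(device, challenge):
    sig = hmac.new(device["att_key"], challenge, hashlib.sha256).digest()
    return device["batch"], sig

def verify(batch_id, sig, challenge):
    expected = hmac.new(BATCH_KEYS[batch_id], challenge,
                        hashlib.sha256).digest()
    return hmac.compare_digest(sig, expected)

challenge = os.urandom(32)
dev_a = make_device("batch-0")
dev_b = make_device("batch-0")

# Both devices produce attestations the verifier accepts...
assert verify(*attest(dev_a, challenge), challenge)
assert verify(*attest(dev_b, challenge), challenge)

# ...and for the same challenge their attestations are identical:
# the verifier learns the batch, never the individual unit, which is
# exactly why revoking one "key" hits every unit in the batch.
assert attest(dev_a, challenge) == attest(dev_b, challenge)
```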
| rvz wrote:
| How does this so-called 'CAPTCHA replacement' idea compare to
| Sign In With Apple, which also does not use any CAPTCHAs and
| aims to prevent bot sign-ups?
| md_ wrote:
| Apple Sign-In is just an OpenID federated login; these don't
| inherently provide any anti-automation or rate limiting; they
| just push the problem to the Identity Provider.
|
| IdPs like Apple/Google/Microsoft might do a fine job of
| limiting you to "one account per $unit-of-hardware"; Apple in
| particular can do this via iOS attestation. But then you're
| limited to either their heuristics (in the case of MSFT/Google)
| or their hardware (in the case of Apple).
|
| Ultimately this is apples-to-oranges, though, since Cloudflare
| is not offering an IdP product but simply an anti-automation
| solution. If you use federated auth, you're getting (and giving
| up) a lot of other stuff beyond just anti-automation.
| arsome wrote:
| There's always CAPTCHA bypasses if you're willing to pay,
| there've been sites operating for decades that will take a
| captcha URL and spit out the appropriate response by just feeding
| it to humans. This is just a different way to make you pay - and
| arguably to something of less ill-repute, buying more U2F keys
| once yours get banned.
|
| This provides effective rate limiting and you can still get every
| key you automate banned very easily.
| [deleted]
| fierro wrote:
| this is what the blog post fails to address. Is the FIDO2
| hardware key approach abusable? Yes. More so than regular image
| based CAPTCHA? No, like you said, there are well established
| services for "mechanical-turking" away the problem.
| garaetjjte wrote:
| If I understand it correctly, you cannot ban attestation key
| without potentially banning lots of legitimate users.
| WORMS_EAT_WORMS wrote:
| Second this.
|
| For a major sporting event, one of our sites was heavily
| targeted by "free TV streaming services" self promoting their
| stuff.
|
| No amount of Google CAPTCHA or Cloudflare could stop it while
| keeping it online. Never seen anything like it in my life.
| apple4ever wrote:
| That makes me so frustrated.
|
| I HATE CAPTCHAs with a passion. They are everywhere and
| constantly slow me down. And as you mentioned, they are likely
| not helpful in stopping bots.
| temp667 wrote:
| I almost never see a captcha. On a static fiber IP - 1GB.
| Use chrome. Not sure if that matters.
| thejosh wrote:
| Cloudflare is both a great thing and a terrible thing that has
| happened to the internet in recent years.
|
| Great in that they have a fantastic UI to add your site in,
| basically shielding the average user from attacks.
|
| Bad from a standpoint of that now only Google, Bing, and maybe
| other big search engines have the capabilities to actually crawl
| the internet now.
|
| I don't see us getting a massive innovation in search on the
| internet now that Google has such a massive foothold, and
| companies like Cloudflare stop innovation from happening.
| Beaver117 wrote:
| Maybe I'm cynical, but I don't see any innovation to be done in
| search. Google results have become much less useful over the
| past few years. If they cannot solve search with basically
| unlimited resources, how is a tiny company going to?
|
| 1. Filtering ever increasing trillions of spam/clickbait pages
|
| 2. Figuring out which results are useful information vs
| corporates trying to sell something.
|
| Those problems are not solvable by a couple of guys in a garage.
| spiderfarmer wrote:
| Maybe not, but the garage guys should look into becoming the
| best search engine in a niche and expand from there.
| janeroe wrote:
| Maybe I'm cynical, but I don't see any innovation to be done
| in {AREANAME}. {PRODUCTNAME} has become much less useful over
| the past few years. If {MONOPOLISTNAME} cannot solve
| {AREANAME} with basically unlimited resources, how is a tiny
| company going to?
|
| Nice pasta, can't believe someone would use it unironically.
|
| Maybe I'm cynical, but I don't see any progress to be done in
| government transparency. USA's government has become way more
| overreaching and much less transparent over the decades. If the
| USA cannot solve this issue, how is a small country going to?
| JeremyBanks wrote:
| It's not clear to me that Google still gives a fuck about
| solving search / organizing the world's information and making
| it useful. The mess they've incentivized the web to become is
| very profitable for them.
| SXX wrote:
| > If they cannot solve search with basically unlimited
| resources, how is a tiny company going to?
|
| Reminder: Google Is An Ad Company. Are you sure they actually
| want to solve search? Their primary interest is to be that
| corporation selling you something.
| jgrahamc wrote:
| _I don 't see us getting a massive innovation in search on the
| internet now that Google has such a massive foothold, and
| companies like Cloudflare stop innovation from happening._
|
| How are we "stopping search innovation"?
| thejosh wrote:
| Hi!
|
| Thanks for taking the time to reply.
|
| You mentioned about "legit" crawlers, what defines a "legit"
| crawler in the eyes of Cloudflare, and what happens when
| Cloudflare suddenly decides it does not want to honour that
| "agreement"? What happens if/when Cloudflare is sold, or the
| contact who greenlit these smaller "legit" crawlers moves on
| and decides that it no longer agrees with said website
| anymore?
|
| Is a price comparison site a "legit" crawler? What defines a
| "bot" vs a "crawler" in the eyes of Cloudflare?
|
| Would you need to notify your customers that you now also
| allow additional crawlers access to their sites, or would
| they need to opt into it via the Cloudflare dashboard? What
| happens when you have a falling out with my said company (it
| happens, relationships sour) and suddenly we can't make
| contact, then suddenly customers websites aren't being
| scraped because we're bots?
| dannyw wrote:
| There are a lot of hypotheticals here. I think you'll
| convince CloudFlare, and their customers, if you can name
| names and mention specific examples.
|
| If you are a price comparison site getting blocked by
| cloudflare, site owners may be losing sales, and that's
| good feedback.
| jopsen wrote:
| Or site owners may actively want to block a price
| comparison site..
|
| Depending on the industry, etc..
| thejosh wrote:
| Agreed on this point, and most companies who would want
| to do these sorts of things would restrict it even if
| Cloudflare didn't exist.
|
| I really can't think of a good solution. But that's the
| tricky position Cloudflare is in - how does it balance
| everything.
| hysan wrote:
| Price comparison example: https://shucks.top/ sometimes
| gets blocked by cloudflare. Most recent was getting
| blocked from checking B&H.
| surround wrote:
| I trust that cloudflare will act responsibly in allowing
| small search engines through, but I really, _really_ would
| rather _not_ have to trust cloudflare. I don 't believe that
| any organization can or will always act responsibly, which is
| why it's concerning that cloudflare controls so much of the
| internet.
| EvanAnderson wrote:
| Yes. This. I believe that John Graham-Cumming is genuine in
| his statements in this thread re: "contact me if you're
| running afoul of our controls", for example. If he leaves
| Cloudflare, Cloudflare "turns evil", etc, then that's all
| out the window.
|
| Individual companies having so much power gives me the
| willies.
| pdimitar wrote:
| By being gatekeepers on which website crawling is okay and
| which is not.
|
| No such filters should exist. Is it really _that_ awfully bad
| without anything but basic filters (ban an IP for flooding)?
| Are there, like, operations that try to spam every single
| Cloudflare-hosted website 24/7?
|
| Legitimately curious if your anti-bot measures come from
| actual bad experience with the internet or is it just a
| liability limitation move? (Namely to reduce potential suing
| surface by angry data owners and/or three-letter agencies.)
|
| ---
|
| Basically, if I am experimenting with a basic crawling
| program and I hit websites A and B 20 times each in a space
| of one hour, is that really deserving of a captcha or extra
| auth methods?
|
| Not flaming but I am really curious. Do you have any data and
| rationale posted somewhere that go into deeper detail about
| why Cloudflare's bot detection is how it is?
| fivre wrote:
| Cloudflare's customers request and then enable those
| features. Cloudflare itself doesn't give a damn about that
| traffic; they have bandwidth to spare. They will, however,
| happily sell tools to people that do care.
|
| That isn't to say that the customers are savvy and have a
| good understanding of different types of automated traffic
| and which automated traffic is harmful and which is benign.
| Many have a quite naive understanding that doesn't extend
| beyond "bots = bad, unless it's Google" and dial protection
| settings to the max for no good reason.
| o-__-o wrote:
| How does my startup crawl Cloudflare sites without paying a
| hefty fee to Cloudflare?
| jgrahamc wrote:
| https://news.ycombinator.com/item?id=27153635
| [deleted]
| o-__-o wrote:
| This will scale wonderfully!
| jgrahamc wrote:
| No, what scales is us making our DDoS and bot detection
| not disrupt the crawling of legit search engines that
| respect robots.txt, don't crawl at ridiculous speeds,
| don't do dumb stuff like pretend they are the Googlebot.
| We have teams who work on that. You can read more here:
| https://blog.cloudflare.com/tag/bots/
|
| But let's suppose someone is building a new cool search
| engine and our ML stuff is blocking them. Then... contact
| us/me.
| timlardner wrote:
| That doesn't sound unreasonable. Out of interest, what
| would you consider a ridiculous speed to be crawling at?
| LinuxBender wrote:
| I can't speak for Cloudflare, but crawling speed should
| be dictated by the site owner via the robots.txt crawl-
| delay. [1] A site owner could also rate-limit
| unauthenticated requests by IP _via the cloudflare
| header_ using a 429 _too many requests_ error page.
|
| [1] - https://en.wikipedia.org/wiki/Robots_exclusion_stan
| dard#Craw...
| o-__-o wrote:
| This here is the problem. It's a new era: no one wants to
| be RFC-compliant; they just go behind a service and the
| problem is "solved".
|
| So no problem, time to move on; web search is no longer
| exciting.
| o-__-o wrote:
| So for my startup to crawl sites I must now adhere to
| Cloudflare's Requirements of the Web(TM) or reach out to
| an individual engineer, who may leave at any moment. Gotcha.
|
| (but Google is allowed because Google was first to
| market)
| midev wrote:
| Why would you possibly think you can do whatever you want
| to someone else's site?
|
| Yes, you must adhere to the controls that site
| administrators put in place, like Cloudflare.... You
| don't get to blast my site with requests, just because
| you want to...
| o-__-o wrote:
| (a) Who said I was blasting your site with requests?
| Cloudflare stops much more than just blasts
|
| (b) But you're a-ok with Google doing this. Gated
| communities aren't really good for anybody but I see what
| you are saying.
| midev wrote:
| Gated communities are great. They lower the risk of crime
| significantly: https://www.sciencedaily.com/releases/2013
| /03/130320115113.h...
|
| The same is true online. Apple's walled garden has kept
| hundreds of millions of people safe on their device. It's
| why iOS malware isn't a thing.
|
| > Cloudflare stops much more than just blasts
|
| Exactly. There's even more benefit to Cloudflare than
| just DDoS. Captcha's for stopping credential stuffing,
| for example.
| 77pt77 wrote:
| It seems to be by design.
| navanchauhan wrote:
| Crawling a Cloudflare-powered website is basically impossible
| without resorting to some bodges.
|
| How can you expect someone to crawl a bunch of websites if
| they are actively blocked from accessing it? Now, you might
| say users can whitelist bots in their robots.txt file but
| then again will the person creating the engine individually
| ask companies to allow them to crawl?
|
| Also, slightly unrelated but Cloudflare protected websites
| are almost impossible to access via tor, the captcha never
| succeeds.
| jgrahamc wrote:
| If you are building a search engine and getting blocked you
| can always contact me and I'll make sure that the teams
| that work on bot detection and DDoS are aware. We would
| like to know because we should _not_ be blocking a legit
| crawler like this.
| throwaway3699 wrote:
| For every developer that sees this message, a few dozen
| will have given up.
| new_guy wrote:
| Exactly this. Cloudflare actively blocks legit crawlers.
| It shouldn't be dependent on seeing some random hn
| comment from some random at cloudflare to get that fixed.
| PaulHoule wrote:
| What makes a web crawler "legit?"
|
| When I had a site that had millions of pages, I found
| that sites like Baidu would crawl my site as often, if
| not more often than Google.
|
| I already felt the relationship with Google was
| parasitic, but I looked through my logs and never found a
| single hit that came from Baidu and many of the other
| search engines that would overload my site.
|
| I was looking at a substantial part of the site running
| costs going to supporting web crawlers that were not
| doing anything (1) to help me, or (2) to help end users
| (if they don't want to send Chinese users to an English-
| speaking web site, why crawl the site?)
|
| So like it or not I am inclined to only allow Google and
| Bing in the robots.txt because Google is the only site
| that sends a significant amount of traffic and because
| Bing sends some, and Google needs some competition.
|
| There are web crawler behaviors that are annoying:
| harvesting email addresses, overloading your site, etc.
| But how do you know who is doing something wrong with the
| data and who is just collecting it to do nothing with it?
| (Probably 95% of web crawling, excluding Google.)
| acdha wrote:
| > So like it or not I am inclined to only allow Google
| and Bing in the robots.txt because Google is the only
| site that sends a significant amount of traffic and
| because Bing sends some, and Google needs some
| competition.
|
| This sounds like you're onto a reasonable "legit" factor:
| does the crawler honor robots.txt? Baidu would be legit
| because they don't lie about their identity and if you
| put a rule in your robots.txt file they'll honor it.
| oefrha wrote:
| Say I'm interested in building a small scale domain-
| specific search engine and only just started development.
| There's no prototype yet and may never be. In this
| situation, how do you determine it's a legit crawler?
|
| And what about crawlers with even more limited scopes
| (targeting only a handful of sites) that they can't
| possibly be called search engines? Are they ever
| considered legit?
| jgrahamc wrote:
| Be a good netizen? Respect robots.txt. Don't lie in your
| User-Agent. Don't crawl at a ridiculous rate. All those
| are a good starting point.
| joepie91_ wrote:
| Do you not realize the more fundamental problem with you,
| as a company, essentially being the one who gatekeeps
| crawler access to the web?
| dannyw wrote:
| I use cloudflare out of my free will because there's
| malicious traffic out there, and I have enough control
| over everything.
|
| They're only a gatekeeper because sites voluntarily enter
| into commercial agreements with them. There's no coercion
| or manipulation like Google AMP.
| midev wrote:
| Customers pay for this as a feature. Why would they feel
| it's a fundamental problem? There's nothing that says
| admins need to let you crawl their site.
| vidarh wrote:
| If people intend this to happen, sure. But how many
| people who put their sites behind Cloudflare are aware
| that this might be a side effect?
| midev wrote:
| I would wager that most people that purchase Cloudflare
| are probably aware of the features it offers
| vidarh wrote:
| Note the distinction. I'd wager that the vast majority of
| sites behind Cloudflare are not paying customers, and
| have not paid much attention beyond "hides my server IP
| slightly and stops DDOS's", without having thought more -
| or at all - about the wider implications.
| foobiekr wrote:
| What is a "ridiculous rate"? Where is it documented?
| oefrha wrote:
| I think the problem is some IPs just straight-up always
| get CAPTCHAs from Cloudflare even if one is a good
| netizen, respects robots.txt, doesn't crawl at a ridiculous
| rate, and doesn't lie in the user agent. One reason is shared
| IP, which disproportionally affects people from third
| world countries as their ISPs don't have enough IPv4 for
| everyone; but it also happened mysteriously to at least
| one dedicated IP I used in the past. Your confrontational
| tone is rather unfortunate, and the problem of course is
| that you don't guarantee anything even if the user has
| done nothing wrong, as is manifest from the choice of the
| phrase "starting point".
| ehutch79 wrote:
| Then the problem has nothing to do with your crawling.
| Xamayon wrote:
| Even then, I've run into issues with scraping several
| sites for the reverse image search engine I operate.
| Luckily, in most cases I have been able to get in touch
| with the people running those sites to get a rule added
| for my IPs to allow them through. That's not scalable
| though, and limits where I can scrape/crawl from. Even
| something as simple as checking a site for updates every
| hour or two tends to get blocked after a few times. TBH,
| one of the only things I have found which helps is lying
| in the user agent and copying CF cookies. Luckily, I
| haven't had to play with that for a few months due to
| whitelisting, so I'm not sure if it would still help. Things
| change rapidly.
| jmg03 wrote:
| What's the best way to contact you?
| jgrahamc wrote:
| jgc @ cloudflare
| freshair wrote:
| > _Also, slightly unrelated but Cloudflare protected
| websites are almost impossible to access via tor, the
| captcha never succeeds._
|
| Yes, I've never understood why it's seemingly so important
| to CAPTCHA me before serving me less than 100kb of read
| only plain jane HTML. What sort of "attack" is this
| stopping? I'm pretty sure the CAPTCHA itself is bigger than
| half the sites it blocks me from reading.
| SXX wrote:
| For instance there is no way for distributed search engines
| to work with CloudFlare. No, "contact me and we'll help" is
| not always a solution.
| jgrahamc wrote:
| Please explain the problem (here or via email to me).
| catillac wrote:
| I'm really sorry, but you appear to be the CTO of
| Cloudflare, which makes your not knowing the ins and outs
| of the problem already and basic questioning of it seem
| like sealioning.[1]
|
| [1] https://en.m.wikipedia.org/wiki/Sealioning
| jgrahamc wrote:
| I do not understand what the parent means by a
| "distributed search engine" and I do not know what
| problem they are facing.
| viraptor wrote:
| https://yacy.net/ for example. Each interested node does
| indexing and serving some chunk of the results.
|
| Or in practice - each node quickly runs into a CloudFlare
| captcha preventing it from indexing content for a few
| hours/days. Since CF fronts a lot of the useful internet
| these days, it means it's effectively working against
| distributed indexing with its current captcha solution.
| jgrahamc wrote:
| Thanks. I'll bring this to the attention of the bots and
| DDoS teams.
| jimktrains2 wrote:
| Yacy is 18 years old and not exactly obscure. If your
| bots team is unaware of it, it's because they've chosen
| to be ignorant of it.
| joepie91_ wrote:
| A search engine which is not run centrally by one
| organization on infrastructure in a known network, but
| rather something like YaCy where individual users run
| crawler nodes on networks that vary over time.
|
| Which makes "contact us for an exception" a no-go, as the
| relevant source IPs will constantly be changing.
| vntok wrote:
| Yeah, I certainly don't want those crawlers anywhere near
| my servers. Block by default and allow site admins to
| unblock should they want to seems like the best way. It
| is also already how it works with Cloudflare.
| SXX wrote:
| It is great that you care, and others have already
| provided some examples, but I'll add my own 2c here. The
| obvious problem is that a centralized service like
| CloudFlare creates an entry barrier and makes large
| players in the search and data mining markets even more
| entrenched than ever.
|
| Recently your company announced a partnership with the
| Internet Archive, but if CloudFlare wants to continue
| playing the role of a benevolent party, everyone should
| have equal access to this data. Yes, it means that some
| bad actors will be able to easily scrape the web too,
| but...
|
| CloudFlare's service can't prevent scraping anyway. There
| are shady residential proxy networks, services to bypass
| captchas, and scraping software like Zennoposter. It's
| possible to make scraping more expensive, but bad actors
| don't care because they have money. Unfortunately,
| enthusiasts, open source projects and small companies
| don't have enough resources to do the same.
| dannyw wrote:
| Making scraping harder definitely reduces scraping. Some
| bad actors will get through, but others will be deterred.
|
| I think you might not understand that it's site owners
| like me who want to stop scraping. It usually comes from
| specific bad incidents, like copycat sites stealing our
| content and work.
|
| Cloudflare wouldn't block scraping if website owners
| didn't want it. And website owners can easily disable
| this protection.
| SXX wrote:
| Scraping protection is not the problem: the defaults that
| CloudFlare promotes are. Saying that website owners can
| disable it is akin to saying website owners should go and
| whitelist Tor nodes. Most website owners don't understand
| either issue, and they are never going to opt out.
|
| Also, I'm talking from experience because I've been on
| both sides of the fence: doing scraping and implementing
| protection. So yes, your CloudFlare protection will deter
| 10% of bad actors, but it will also cut off 99% of
| enthusiast / research efforts or users of niche software
| or browsers. Still, anyone with a $1000+ budget will
| scrape whatever they want.
| foobiekr wrote:
| That response is just a way to move the discussion out of
| the public domain without actually addressing it. It's a
| scam.
| SXX wrote:
| Calling the CTO of a big company who has come to talk
| with us a scammer is very counter-productive. Some of
| CloudFlare's bad sides are certainly by design and cannot
| be changed, but they can still change their default
| filtering policies in a way that would greatly help the
| open web.
| vidarh wrote:
| jgrahamc is one of the most active HN users. He's in the
| top 20 on the "leaderboard". He has a good reputation.
| I'd hesitate to call this offer a scam.
|
| That said, I don't think it's a good situation that this
| is the solution rather than a proper, documented position
| that people can work to.
| PaulHoule wrote:
| I've never been able to "reach a human" at Google,
| Facebook and other web giants, and I'm skeptical that you
| can at a place like Cloudflare. In fact, I'd be really
| astonished if it were possible, because their business
| wouldn't be scalable otherwise.
| alternize wrote:
| while I share your sentiments regarding some other
| companies, I was able to get in touch with an actual
| cloudflare technician (and not some outsourced
| first-level support with standard boilerplate replies) in
| a timely manner, even on their free tier, when I ran into
| a problem with one of their systems. every support case
| with them has so far been a real pleasure compared to
| what you experience with other companies. I only hope
| they will be able to keep up this level...
| yazaddaruvala wrote:
| The grandparent, jgrahamc, you're responding to is the
| CTO of Cloudflare.
|
| If this doesn't at-least meet your definition of "reach a
| human at Cloudflare", I'm not sure what will.
| dannyw wrote:
| Cloudflare support has been exceptional to me as a
| website owner.
| SXX wrote:
| I personally love CloudFlare and (just like with e.g.
| DigitalOcean) I have always found a way to contact a
| human there. Unfortunately, it doesn't fix the
| fundamental issue of how they make the internet much more
| centralized and easy to MITM or censor.
| adspedia wrote:
| Did you contact anyone at Cloudflare for an issue not
| explained in the support docs and you got no response?
| ex_amazon_sde wrote:
| ...also Cloudflare has been a disaster for Tor. It really harms
| Tor's usability.
| nabla9 wrote:
| Around 2000 there was a CS professor who claimed that the
| internet does not scale: either things choke up or you
| need huge investments in networks.
|
| It turns out that he was right, sort of. The vanilla
| model of attaching a server directly to the internet, a
| plain server-to-client IP network, is pretty much dead.
| It has been replaced with CDNs, private delivery
| networks, cache on top of cache. Cloudflare, Amazon,
| Google and MS are the connection points for the IP layer.
| Their internal network infrastructure transfers most of
| the data.
| 542458 wrote:
| Is it pretty much dead? Yeah, if you're moving FAANG level
| traffic you need something more fancy than LAMP + an internet
| connection, but I've seen dozens and dozens of sites with a
| plain old no-cdn, no-pdn, LAMP tech stack. Working with
| startups might bias your view - lots of companies are running
| extremely boring setups and they work just great.
| nabla9 wrote:
| Respectfully, I think you miss the point.
|
| The fact that 98% of traffic goes through this new
| infrastructure is what allows some people to still plug
| their servers into the net raw and have their traffic get
| through.
| 1_person wrote:
| The traffic goes through this "new infrastructure" not
| out of necessity but because it's free and allows the
| user to pretend like a number of problems don't exist.
|
| Terrestrial optical networks operate far, far below
| capacity to create artificial scarcity, which is to a
| certain extent necessary to recoup the capital
| expenditures in a competitive market, and is to a certain
| extent an abuse of an under-regulated natural monopoly.
|
| If you could eliminate all adversarial factors and put
| every data service subscriber's payment for a single
| month into a pool, and that pool purchased only
| transceivers, passive optics and switches, and this
| hardware was distributed to every network operator
| perfectly fairly based on its contribution to the global
| maximization of available network capacity, then the
| delivered capacity to the end user could increase by
| something like 4.5 orders of magnitude with no
| substantial change in topology or subscriber or provider
| costs afterwards using the existing fiber, with a few
| more orders of magnitude possible with a fatter tree
| before the backbone costs explode.
|
| With DWDM you can carry 100+ channels of 100Gbps over a
| single fiber today, with commodity, off the shelf
| components. Most fiber in the ground today is probably
| still lit with a single wave of 10G, if it's not just
| dark.
|
| This distribution model is not even remotely a technical
| necessity; it's an arbitrary local minimum reached
| largely through exploitative market distortions and
| adversarial economics.
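| The per-fiber part of the DWDM claim above can be
| sanity-checked with back-of-envelope arithmetic (the
| figures are the commenter's, not verified measurements):

```python
import math

# Back-of-envelope check of the DWDM capacity figures quoted above:
# a fiber lit with a single 10 Gbps wave versus the same fiber
# carrying 100 DWDM channels of 100 Gbps each.
single_wave_bps = 10e9            # one 10G wave
dwdm_bps = 100 * 100e9            # 100 channels x 100 Gbps = 10 Tbps

improvement = dwdm_bps / single_wave_bps
orders_of_magnitude = math.log10(improvement)

print(f"{improvement:.0f}x per fiber "
      f"({orders_of_magnitude:.1f} orders of magnitude)")
```

| So the 100 x 100G figure alone buys three orders of
| magnitude per fiber; the comment's "4.5 orders" claim
| additionally assumes redistributed hardware spend, which
| this sketch does not model.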
| starfallg wrote:
| >The fact that 98% of traffic goes through this new
| infrastructure
|
| Doesn't mean that 98% of the value of the Internet
| results from this traffic.
|
| Even if you discount all of the web, there still lots of
| applications using the federated model (e.g. SMTP) or
| peer-to-peer (e.g. Crypto, VOIP), that require end-to-end
| connectivity.
| nabla9 wrote:
| You're expanding this discussion in a completely new
| direction.
|
| My original point is about capacity and if the old
| internet could work today. It seems like I'm correct, but
| I'm not so sure. I would like to see other opinions.
|
| Every response so far is "there exists". The real issue
| is if the internet could do everything without caching
| data near the edge.
| johnklos wrote:
| No. The marketing of all that extra crap has gotten better.
|
| It's just like Windows - just because 95% of the Internet
| does something one way doesn't mean it doesn't suck, isn't
| more complicated than it needs to be, and doesn't cost more
| than it needs to cost.
|
| Anyone with a little bit of bandwidth and a Raspberry Pi can
| run a web server, even with dynamic content.
| fierro wrote:
| This feature is enabled at the behest of the site owner. I feel
| like site owners and operators should decide who gets to visit
| their site. Am I missing something obvious here?
| clukic wrote:
| Cloudflare is the professional wall builder you hire to protect
| your garden.
|
| Tech monopolies have always had a vested interest in locking up
| user data, dictating the policies, and enforcing their own
| ownership rights. It used to be that only the largest and most
| sophisticated companies had the resources to shield that data,
| but Cloudflare changed all that. Walls are now trivial to set up,
| and virtually unbreachable, and that has forever changed the
| character of the internet by enforcing monopolistic policies with
| such technical precision that they're virtually impossible to
| overcome.
| WORMS_EAT_WORMS wrote:
| No offense, this framing is so dumb. I hate it.
|
| The 'Internet 3.0' isn't coming because of Cloudflare. It's
| coming because these monolith big tech companies have an army
| of engineers who have been centralizing and building it this
| way for years.
|
| Cloudflare didn't build these walls; it's more of a giant
| boat now navigating them because other companies have no
| choice.
|
| I like to think of them as a giant data ferryman in this
| regard, versus "a wall builder".
|
| I'm not saying frustrations aren't warranted but -- like come
| on -- have a little perspective of what's really happening with
| the Internet and who is actually driving it.
| clukic wrote:
| Clearly Cloudflare isn't responsible for the data
| centralization that is corrupting the internet. They are
| however, a very sophisticated and efficient enforcer of those
| policies. They've helped ensure that large portions of
| the web are no longer crawlable, and that serves to
| consolidate information and power in those tech
| monopolies.
| WORMS_EAT_WORMS wrote:
| Aka, SMBs now have access to the same tools the tech
| monopolies do.
|
| GDPR-like policies will continue to flood as governments
| partition their Internets and data making it harder and
| harder to run international Internet businesses.
|
| I'm not particularly happy about things either (especially
| crawling access), but it will be a net positive whenever
| you can level the playing field with competition.
|
| When the biggest infringers of data are driving the
| creation of government policies that only they can
| circumvent and navigate -- that's a serious, serious
| problem.
| fierro wrote:
| why is it assumed the web ought to be crawlable?
| xfer wrote:
| I mean, there are people making $1 per 1,000 reCAPTCHAs
| solved. So I'm not sure how a $40 device is not an
| improvement if your goal is to enforce some rate-limiting
| against scripts using these services.
| mwcampbell wrote:
| There is no perfect solution, but I'm in favor of anything that's
| a net improvement in accessibility for disabled people, even if
| it's not ideal in some other way. So I'm disappointed to see this
| solution being shot down before it even gets deployed on a large
| scale.
| emteycz wrote:
| Right before large scale deployment might be the last moment
| it's possible to prevent the large scale deployment.
|
| Unfortunately corporations are not good at going a step back if
| the step forward is good for their business.
| mwcampbell wrote:
| The trouble is that this change could be good not just for
| Cloudflare's business, but for people. If it turns out that
| this new CAPTCHA alternative is an improvement for users, but
| hurts some businesses who have to put up with a new form of
| abuse, I think that's a net win. Let's not stop it before it
| has a chance.
| emteycz wrote:
| I agree, but what if it's not and going a step back is then
| refused? Is there really no other way of testing than large
| scale deployment?
| eatbots wrote:
| Yep. https://www.hcaptcha.com/why-captchas-will-be-with-us-always
|
| (disclosure: work there)
| rudedogg wrote:
| I'm so fed up with reCAPTCHA. ~90% of the time it doesn't work on
| desktop Safari (I can see CORS errors in the console), so I have
| to use a different browser. Even Gumroad won't let me buy things
| due to this. It really feels like an anti-competitive "bug" (read
| feature), and is so annoying it's hard to not just give up and
| use Chrome.
|
| I feel like I'm crazy - no one else complains. I've mentioned
| @GumRoad on twitter but nothing.
| ComodoHacker wrote:
| Is CAPTCHA a necessity only in ad-sponsored web? Is there other
| compelling use-case for it?
|
| Can we make CAPTCHA obsolete with decent micropayments solution,
| when you pay for every transaction with every website, just like
| we pay for every drop of water we use? Perhaps ISPs could handle
| it for us?
| tomjen3 wrote:
| Captchas, or some anti-bot software is still needed whenever we
| deal with credit cards, because we are still using the obsolete
| version where, if you get your hands on the numbers , you can
| charge any amount you want to whomever you want, instead of a
| model where your card digitally signs the payment request for
| the given amount and receiver, which would mean that any theft
| is pointless.
|
| Anti-bot measures are also used to try to prevent
| password guessing on, e.g., the login page for Gmail.
|
| Finally, sometimes some places offer things like tickets that
| go very quickly, in which case having a bot reload the page
| means the tickets are likely to go to somebody owning a bot
| rather than a fan of the performer.
|
| None of these cases are solved by payments, they are solved by
| client side certificates and, in the last case, by requiring
| the name of the people who are to use the ticket.
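| The signed-payment model described above can be sketched
| roughly like this. It is a toy stand-in, not EMV: a real
| chip card signs with asymmetric crypto inside a secure
| element, whereas here an HMAC over a card-held secret
| plays that role, and all names and amounts are made up:

```python
import hmac, hashlib, json

# Hypothetical sketch of the "card signs the payment request" model.
# A real chip card uses asymmetric EMV crypto in a secure element;
# here an HMAC with a card-held secret stands in for the signature.
CARD_SECRET = b"never-leaves-the-card"  # lives only inside the card

def sign_payment(amount_cents: int, receiver: str, nonce: str) -> str:
    # The card commits to the exact amount, receiver, and a nonce.
    msg = json.dumps({"amount": amount_cents, "to": receiver,
                      "nonce": nonce}, sort_keys=True).encode()
    return hmac.new(CARD_SECRET, msg, hashlib.sha256).hexdigest()

def verify_payment(amount_cents: int, receiver: str,
                   nonce: str, sig: str) -> bool:
    return hmac.compare_digest(
        sign_payment(amount_cents, receiver, nonce), sig)

sig = sign_payment(1999, "merchant-123", "n1")
assert verify_payment(1999, "merchant-123", "n1", sig)
# A thief who replays the numbers with a new amount or receiver fails:
assert not verify_payment(99_999, "merchant-123", "n1", sig)
```

| Under this model, possession of the card number alone is
| worthless; only the signer can authorize a specific
| amount to a specific receiver.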
| Ayesh wrote:
| How do you know who to pay to?
|
| Sure, you are paying for the every drop of water, but what if
| you really wanted to pay for water from a specific region,
| doesn't want water from another region, and trust that the
| water company does not keep a cut or rip off either of you?
| infogulch wrote:
| Https with the origin?
| osmarks wrote:
| I can't see that being very popular. Even if it doesn't
| actually cost you much in absolute terms, billing per page will
| make people a lot more reluctant to explore new content.
| kevincox wrote:
| This is an often neglected benefit of "Unlimited" plans. It
| changes the feeling of consuming. You have already paid so
| you may as well enjoy instead of asking "Do I really want to
| pay for this?" at every use.
|
| From a technical point of view it is possible. Assuming
| that the payments were mediated by some party, that party
| could issue statements like "this user has used their
| monthly allowance but they would pay". Assuming that this
| provider is widely trusted, websites may treat this as a
| "real user" and allow the visit. (This is roughly how
| https://coil.com/
| works.) Of course there are negative implications such as
| making it very difficult for new providers to get started.
|
| You can also imagine some type of smart-contract where the
| subscription fee is split at the end of the day or month
| amongst the visited sites. Upon visit the sites just get a
| token for one share. Of course this would need to be very
| carefully designed to prevent abuse. (Example one malicious
| client splitting their subscription across millions of pages)
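| The end-of-period split sketched above might look like
| this (site names and the flat fee are illustrative):

```python
from collections import Counter

# Toy settlement for a flat subscription split among visited sites in
# proportion to the visit tokens each site collected for the period.
MONTHLY_FEE_CENTS = 500  # hypothetical $5/month subscription

def settle(visit_tokens):
    """Return each site's payout in cents (integer division truncates)."""
    counts = Counter(visit_tokens)
    total = sum(counts.values())
    return {site: MONTHLY_FEE_CENTS * n // total
            for site, n in counts.items()}

payouts = settle(["blog.example", "news.example",
                  "news.example", "wiki.example"])
# news.example collected two of the four tokens, so it earns half the fee.
```

| Note that this alone does nothing to stop the abuse case
| mentioned above: a malicious client can still direct its
| visit tokens at millions of its own pages.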
| lsaferite wrote:
| Privacy seems like a bad argument considering CF already has the
| technical ability to easily track you across all of the sites
| they front if they so desire.
| wbkang wrote:
| With each domain, U2F generates a different key
| (conceptually), so this should potentially make tracking
| harder.
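| That per-origin separation can be sketched conceptually
| (this is not the real U2F key-wrapping scheme; the
| derivation below is an illustrative stand-in):

```python
import hmac, hashlib

# Conceptual illustration only -- not the real U2F key-wrapping
# scheme. Each relying party sees a credential derived from the
# device secret and its own origin; without the device secret, the
# credentials two different sites see cannot be linked.
DEVICE_SECRET = b"per-device-master-secret"  # never leaves the token

def credential_for(origin: str) -> str:
    return hmac.new(DEVICE_SECRET, origin.encode(),
                    hashlib.sha256).hexdigest()

a = credential_for("https://site-a.example")
b = credential_for("https://site-b.example")
assert a != b  # the two sites cannot correlate the same device
```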
| djoldman wrote:
| Can a FIDO key be implemented in software? Can you write a
| program to register a FIDO key as a multi-factor authentication
| device with a Google account?
|
| Or is there some repository of all allowed devices with
| identifiers? Intuitively that'd be the only way to prevent
| infinite virtual devices..
| floatboth wrote:
| Attestation (what they use) is orthogonal to authentication.
| Token manufacturers have per-batch keys, private key being in
| the devices of that batch, so sites can verify that your device
| is from that batch of that vendor. You "can" implement
| attestation with your own key in software or in whatever, but
| Cloudflare won't trust your key :D
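| The batch idea can be illustrated with a stdlib-only toy
| (real FIDO attestation uses X.509 certificate chains, not
| HMAC; batch names and secrets here are invented):

```python
import hmac, hashlib, os

# Toy illustration of per-batch attestation: every token in a batch
# carries the same attestation secret, so a site can tell "a device
# from batch X" apart from a self-made software key, but it cannot
# tell two devices in the same batch apart.
BATCH_SECRETS = {"vendor-batch-1": b"secret-shared-by-100k-tokens"}

def attest(batch_id: str, challenge: bytes, secret: bytes) -> bytes:
    # Computed on the token itself, using the batch-wide secret.
    return hmac.new(secret, batch_id.encode() + challenge,
                    hashlib.sha256).digest()

def verify(batch_id: str, challenge: bytes, sig: bytes) -> bool:
    # Computed by the relying party, which only knows batch-level keys.
    secret = BATCH_SECRETS.get(batch_id)
    return secret is not None and hmac.compare_digest(
        attest(batch_id, challenge, secret), sig)

chal = os.urandom(16)                                     # server challenge
sig = attest("vendor-batch-1", chal, b"secret-shared-by-100k-tokens")
assert verify("vendor-batch-1", chal, sig)                # genuine batch member
assert not verify("vendor-batch-1", chal, b"forged-sig")  # forgery rejected
```

| A key implemented in software would fail `verify` simply
| because its secret is not in the relying party's trusted
| set, which is the point being made above.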
| djoldman wrote:
| So how does cloudflare know to not trust your key? They know
| some secrets from the token manufacturers and test your
| response against them?
|
| If so, what if the secrets get out? Then all the keys in
| those batches are poisoned?
|
| Isn't this just some sort of side channel certificate
| authority?
| wereHamster wrote:
| Yes. Example: https://github.com/github/SoftU2F
| sdfhbdf wrote:
| See also the original announcement:
| https://news.ycombinator.com/item?id=27141593
| PaulHoule wrote:
| The build quality of those Yubikeys freaks me out. I wonder how
| many insertions it takes until something shorts and my
| motherboard gets damaged.
| swiley wrote:
| So visiting cloudflare sites with TOR requires you to identify
| yourself? That's not great.
| ComodoHacker wrote:
| Can you see the difference between 'identify yourself' and
| 'prove you're human'?
| swiley wrote:
| No?
| fuckyouriotshit wrote:
| Visiting many CloudFlare sites with Tor was impossible the last
| time I checked because their CAPTCHA is broken and has been for
| a long time.
|
| I know for a fact that there are staff at CloudFlare who are
| aware of this problem but nothing has changed, so I guess that
| they don't care that they are making some sites unavailable to
| anyone who has to use Tor.
| fuckyouriotshit wrote:
| Edit: I re-read the section on the U2F batch keys and
| understand that the design intent is to be unable to track
| individual tokens across sites (only batches of a size decided
| by the token manufacturer). It's not completely clear to me if
| the crypto involved is resistant to an attacker who can collect
| the handshakes and then later gets access to the key(s)
| that are meant to be private to the manufacturer(s), but
| I acknowledge
| that the intent is decent. My points still stand, however.
|
| This sort of "we can solve that problem; we just need to kill
| your privacy" seems to be par-for-the-course in SV-style
| companies.
|
| I really wonder if anyone involved with building these systems
| has ever seriously thought about what could happen if the data
| collected (or that could be collected) by these systems was
| obtained by an adversary.
|
| Not to mention the incredible incentive problems created
| by designing things in a way that _requires_ individuals
| to be tracked across the internet.
|
| I know that CloudFlare is just one of many companies
| moving in this direction, and they're certainly not the
| worst offender when it comes to slowly murdering
| individual privacy (Facebook and Google are obviously far
| worse), but they have a uniquely powerful position due to
| the number of sites that use their DDoS protection, and
| they seem to show a casual disregard for the damage they
| can do to people's privacy.
| ignoramous wrote:
| They implemented _Privacy Pass_ for that, which is kind of neat
| [0] and related to another standard for authn viz. OPAQUE that
| I really like [1].
|
| [0] https://github.com/privacypass
|
| [1] https://news.ycombinator.com/item?id=25346632
| FriedrichN wrote:
| I hate this new web where you're automatically assumed to be some
| malicious actor only because you don't accept cookies and strange
| third party code and then have to jump through hoops to show that
| you're not some evil bot. To be honest, if a website immediately
| throws some Cloudflare anti-DDoS thing in my face I don't even
| bother anymore.
| apple4ever wrote:
| I agree. Browsing the web can be super frustrating when there
| is a CAPTCHA every other page.
| jfengel wrote:
| We all do. Everybody who's ever worked in security hates that
| even the tiniest hole in your security will be squeezed
| through. If you don't distrust every single packet, then sooner
| or later one of those packets is going to destroy you.
|
| It's basically the same both ways. You don't trust them with
| your private info. They don't trust you, either. The easiest
| way is, indeed, to just call the whole thing off.
|
| Everybody would love an alternative that lets more get done
| with less trust. Sometimes they find them, for limited cases.
| But nobody's solved it for the general case.
| gsich wrote:
| What am I missing? I get this device and can crawl all I want?
| [deleted]
| danShumway wrote:
| Complete aside, but I'm still not certain I understand the
| technical details of why Cloudflare can't uniquely identify
| users. I thought I knew how hardware keys worked, but apparently
| I don't.
|
| If the key being shared is embedded in the device, even in a
| secure enclave or something, then my understanding was that would
| open the door for key extraction. If the key is unique per-
| device, then that's not a problem. But if the key is unique
| per-10,000 and stored statically on the device, then hacking one
| device means that key can be released to anyone and the entire
| pool can be imitated.
|
| So if the above is correct, it can't be that a single private key
| shared across the entire company is stored on the device because
| that key would be getting constantly extracted and leaked by some
| determined hacker somewhere. But if it's a unique key per-device,
| then... I just can't figure out how validating that key wouldn't
| require transmitting unique information to _somebody_, whether
| it's Cloudflare or the device manufacturer.
|
| Where am I going wrong? I feel like I'm misunderstanding
| something fundamental about how signing works on these devices,
| but I can't figure out what it is. If I buy a Yubikey, is it
| connecting to the manufacturer's servers and getting a new key
| each time it's used? I thought they worked offline.
|
| Or are secure enclaves just much more secure than I think they
| are? Are we assuming that it's impossible to extract a private
| key from one of these devices?
| fuckyouriotshit wrote:
| > Are we assuming that it's impossible to extract a private key
| from one of these devices?
|
| Nope, it's just "expensive" to extract keys from secure
| hardware like this. The problem with an approach like this is
| that when the secret keys are identical for a large number of
| devices then the cost of revoking a compromised key goes up
| significantly which, for a spammer, would increase the value of
| obtaining said key because of the likelyhood that the key would
| be usable for a much longer time than if the key was unique to
| each device (and could be very easily blacklisted).
|
| Techniques to extract data from these sorts of secure devices
| include various forms of side-channel analysis, decapping and
| microprobing the IC, using SEM, etc. to physically damage parts
| of the circuit to try to force it to disclose the key and
| various forms of power and clock glitching.
|
| Most decent hardware-based cryptosystems are designed to ensure
| that each device has a unique key so that the cost of
| extracting one key (let's say around $100k) is too high for a
| potential attacker if the key can just be blacklisted, but if
| the key is expensive/impossible to blacklist then that cost
| might be worthwhile to an attacker.
| md_ wrote:
| https://fidoalliance.org/fido-technotes-the-truth-about-atte...
| explains this pretty well.
|
| Basically:
|
| * Attestation keys are not unique per authenticator; they're
| shared among batches of authenticators.
|
| * If you extract the batch's attestation key, you _can_ imitate
| authenticators from that batch. That doesn't mean you can
| authenticate as a registered authenticator, of course; it just
| means you can pretend to be a "Yubico XYZ" device.
|
| * Yes, I think Cloudflare is assuming it's hard to extract the
| attestation key. I think this is basically a safe assumption,
| but if it isn't, they can always choose to distrust batches
| known to be compromised.
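| The "distrust compromised batches" remedy has a cost
| worth making concrete: because the key is shared,
| revoking it locks out every honest token in the batch
| too. A toy model (batch IDs and sizes are made up):

```python
# Toy model of batch-key revocation: trust is tracked per batch, so
# revoking one compromised batch key strands every token issued in
# that batch, honest or not.
trusted_batches = {"batch-a": 100_000, "batch-b": 100_000}  # id -> tokens issued

def revoke(batch_id: str) -> int:
    """Distrust a batch; return how many issued tokens get locked out."""
    return trusted_batches.pop(batch_id, 0)

locked_out = revoke("batch-a")           # one extracted key...
assert locked_out == 100_000             # ...locks out the whole batch
assert "batch-a" not in trusted_batches  # only batch-b is still trusted
```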
| danShumway wrote:
| Thanks, that's really helpful. Followup questions though:
|
| - Does this mean if I buy 2 of these devices at the same
| time, it's possible for me to get the same attestation keys
| on both devices? I guess it depends on how many batches
| a company is producing at a time.
|
| - Doesn't this mean that attestation keys will get more
| unique over time as devices from the pool fall out of
| circulation and become rarer? Are keys rotated to prevent
| that (ie, would a manufacturer ever re-release a new pool
| with the same keys as an old one)?
| gjvr wrote:
| I understand from [0] that the attestation key is shared
| across all instances (SNs) of the same _model_ (PN):
| "...For example, all YubiKey 4 devices would have the same
| attestation certificate; or all Samsung Galaxy S8's would
| have the same attestation certificate". So you would not
| need to to buy them at the same time.
|
| But of course, despite this, a unique key is still
| generated for each identity upon sign-up [0]. I am not
| sure about (as in 'have no knowledge of') the entropy of
| these devices.
|
| [0] https://fidoalliance.org/fido-technotes-the-truth-
| about-atte...
| md_ wrote:
| - Yes.
|
| - To my (limited) knowledge, yes, you are right that keys
| will get more unique over time. That's a very good point.
| Keys are not rotated nor (generally) are they rotatable;
| they are usually read-only. If you are using a very old
| FIDO device and worried it has too much entropy now--like,
| if it's a "rare" or "vintage" device!--then you should buy
| a new one, I guess?
|
| (I honestly have not thought about your second point
| before, but I am not really deep in the FIDO stuff. So take
| my answer with a grain of salt.)
| fuckyouriotshit wrote:
| Manufacturers don't necessarily have to rotate the keys
| on older devices; they could rotate the keys on newer
| devices such that it's difficult to reliably tell what
| batch/generation a newer device is from, because it could
| be using a newer or older key.
|
| Such behavior would require some way of revoking old keys
| from newer devices to prevent a situation where a
| compromised and blacklisted old key is selected and
| causes the CAPTCHA to fail, seemingly at random.
| dane-pgp wrote:
| > they can always choose to distrust batches known to be
| compromised.
|
| Which effectively means bricking the devices of 9999 innocent
| users each time.
|
| Why are we creating a world where users will be told they
| can't visit a website or access their account any more
| because they didn't spend enough money on a hardware DRM
| device which tries to hide a key from them?
| ignoramous wrote:
| > _Where am I going wrong? I feel like I'm misunderstanding
| something fundamental about how signing works on these devices,
| but I can't figure out what it is._
|
| Whilst Fast Identity Online (FIDO) is much more than WebAuthn,
| Cloudflare's proposal here is to use WebAuthn to get rid of
| CAPTCHAs. The official WebAuthn doc is surprisingly accessible
| with neat illustrations for key topics:
| https://w3c.github.io/webauthn/ (ref registration and
| authentication ceremonies, in particular)
| ve55 wrote:
| The way I'd put it is that Cloudflare's suggested implementation
| may have its issues, but the general idea of trying to verify
| that someone is a human and then providing this verification to
| services in a way that is 1) anonymous and 2) cross-compatible
| with other services, is the correct way to go about things (or at
| least has some very appealing features).
|
| I hope that we have something in the future that does this job
| very well so that services do not need to verify phone numbers,
| Google accounts, and even IDs and facial imagery just to allow
| someone to use them (as this is much easier to do than coming up
| with new captcha styles that humans can quickly and easily solve,
| but that basic machine learning and scripting cannot).
|
| Being able to use the Internet with the slightest bit of privacy
| is already ~impossible for the average user and extremely
| difficult and tedious for very knowledgeable and experienced
| ones, so anything that tries to improve the current trend sounds
| like it's at least attacking a problem worthy of our attention.
| anothergram wrote:
| Alternatively, if services demand a fee then there is no need
| for human verification.
|
| Instead of trying to solve anonymous human verification,
| we might as well make micropayments an option.
| bo1024 wrote:
| I really hate it when I'm trying to spend money at a company
| and get hit with a captcha box right as I click "checkout". I
| could see it for selling scarce items like concert tickets,
| but in general it's very insulting, annoying, and off-putting
| to me.
| kevincox wrote:
| Credit card fraud that results in chargebacks is a very
| significant cost to a lot of online stores. So while it
| does suck, it isn't the shop that is to blame.
| sroussey wrote:
| Once logged in perhaps. But credential stuffing is a thing.
| derefr wrote:
| Sometimes serving 429s/403s to unauthed users is already
| costing you too much in egress bandwidth bills. That's one of
| Cloudflare's main propositions: stop that "idiot bot that
| will never get what it wants, but keeps requesting it anyway"
| traffic outside your network. (Note: not the same as a DoS!
| Usually not intentional, and usually not actually bringing
| your infra down. Just costing you money, while not making you
| any money.)
| TimothyBJacobs wrote:
| A small micropayment makes for a great way for bad actors to
| test stolen credit card numbers.
| eikenberry wrote:
| You can't do micro payments with credit cards; they have
| too much $$ overhead. You'd need to pay into a service
| that would handle the micro-payments, and they'd have
| minimal packages to buy to mitigate these issues.
| StavrosK wrote:
| Is this post missing the point, or am I? I thought that the point
| of requiring attestation is to prove that someone actually did go
| out and buy a legitimate Yubikey (or whatnot) and ban that key if
| they're spamming.
|
| With those two considerations, this actually seems like a really
| good idea to me.
| stickfigure wrote:
| It doesn't appear to identify any specific key, just that the
| user has _a_ yubikey. You could only ban a whole key
| manufacturer (or key batch, however large that is).
| sammy2244 wrote:
| This article is not very well written.. "security keys are quiet
| fast"
| djoldman wrote:
| It seems to me that forcing the user to go through captcha is a
| big negative user experience.
|
| Google must be docking points from websites that employ captcha
| then, right?
| kevincox wrote:
| It doesn't matter. Googlebot is whitelisted.
|
| (Partially sarcasm, I do know that Google does do some anti-
| cloaking crawling)
| viraptor wrote:
| Even if we ignore the technical reasons, for me CloudFlare's
| proposal fails at their "Associate a unique ID to your key"
| property, where they say CloudFlare could, but won't do it. If
| they implement this scheme they start normalising this approach.
| Once it gets to FB and Google implementations, their
| answer will be: we could, but we... look! a squirrel!
| tialaramex wrote:
| Their document says, correctly, that the means by which they
| could try to do this would be to shove the arbitrary random ID
| they get into a cookie.
|
| You may have noticed that both Facebook and Google already use
| cookies. Did you know Hacker News has a cookie too?
___________________________________________________________________
(page generated 2021-05-14 23:01 UTC)