[HN Gopher] You don't want to be on Cloudflare's naughty list
___________________________________________________________________
You don't want to be on Cloudflare's naughty list
Author : merlinscholz
Score : 349 points
Date : 2022-09-20 14:26 UTC (8 hours ago)
(HTM) web link (www.ctrl.blog)
(TXT) w3m dump (www.ctrl.blog)
| yamtaddle wrote:
| Harsh blocking/limiting/challenging is way too valuable to sites
| that are actually trying to make money online. It's not going
| away short of legislation banning it. Losing 1/10,000 legitimate
| customers to cut fraud attempts, spam, exploit attempts, and so
| on, by 90% or more, is just too good a trade-off.
|
| I have bad news about the most-likely fix for it, longer term, so
| we can lay off the IP-based reputation stuff and the geo-
| blocking: it's tying some form of personal ID to your browsing
| activity, so _that_ bears the reputation instead of the address.
|
| Sorry. Said it was bad news.
| Waterluvian wrote:
| I think this is true. It also reminds me of one possible
| purpose of regulation and government, given the majority will
| usually be happy to throw any sort of minority under the bus
| for the "greater good."
|
| This also reminds me of the anxiety of Google deciding to just
| ban my account for some reason. They can't be bothered to
| commit resources to making sure mistakes can be resolved. They
| don't care to lose a fleetingly small percentage of customers.
|
| Not sure I have an answer. Just a thought.
| akira2501 wrote:
| > Harsh blocking/limiting/challenging is way too valuable to
| sites that are actually trying to make money online.
|
| I'm not understanding the generalized sentiment here. How
| would, for example, a retailer benefit from this strategy? How
| does it protect their bottom line?
|
| I can see how a particular kind of "facilitated user economy,"
| such as games, gambling and promotional companies could
| benefit, but it doesn't seem that broadly applicable to what
| most people would consider a "mainstream" business.
|
| > so we can lay off the IP-based reputation stuff and the geo-
| blocking: it's tying some form of personal ID to your browsing
| activity
|
| And a new market for identity theft is born.
|
| Also, as someone who serves content and geo blocks it, that's
| not up to me, that's up to the owner of the content or whoever
| happens to be licensing it for them. So, even if you sent me a
| picture of your government ID, it changes nothing.
| les_diabolique wrote:
| > a retailer benefit from this strategy? How does it protect
| their bottom line?
|
| A couple of examples I can think of are blocking bots from
| scraping their site for pricing and details, and stopping
| resellers from buying up all of the stock (see sneakers,
| electronics, etc). The last example doesn't directly impact
| their bottom line, but it will make customers go elsewhere.
| yamtaddle wrote:
| > I'm not understanding the generalized sentiment here. How
| would, for example, a retailer benefit from this strategy?
| How does it protect their bottom line?
|
| The amount of automated _and apparently-manual_ attempted
| credit card fraud (and exploit attempts, for that matter) any
| halfway-prominent site with a CC form is subjected to is hard
| to appreciate if you've never seen it. It's _a whole lot_.
| They aren't even necessarily trying to buy what you have,
| but to validate that their stolen cards work. And they're
| quite busy. If too much of that gets through--really, any
| more than a _very_ tiny amount of it gets through--you're
| gonna have an extremely bad time.
|
| Various CC service providers like Stripe do provide tools to
| try to block those attempts, but defense in depth is usually
| a very good idea, including fairly aggressive firewall-level
| blocking.
| hot_gril wrote:
| The other not-so-great approach is to act like a normal user.
| This stuff doesn't tend to happen to the average Joe who
| browses the WWW. It's when you're doing unusual (albeit
| harmless) things.
| jabbany wrote:
| An alternative that preserves some privacy also doesn't seem
| that hard to imagine... though it probably has its own can of
| worms*.
|
| Basically, the core problem is digital identities (accounts,
| IPs, phone #s etc.) are cheap to create (even considering
| captchas and all) so fraud is easy. The solution could be just
| to make it "costly" to create new digital identities. For
| example, you could get a "verified but anonymous" identity
| issued by locking some assets (could be real world money, or
| maybe something intangible like community reputation) as
| collateral with a trusted party (or, for the crypto people, the
| blockchain). If you misbehave, you lose your reputation on that
| identity (and essentially your collateral) and have to start
| over. This lets anyone bootstrap a "minimal" level of trust at
| the beginning before they can use time to prove themselves
| trustworthy.
|
| Note: This model might remind some of things like staking in
| crypto. However the idea is really not anything new... Putting
| money on the line is really how most low-trust bootstrapping
| happens.
|
| *: To name a few:(1) this can result in participation being
| gated by wealth, which can be unfair. (2) it makes accounts
| more valuable to hack so people need better security practices
| [re: twitter checkmark]. (3) one would need some authority to
| decide how accounts lose their collateral or maybe the
| collateral is just burned to create that initial credibility...
| mhink wrote:
| > Basically, the core problem is digital identities
| (accounts, IPs, phone #s etc.) are cheap to create (even
| considering captchas and all) so fraud is easy. The solution
| could be just to make it "costly" to create new digital
| identities. For example, you could get a "verified but
| anonymous" identity issued by locking some assets (could be
| real world money, or maybe something intangible like
| community reputation) as collateral with a trusted party (or,
| for the crypto people, the blockchain). If you misbehave, you
| lose your reputation on that identity (and essentially your
| collateral) and have to start over. This lets anyone
| bootstrap a "minimal" level of trust at the beginning before
| they can use time to prove themselves trustworthy.
|
| I've always thought that client certs would be an interesting
| solution to this problem. Any given certificate can carry
| signatures from multiple signing authorities, right? So we
| could imagine a world where there are many different
| certificate authorities, each of whom have their own criteria
| for signing a particular certificate and each of whom offer
| different varieties of assurance regarding the signature-
| holder's identity.
|
| From here, the question of "should I allow the user
| identified by this client cert to use my service" simply
| becomes a question of 1.) checking the validity of the
| signatures of the client cert and 2.) deciding if the CA's
| criteria for signing certs aligns with my desired userbase.
|
| For example, a particular CA might insist that their users go
| through some real-world process to renew their certification
| every few years, but when they sign a cert it means that the
| bearer has been strongly vetted as a real person.
|
| An interesting side effect of this auth model is that a
| service provider accepting certs from a particular CA has
| someone to complain to if a user bearing their signature acts
| improperly on their platform. You could imagine a CA which
| has a code of conduct expected of the users whose certs they
| sign, and would perhaps revoke a user's certification if too
| many websites complain.
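
A minimal sketch of the second half of that decision (step 2, checking whether a CA's signing criteria fit the desired userbase). It assumes signature validation (step 1) has already happened elsewhere and produced a set of verified issuer names. `CaPolicy`, `admit`, and the example CA names are hypothetical, invented purely for illustration; this is not an existing PKI API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaPolicy:
    """Hypothetical summary of what a CA promises about certs it signs."""
    name: str
    vets_real_person: bool       # e.g. an in-person renewal process
    revokes_on_complaints: bool  # the CA enforces a code of conduct


def admit(verified_issuers: set[str],
          trusted: dict[str, CaPolicy],
          require_real_person: bool) -> bool:
    """Admit the user if at least one verified issuer meets our criteria.

    `verified_issuers` is assumed to contain only issuers whose
    signatures were already checked cryptographically (step 1).
    """
    for issuer in verified_issuers:
        policy = trusted.get(issuer)
        if policy is None:
            continue  # signed by a CA we have no opinion about
        if require_real_person and not policy.vets_real_person:
            continue  # CA's vetting is weaker than we need
        return True
    return False


# Hypothetical trust store for one service.
TRUSTED = {
    "strict-ca": CaPolicy("strict-ca", True, True),
    "casual-ca": CaPolicy("casual-ca", False, True),
}
```

With this trust store, a user vouched for only by `casual-ca` would be admitted by a service that sets `require_real_person=False` and rejected by one that sets it to `True`.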
| unwise-exe wrote:
| That's not safe for a lot of sites, though.
|
| I hear that porn tends to be officially frowned on in a
| fair number of places.
|
| Reading non-approved news is dangerous in some places.
|
| Honestly _debating_ political topics can be super dangerous
| if you're identifiable.
|
| Sometimes even having a login on a site is dangerous; I
| think I heard about this after a non-mainstream discussion
| site got hacked like a year and a half ago.
| georgyo wrote:
| Your idea comes from a good place, but identity theft is
| already a thing in the real world. Digital identities would
| also be very stealable. This may be more harmful in the long
| term. Imagine if your Twitter gets hacked and your digital
| identity makes it so your Gmail gets blocked.
|
| Similarly, the internet is already very difficult for the
| people with limited means. This would make it even harder.
| [deleted]
| smsm42 wrote:
| They are already testing out digital IDs. Now link that to the
| social score... and make the browsers and the sites exchange
| these data on the background, and make frontend services
| providers refuse connections from non-supporting browsers as
| "bots"...
| tboyd47 wrote:
| How does having a personal ID tied to browsing activity help
| with spam? Are spammers not real people with IDs?
| les_diabolique wrote:
| Spammers typically implement bots to carry out tasks. I mean,
| technically at some point a spammer is a real person, but
| when you're automating tasks and using bots, it's not at the
| same scale.
| notsapiensatall wrote:
| So what happens when your ID gets hacked and reused for
| fraudulent activity?
|
| Would you have to submit a dispute with the internet credit
| agencies? Maybe join a class action suit against the entity
| that leaked your ID so that they're forced to give you a
| year of free internet identity monitoring?
| jamie_ca wrote:
| Then you need to deal with levels of rate-limiting that
| are fine for individuals but make it not feasible for
| spammers.
|
| Keeping with the cloudflare topic, if Cloudflare only
| permits you 10 requests per second (HTML + JS/images)
| that's still usable for web browsing, but someone running
| a cloud of hundreds of bots would be effectively shut
| down. Similarly with email, an individual probably
| doesn't need to send more than one email per 10 seconds
| but email spammers wouldn't find any ROI at that rate -
| business needs being different might necessitate a
| different registry or something in that case.
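
The per-identity limit described above can be sketched as a token bucket. This is only an illustration of the general technique, not how Cloudflare actually rate-limits. The 10 requests per second figure comes from the comment; the burst capacity, the evenly spaced traffic, and the names `TokenBucket` and `count_allowed` are assumptions.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """One bucket per identity; each allowed request spends one token."""
    rate: float          # tokens refilled per second (10, per the comment)
    capacity: float      # maximum burst size (assumed)
    tokens: float = 0.0  # tokens currently available
    last: float = 0.0    # timestamp of the previous request, in seconds

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` may proceed."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def count_allowed(requests_per_second: int, seconds: int) -> int:
    """Simulate evenly spaced requests against a fresh, full 10 req/s bucket."""
    bucket = TokenBucket(rate=10.0, capacity=10.0, tokens=10.0)
    allowed = 0
    for i in range(requests_per_second * seconds):
        if bucket.allow(i / requests_per_second):
            allowed += 1
    return allowed
```

In this simulation, a client sending 5 requests per second never notices the limit, while one sending 1,000 per second for ten seconds gets only roughly 110 of its 10,000 requests through.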
| smsm42 wrote:
| The same that happens now when somebody steals your
| identity and ruins your credit history. You'll have to
| live in a bureaucratic hell for the next couple of years.
| And yes, as compensation, you'll get the $6.99 worth of
| services from the guilty party. If you win the class
| action suit, that is.
| notsapiensatall wrote:
| Exactly. Why on earth would we want to replicate such a
| terrible system online?
|
| We should be reforming our current credit agency system,
| not empowering it with a new mandate of judging
| somebody's social or political creditworthiness.
| mcguire wrote:
| Nobody said it wouldn't suck. The only question is
| whether it sucks less than the alternatives.
| adamckay wrote:
| Of course, but the theory is it's restricting 1 real person
| to 1 account, versus 1 spammer creating 1,000 accounts via
| automation.
|
| And once your spammer has been identified then that's them
| banned/removed, unable to sign up again.
| tboyd47 wrote:
| What's to stop them from using fake IDs?
| thayne wrote:
| I haven't experienced it as badly as the author. But I find the
| cloudflare page checking that I am using a "secure browser" very
| frustrating. I seem to get it the most for gitlab pages for some
| reason.
| marcus_holmes wrote:
| I use a VPN, for perfectly legitimate reasons (I travel a lot,
| and most internet services assume that your IP address also
| indicates your nationality, citizenship, language, bank account
| country, etc. Being able to change IP source country is vital).
|
| Some VPN exit addresses have obviously been flagged as "bad" by
| Cloudflare and I get challenged with CAPTCHAs from some
| countries. It's an interesting experience, but luckily my VPN
| provider has enough exits that I can usually switch to one that
| has better reputation with Cloudflare.
|
| Obviously, none of this is helping the internet be a better place
| from my point of view. I get that it's part of the ongoing fight
| against bots and spam, but it always feels so arbitrary. IP
| addresses are interchangeable, folks - they say nothing about the
| nature of the request. Or rather, for a large majority they do,
| but there's us minority that don't obey those rules and resent
| getting caught up in it.
| jasonlotito wrote:
| Yeah, this just continues to reinforce my opinion of Cloudflare.
| It's not something I would ever recommend, and there are numerous
| other superior options out there. I see Cloudflare failing
| frequently enough that if it were something I was responsible
| for, I'd be embarrassed at the very least.
| tire-fire wrote:
| What superior options would you recommend that are privacy
| focused and free?
| dedward wrote:
| I'm curious if you've had experience with their enterprise
| package?
|
| I can understand people's gripes about things on the free/cheap
| packages, where Cloudflare makes decisions for you, sometimes
| ones you don't like.
|
| But as an enterprise customer, I've never found it to be
| anything short of fantastic - I can tailor it to behave exactly
| how I want, and not interfere with my customers.
| johnklos wrote:
| Your response seems to ignore the very article being
| discussed.
|
| Or are you suggesting that if you're having trouble visiting
| sites because of Cloudflare, you should become an enterprise
| customer? (slightly sarcastic, but not completely)
| dedward wrote:
| My response is simply trying to understand where you are
| coming from. You've mentioned there are numerous superior
| options and you would never recommend it.
|
| I'm wondering (genuinely!) if you are speaking as an
| enterprise customer or a free plan, or what.... both for
| the sake of meaningful discussion and potentially learning
| about even better options for my own work.
|
| As to the article - I fully believe the responsibility lies
| with site owners to pick and choose how they want to serve
| their sites. Nobody is forcing them to use Cloudflare on a
| free plan, or to ignore any analytics it provides and make
| sure it is serving their customers correctly. Cloudflare is
| one piece of a delivery solution, and only works as well as
| you configure it. If your decision for your app is "I'll
| just use the free plan, and let Cloudflare decide
| everything for me" then you get what you pay for.
|
| If Cloudflare is getting in their way, they can go
| somewhere else.
| DethNinja wrote:
| There is a chance you might've been hacked.
|
| You would be surprised to see how easy it is to hack domestic
| routers.
|
| 1. Find and disinfect the devices, including the router. If you
| don't have enough technical knowledge, then buy a new router.
|
| 2. Use a 30-character random password on the router.
|
| 3. Disable UPnP.
|
| 4. Anything with Wi-Fi and a weak password can be hacked within
| minutes, so check your other devices as well, especially IoT
| ones.
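
For step 2, one way to generate such a password is Python's standard `secrets` module. The letters-and-digits alphabet is an assumption (some router admin pages reject certain symbols), and `random_password` is a name chosen for this sketch.

```python
import secrets
import string

# Assumed alphabet: letters and digits only, to stay compatible with
# router admin pages that reject some punctuation.
ALPHABET = string.ascii_letters + string.digits


def random_password(length: int = 30) -> str:
    """Return a password drawn from a cryptographically secure RNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(random_password())
```

Each of the 30 characters is drawn independently from 62 symbols, which puts the password far beyond practical brute-force guessing.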
| mh- wrote:
| My assumption is also that something on his network is
| compromised, and getting his IP into reputation issues.
|
| Tarpitting (serving content slowly from the edge, in order to
| slow down bots) is necessarily one of the most expensive tools
| in a WAF/CDN's toolbox.
|
| It's _much_ more likely that something on his network is
| sending sketchy traffic to CF-fronted /Google sites, and the
| slow loading he's experiencing elsewhere is because his
| upstream is being saturated by whatever is happening on his
| network.
| d2wa wrote:
| (Author here.) My router isn't a domestic router. It's a
| MikroTik running RouterOS, completely unsupported by the ISP.
| Outgoing connections and DNS are logged. UPnP is only allowed
| for the Xbox, PS4, and off-most-of-the-time gaming PC. Nothing
| out of the ordinary in the logs.
| alexforster wrote:
| > It's a MikroTik running RouterOS
|
| https://google.com/search?q=mikrotik+botnet
|
| These things are the absolute scourge of the internet.
| aaronmdjones wrote:
| > It's a MikroTik running RouterOS
|
| It's almost certainly compromised.
| malfist wrote:
| Why would you disable UPnP? You're gonna break most
| collaboration tools/video games/etc.
| kunwon1 wrote:
| Disabling UPnP doesn't break much. I've used enterprise
| firewalls at home for years, none of them have UPnP, I've
| never noticed a problem arising from that lack. I don't have
| a problem with video games or collaboration tools
|
| UPnP allows devices inside your network to open ports to the
| outside world without your knowledge. I think everyone should
| avoid it if they can get by without it
| d2wa wrote:
| It's absolutely required for most multiplayer games. Many
| need random ports and some even refuse to work if UPnP is
| blocked even if you manually open a port for them.
| aaronmdjones wrote:
| I've never had UPnP enabled and I don't have any problems
| doing online gaming / flight sim / video chatting / etc.
| malfist wrote:
| What's your solution for the grandmother who just wants to
| make a zoom call to her grandson? Have her log into her
| router portal and set up a static IP for her laptop and then
| port forwarding routes for zoom?
| Karrot_Kream wrote:
| STUN servers? Also, while I (not GP) do think UPnP is
| dangerous, I also think it's only something you disable
| if you _know_ you can live without.
| thayne wrote:
| I don't think zoom uses UPnP. If it did, that would cause
| problems on corporate networks that typically have UPnP
| disabled.
| [deleted]
| zinekeller wrote:
| To be frank, that's exactly the problem with NAT-PMP et al.
| assuming that there are no router bugs: the ability to forward
| ports has been abused to set up bot relays on hacked IoT
| devices. This is why I predict that even in the IPv6 era we
| would still have to rely on a TURN-equivalent.
| malfist wrote:
| That's exactly the problem with NAT-PMP?
|
| So what's your alternative for peer to peer connections?
| Static routing that the common end user can't figure out?
| Re-centralize connections?
|
| UPnP is necessary.
| zinekeller wrote:
| I'm simply pointing out the problem, a real-world and
| realistic problem, and you're acting like it's a
| non-issue. Point me to a CGNATted network which has enabled
| port forwarding. Does it break a lot of things? Oh,
| absolutely. Have the carriers still not activated it? Yes.
| Automatic port forwarding is only beautiful when you know
| how your devices will react. It's ugly when you're a
| network administrator who doesn't control all devices.
|
| There is no "perfect" solution here because the real
| world is a messy place with devices that you cannot
| personally vouch for.
| nuc1e0n wrote:
| This story shows Cloudflare is now harming legitimate users, is
| an effective monopoly and as such should be broken up.
| jabroni_salad wrote:
| Do you have an ISP-provided email account that you never check?
| You might want to check it to see if you have any botnet
| notifications.
| simple-thoughts wrote:
| There's a real lack of education I've seen in developers for
| small projects who go directly to cloudflare for anything and
| everything. They don't understand that they are immediately
| losing a large chunk of their user base who is either from the
| third world or is privacy literate. Devs working on projects that
| are targeting those groups need to understand the tradeoffs from
| using cloudflare.
| shiomiru wrote:
| If you'd like to experience this treatment first-hand, try
| surfing the web using the Tor Browser.
|
| Spoiler alert: many websites simply refuse to load at all (e.g.
| any google service, and lots of websites "protected" by CF).
| Captchas are everywhere: in many cases, you can't even complete
| simple GETs of blogs without donating free labor to CF.
|
| And the most infuriating part, you get CF marketing messages
| right in your face while your browser is calculating hashcash (I
| guess?)... At this point I can recognize every single one of
| them: something about bots making up 40% of all internet traffic,
| something about their web scraper protection racket, something
| about small businesses (???), etc etc...
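
For readers unfamiliar with the term, a hashcash-style proof of work can be sketched as below. This is the generic idea only. The comment itself is guessing at what the interstitial computes, and nothing here reflects Cloudflare's actual challenge. SHA-256, the `challenge:nonce` string layout, and the function names are assumptions for illustration (original Hashcash uses SHA-1 and its own stamp format).

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def _digest(challenge: str, nonce: int) -> bytes:
    return hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce; cost doubles per difficulty bit."""
    for nonce in count():
        if leading_zero_bits(_digest(challenge, nonce)) >= difficulty:
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check the client's work."""
    return leading_zero_bits(_digest(challenge, nonce)) >= difficulty
```

The asymmetry is the point: the client needs about 2^difficulty hashes on average to find a nonce, while the server verifies it with one.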
|
| To be fair, Tor exit nodes have an awful reputation for sure.
| Nevertheless, I have a hard time forgiving how CF makes browsing
| the Internet hell for those who actually need Tor.
| yjftsjthsd-h wrote:
| > And the most infuriating part, you get CF marketing messages
| right in your face while your browser is calculating hashcash
| (I guess?)... At this point I can recognize every single one of
| them: something about bots making up 40% of all internet
| traffic,
|
| Yeah, there's something amazingly aggravating about CF telling
| you how much traffic is bots _while showing that they can't
| distinguish you from a bot_.
| robocat wrote:
| CloudFlare are creating a new division for advertising to
| bots. They have projected that in the near future, bots will
| be 90% of spending, so the bot demographic is the most
| important to target, marketingwise.
|
| The fact that humans are seeing the traffic meant for bots is
| an unfortunate side-effect.
|
| I personally welcome our future bot overlords (not only
| because being unwelcome might be unhealthy for me -- why
| would I publicly disagree with an overlord or not want to be
| their friend?).
| jasonfarnon wrote:
| I routinely use Youtube with Tor. I will occasionally get
| kicked off with a "suspicious traffic" message, but it isn't my
| experience that it "refuses to load at all".
| synthetigram wrote:
| Cloudflare has mixed up the definitions of "bot" and "abuse".
| Tor users may or may not be bots, but as long as they don't
| abuse (spamming or DoS), they ought to be treated the same.
| sampa wrote:
| If an ordinary user had to deal with google/CF bs every day
| as I do, they'd burn their computer.
|
| PS Proud user of Firefox + resistFingerprinting=true
|
| PPS Ain't nothing better than CF guard page constantly-reloading
| on 20% of sites if you open some url :( No, fella, you first
| have to open the root '/' page so that guard page finally can
| either pass me through or show the cloudflare captcha. Ugh.
| Progress, they say.
| mikessoft_gmail wrote:
| johnklos wrote:
| Imagine all the people in countries deemed less desirable by
| Cloudflare that go through this all the time. Cloudflare, whether
| it's their stated goal or not, is re-stratifying and re-
| centralizing the Internet because of their desire to be a
| monopoly, and we'll all suffer as a result.
| andrewnyr wrote:
| there are multiple other large CDNs out there... it's a lot more
| like 5 market leaders tbh
| johnklos wrote:
| But how many of them:
|
| 1) refuse to take responsibility for content they host by
| claiming they don't host
|
| 2) discriminate against huge parts of the Internet with no
| publicly known rules, nor methods to change that
| discrimination
|
| 3) make the abuse reporting process intentionally difficult
| and time-consuming
|
| 4) want to aggregate all the DNS data they can by making a
| deal with Firefox to turn on DNS-over-https by default
| without asking or even informing end users
|
| 5) want to re-centralize the Internet, in part so they can
| mix bad actors with good, in ways that make blocking next to
| impossible
|
| How many of them do the discrimination we're all writing
| about here?
| easrng wrote:
| tbh I think one of the very few positives of having so many
| sites going through a few CDNs is that you can make it
| impossible to block a protocol or site without significant
| collateral damage, which can be a good thing, things like
| Tor's meek bridge rely on that.
| andrewnyr wrote:
| > 1) refuse to take responsibility for content they host by
| claiming they don't host
|
| CDNs don't host content, they proxy it.
|
| > 2) discriminate against huge parts of the Internet with no
| publicly known rules, nor methods to change that
| discrimination
|
| Not large parts of the internet; scammy and attacky parts of
| the internet. If the rules were public they wouldn't be
| effective.
|
| > 3) make the abuse reporting process intentionally difficult
| and time-consuming
|
| Simply untrue; every abuse report I have filed has had an
| answer back within 24hrs.
|
| > 4) want to aggregate all the DNS data they can by making a
| deal with Firefox to turn on DNS-over-https by default
| without asking or even informing end users
|
| This is a good thing, as they are audited as not keeping logs
| of DNS queries.
|
| > 5) want to re-centralize the Internet, in part so they can
| mix bad actors with good, in ways that make blocking next
| to impossible
|
| Again, every CDN centralizes the internet, and many sites
| need this protection.
| PaulHoule wrote:
| The rise of Cloudflare is the first real threat I've seen to
| ordinary people running webcrawlers.
| mschuster91 wrote:
| Tragedy of the commons, unfortunately. There were a bunch of
| cases where web crawlers and scrapers built competitive
| services on the back of the services they scraped, some of
| these ending up in courts [1].
|
| [1] https://www.derstandard.at/story/1389860104020/eu-
| gerichtsho...
| lorey wrote:
| Which is in turn a threat to the open web in general. Could not
| agree more.
| adamsb6 wrote:
| Does the author have a fixed IP?
|
| If not, figure out how to get a new one and see if the blocking
| recurs. If it does, the bad activity is probably coming from
| inside the house -- or CloudFlare has a way to identify you
| across an IP change.
| d2wa wrote:
| The author, me, does have a dynamic IP, but it only changes
| once every two years or so.
| digitailor wrote:
| Not saying that this is the case here, but this may be possible
| due to having a bad tab open. Especially over cellular. Haven't
| looked into it with any depth, but I've had correlations on a
| much shorter timeframe. Suddenly, CloudFlare and/or Google start
| questioning my humanity, so I close all tabs. Then okay. Sloppy
| hypothesis with no evidence: JS gone haywire
| rubyist5eva wrote:
| Cloudflare is on _my_ naughty list. I actively advocate against
| people using them.
| robjan wrote:
| I have two dedicated home internet IPs (one iCable fibre and a
| China Mobile 5G fallback/quarantine WiFi) and get these "checking
| if your internet connection is secure" interstitials all the time
| now. Also see them on my HKBN work connection.
|
| I'm from Hong Kong and suspect the whole territory is on the
| naughty list.
| JCWasmx86 wrote:
| Couldn't you use e.g. the DSGVO/GDPR in the EU to get all the
| information about your IP, everything cloudflare has stored about
| it until you find the root cause?
| unity1001 wrote:
| It's amazing how Cloudflare became another tech monopoly that can
| decide the lives of ordinary people in a totally unregulated,
| private fashion.
| therealmarv wrote:
| If you surf on desktop sites from Philippines on a mobile phone
| plan (which is often the best Internet connection in that
| country) you also get Cloudflare's captchas everywhere.
|
| I've said it before and I'll say it again: Cloudflare is dividing
| the World between first and second/third World countries with
| their captchas. I call it discrimination of second/third World
| countries! If you are from US and Europe you will never notice it
| but if you travel a little bit more you see these blocking
| captchas everywhere.
| chrismorgan wrote:
| I've had a similar experience in India with wired internet from
| a local ISP: CGNAT is used so there are who knows how many
| customers on the same IPv4 address,
| https://iknowwhatyoudownload.com/ shows at least forty hours of
| movies being downloaded every day, the IP address is on half
| the blacklists out there because _someone_ is part of an email-
| sending botnet, and yeah, Cloudflare hates you.
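
A toy illustration of why CGNAT makes this worse: reputation is keyed on the public IPv4 address, so one abusive customer behind the NAT drags every neighbor down with them. The mapping, the threshold, and the function name `challenged_customers` are hypothetical, and the addresses are documentation-range examples. This does not describe any real vendor's scoring.

```python
from collections import Counter


def challenged_customers(nat_map: dict[str, str],
                         abusive: set[str],
                         threshold: int = 1) -> set[str]:
    """Return every customer whose shared public IP has been flagged.

    nat_map maps customer id -> public IP assigned by the carrier's NAT.
    An IP is flagged once `threshold` abusive customers sit behind it.
    """
    abuse_per_ip = Counter(nat_map[c] for c in abusive)
    flagged = {ip for ip, n in abuse_per_ip.items() if n >= threshold}
    return {c for c, ip in nat_map.items() if ip in flagged}


# Hypothetical carrier: three customers share one address, one does not.
NAT_MAP = {
    "alice": "203.0.113.7",
    "bob": "203.0.113.7",
    "carol": "203.0.113.7",
    "dave": "198.51.100.2",
}
```

Here, if only `bob` is part of a botnet, `alice` and `carol` are challenged too, while `dave` on a different address is unaffected.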
| Jamie9912 wrote:
| Maybe your mobile ISPs don't do enough to stop malicious/spam
| traffic. That's not Cloudflare's fault
| therealmarv wrote:
| It only affects Cloudflare hosted sites though.
| thewebcount wrote:
| I get it browsing from a major ISP in the US. I have the gall
| to browse in private mode and to block trackers and ads because
| of all the malware they contain. (And I don't use a browser
| that requires me to login just to browse the web - gasp!) And
| apparently, that means I'm worthy of this sort of punishment as
| well.
| aendruk wrote:
| The other side of this story is that PLDT stands out from other
| residential networks as a persistent source of web form spam.
| I'd love to learn what's going on differently there.
| Dma54rhs wrote:
| I get these a lot and I'm from EU. But it's "seasonal".
| ReptileMan wrote:
| I am from Europe and I notice it if I use some non-residential IP.
| The captchas are extremely annoying, especially when trying to
| access a site I have already been logged into with 2fa. Who is
| protected in this case?
| cft wrote:
| I actually think that Cloudflare is setting up the foundation of
| Chinese style (but privately outsourced in the US case)
| censorship machinery in the US. Between their AI erroneously
| flexing its power, the Kiwifarms scandal and similar, they are
| emerging as a rival to Google in its censorship effort. One of
| the most dangerous companies on the internet.
| smsm42 wrote:
| So this gets me thinking. We know Cloudflare will boot a site if
| they really don't like them. Now, what happens if Cloudflare
| doesn't like _you_? I mean, really really doesn't like. Maybe,
| you said something wrong online or participated in a wrong group
| activity, or something like that. Is it the case that they have
| the power to essentially deny you (provided you have a static IP
| and don't use VPN, say) access to a major part of the Internet?
| And you can do absolutely nothing about it?
|
| I know they haven't done anything like that yet. But the
| technical capability is there, and we all know how short the
| distance is between technical capability and doing it, when the
| appropriate pressure is applied. So I wonder, how long before
| activists start demanding for CF to boot people from the
| internet, and how long before CF caves in to that...
| [deleted]
| neurostimulant wrote:
| If your ISP is using CGNAT, sooner or later you're going to
| experience this problem. When this happens, I have to use a VPN (I
| use mullvad) to reduce the amount of cloudflare challenges I get.
| Pretty funny because usually I get more challenges when using a
| VPN instead of the other way around. The Privacy Pass extension
| also seems to help a bit.
| jgrahamc wrote:
| _Well into the second day of Cloudflare's blockade of my home
| internet connection, Google Search also began blocking requests.
| It required me to resolve a CAPTCHA challenge for every other
| search. This luckily only lasted a day._
|
| _Cloudflare shares IP reputation data with partners like Google,
| coordinated through a program called the Bandwidth Alliance. So,
| my original offense might not even have been against Cloudflare.
| It might have received the reputation data from a partner, and it
| just propagated through the Bandwidth Alliance network._
|
| That's not what Bandwidth Alliance is at all. It's about reducing
| or eliminating egress fees between a cloud provider and
| Cloudflare. Not sure where the idea that it's about sharing IP
| reputation data comes from.
|
| https://www.cloudflare.com/bandwidth-alliance/
|
| So, if Google Search started showing a CAPTCHA that's not
| Cloudflare.
| xani_ wrote:
| > Not sure where the idea that it's about sharing IP reputation
| data comes from.
|
| Probably from the scam called mail blacklists
| [deleted]
| plumeria wrote:
| It is interesting that the Bandwidth Alliance partners list
| shows pretty much every big cloud provider except AWS and
| Akamai [0]
|
| [0] https://www.cloudflare.com/bandwidth-alliance/
| throwawayays wrote:
| The tone of this reply is a bit shit from a PR perspective.
|
| How about _also_ pointing to a knowledge base article for how
| an end user could go about working out what network activity
| from their IP might be flagging Cloudflare's systems?
| phantom_of_cato wrote:
| But that's beside the main point. You guys are essentially the
| "single point of failure" for half the internet. [1] Being
| competent and smart doesn't really help too much, as
| demonstrated by how you guys had to give in to the pressure to
| censor recently.
|
| [1]: https://easydns.com/blog/2020/07/20/turns-out-half-the-
| inter...
| TakeBlaster16 wrote:
| Can you acknowledge the main point of the article? What should
| someone do if they find themselves misclassified by
| Cloudflare's systems?
| mh- wrote:
| _(not the parent commenter)_
|
| That person should start with the assumption they _haven't
| been misclassified_ and eliminate the possibility that a
| device on their network is compromised.
| JohnFen wrote:
| A task that would be made much easier and less likely to
| miss something if the affected person had some indication
| as to what the problem was.
| buildbot wrote:
| Devil's advocate - would it not then be pretty easy to
| engineer malicious bots to avoid detection?
| d2wa wrote:
| (Author here.) That's missing from the article. But I have
| logs of the network. There's nothing out of the ordinary.
| "I don't know what I did wrong," as I started the article,
| means "I've checked logs and such and there's no indication
| of anything wrong on my end."
| tomxor wrote:
| FYI, this guy is far from alone, your "protection" has given me
| a lot of grief over the past few years, particularly on highly
| NATed mobile networks.
|
| I've been gradually removing cloudflare based CDNs from
| services I develop and control because I don't want my users
| being arbitrarily discriminated against.
|
| There was a good article posted on HN recently titled "The
| ideal level of fraud is non-zero" which I think is highly
| relevant here... In essence any mechanism employed to prevent
| illegitimate use comes with a negative cost to legitimate
| users, if that cost is too high it defeats the purpose. i.e
| what's the point in a website that is completely immune to a
| botnet and also cannot be accessed by anyone else? unplugging
| the ethernet cable also effectively protects against botnets.
| More subtly the cost of outright rejecting some legitimate
| users is usually not worth the savings of rejecting 100% of
| illegitimate ones. I think Cloudflare's service has it the
| wrong way around: it currently accepts blocking legitimate users
| far too easily, that is not an acceptable cost; whereas you
| should be letting a higher level of bots through to avoid
| pissing off legitimate users - if it's not obviously a DDoS,
| it's probably worth the bandwidth cost.
|
| Consider the bigger picture, if you save a sliver of a penny
| by blocking a bot, but also end up blocking or seriously
| inconveniencing 10 real users... is it worth it?
| dmix wrote:
| Cloudflare just isn't worth the tradeoffs: the risks
| associated with their centralization, how they made Tor
| basically unusable on non-onion sites, the lack of
| transparency when content-moderating the internet, etc.
|
| The space is in need of solid competitors to break the
| stranglehold they have on the internet. Whether it's the
| right combination of services, documentation, etc.
| thaumaturgy wrote:
| Tor made Tor unusable on non-onion sites. I feed a
| netfilters table with the list of exit node IPs that Tor
| publishes (https://check.torproject.org/torbulkexitlist) as
| a standard part of server deployment, and it's the single
| most effective way to reduce form and login abuse on hosted
| sites. I like the idea of Tor, but there's no denying that
| it's a huge source of nuisances.
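|
| A minimal sketch of the exit-list filtering described above,
| assuming the published list is plain text with one IPv4 address
| per line. The table name `filter` and set name `tor_exits` are
| hypothetical, and the code only builds the text of an
| `nft add element` command; it does not fetch or apply anything.

```python
import ipaddress


def parse_exit_list(text):
    """Return valid, de-duplicated IPv4 addresses in input order."""
    seen = set()
    addresses = []
    for line in text.splitlines():
        candidate = line.strip()
        if not candidate or candidate.startswith("#"):
            continue  # skip blank lines and comments
        try:
            ip = ipaddress.ip_address(candidate)
        except ValueError:
            continue  # skip anything that is not a bare IP address
        if ip.version == 4 and str(ip) not in seen:
            seen.add(str(ip))
            addresses.append(str(ip))
    return addresses


def nft_add_command(addresses, family="inet", table="filter",
                    set_name="tor_exits"):
    """Build one nftables command adding the addresses to a set.

    The family, table, and set names are assumptions for illustration.
    Returns an empty string when there is nothing to add.
    """
    if not addresses:
        return ""
    joined = ", ".join(addresses)
    return f"add element {family} {table} {set_name} {{ {joined} }}"
```

| Validating each line before it reaches the firewall keeps a
| malformed or tampered list from injecting arbitrary rules.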
| shaky-carrousel wrote:
| I live in a country with censored internet. What you are
| doing is harmful. I can only hope whatever you provide is
| irrelevant enough.
| thaumaturgy wrote:
| I'm sorry. I have a colleague based out of Venezuela.
| We've had to work together to get tunnels and vpns
| configured so that he can get uncensored and secure
| internet access.
|
| But Tor is an enormous source of abusive traffic and if I
| don't filter it, then that's harmful to site owners. I'm
| being forced to choose between the needs of people that I
| know, work with, and depend on financially, and the needs
| of people in countries with issues that are far outside
| my ability to resolve. It's not a hard decision.
| justsomehnguy wrote:
| > It's not a hard decision.
|
| Depends on what you imply under 'hard'.
|
| As an IaaS provider I endured all the hurdles about that
| and ten years later - I don't care, at least not until my
| outbound bill is bigger than usual.
|
| Like some of the clients are on CentOS6, on public-facing
| machines.
| parroteal wrote:
| I'm a noob, can you give me a pointer?
|
| What kind of abusive traffic is coming through Tor and
| why do they do it?
| remus wrote:
| Say you're running an account take over script that spams
| login forms with a list of known username and password
| combos. If a website owner sees thousands of login
| attempts coming from a single IP address they're likely
| to block you to prevent abuse on their website. This is
| annoying for you as you then need to rotate your IP
| address.
|
| Using tor hides your IP address from the website and
| makes switching exit nodes very straightforward, so you
| can run your account take over script in peace.
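|
| The per-IP blocking described above can be sketched as a
| sliding-window counter. This is an illustration, not any
| particular site's implementation; the class name, the limits,
| and the explicit `now` parameter are all assumptions.

```python
from collections import defaultdict, deque


class LoginAttemptLimiter:
    """Allow at most max_attempts login attempts per IP per window."""

    def __init__(self, max_attempts=5, window_seconds=60.0):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._attempts = defaultdict(deque)  # ip -> attempt timestamps

    def allow(self, ip, now):
        attempts = self._attempts[ip]
        # Drop timestamps that have aged out of the window.
        while attempts and now - attempts[0] >= self.window_seconds:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False  # too many recent attempts from this address
        attempts.append(now)
        return True
```

| Because the counter is keyed on the source address, a client
| that switches exit nodes starts from zero each time, which is
| why this defence alone does little against Tor-routed scripts.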
| thaumaturgy wrote:
| Mainly forms -- login forms, comment forms, signup forms.
| Bots use Tor pretty heavily because it's anonymous and
| hard to block them without blocking the entire network.
| Login form abuse is mildly irritating but not a huge deal
| if you have other measures in place. Comment spam is
| annoying but there are some options that deal with it
| pretty well.
|
| But the signup spam was a headache. I didn't want to just
| blackhole Tor traffic, and tried to reduce the abuse with
| other tools, including some custom stuff. The final straw
| was a customer's small business site that had a MailChimp
| or Constant Contact signup form. Those vendors want you
| to embed their code by default to render the form, so you
| have less control over the form itself. There were
| workarounds, but they all sucked.
|
| Tor bots would sign up email addresses through this
| newsletter form, and then I'd have to go through and
| manually scrub them before newsletters went out, or the
| service would penalize my client for too many
| bounces/unsubscribes/complaints. Very nearly 100% of the
| abuse on that particular form came from Tor IPs.
|
| I do not want to spend my limited time on this Earth
| manually sorting out bots from humans because of one
| particular network. Blackholing Tor made that problem
| disappear immediately.
|
| VPNs are dime-a-dozen now, cheap VPSs are available from
| lots of vendors, there's Wireguard, there's ssh, a clever
| person could even set up Apache or nginx as a forward
| proxy with ssl from LetsEncrypt. Tor is well over 90%
| abusive traffic (https://blog.cloudflare.com/the-trouble-
| with-tor/). This is a Tor problem, not a me problem.
| There are better alternatives available.
| judge2020 wrote:
| https://blog.cloudflare.com/the-trouble-with-tor/
|
| > Based on data across the CloudFlare network, 94% of
| requests that we see across the Tor network are per se
| malicious. That doesn't mean they are visiting
| controversial content, but instead that they are
| automated requests designed to harm our customers. A
| large percentage of the comment spam, vulnerability
| scanning, ad click fraud, content scraping, and login
| scanning comes via the Tor network. To give you some
| sense, based on data from Project Honey Pot, 18% of
| global email spam, or approximately 6.5 trillion unwanted
| messages per year, begin with an automated bot harvesting
| email addresses via the Tor network.
| Zak wrote:
| There are probably more sophisticated options that would
| solve your problems than simply blocking it.
| plumeria wrote:
| Is using CAPTCHAs one of those?
| judge2020 wrote:
| Such as?
| cowtools wrote:
| The answer depends on the type of service you host. I
| don't know what you need to do, but I do know that
| filtering IP space is merely security-by-obscurity, it is
| a cheap and broken solution to the hard problems of sybil
| resistance. If you need IP filtering to operate on a day-
| to-day basis, then the security of your service is
| fundamentally broken.
|
| Tor users do not have any special properties over clear-
| net users besides low accountability for their IP space.
| There are other ways to acquire this type of setup that
| don't involve broadcasting a public list of known exit
| nodes as an act of good faith. Any sophisticated attacker
| will be able to easily get ahold of the IP space and
| bandwidth they need to do their work, whether it's
| through a botnet or simply because they operate out of
| some less-accountable country like China or Russia.
|
| IP filtering: now you have two problems!
| thaumaturgy wrote:
| This is why I'm strongly against spam filtering for
| email. Spam filters are fundamentally security-through-
| obscurity. I mean, they don't protect your email from
| targeted bombing attacks or phishing. If you need spam
| filters to operate your email on a day-to-day basis, then
| the security of your email is fundamentally broken.
|
| /s, obviously, I hope.
|
| Blocking Tor isn't a security measure, it's a nuisance
| reduction measure.
| plumeria wrote:
| How often is the list of exit nodes updated?
| thaumaturgy wrote:
| Daily, I believe. I don't have the file git-controlled.
| That would be a good idea, though.
| andrewnyr wrote:
| there are many solid competitors: Amazon, Fastly, Akamai,
| Imperva to name a few
| wahnfrieden wrote:
| Bunny
| thaumaturgy wrote:
| Just 10 minutes ago, I got the following email from a
| housemate (I'm not home at the moment):
|
| > _The past few weeks I've been getting tons of redirects to
| verify my humanity before being allowed to view a webpage.
| Usually I just have to click the box that says human, not
| find all the ladders in a photo. SoFi is doing it every
| single time I log in. Petco, too, along with others who are
| more sporadic. This is happening with and without uBlock on.
| Same browser I've always used. ..._
|
| SoFi and Petco both use Cloudflare. I do exactly zero web
| crawling / scraping / abusive anything from my home
| connection.
|
| I'm noticing a recent increase in volume of complaints about
| Cloudflare's human verification filter. I'm starting to
| wonder if they touched a dial.
|
| I had already started pulling some infra back from Cloudflare
| after their last appearance in the tech news cycle. Now I've
| got an additional reason to continue doing that.
| patrec wrote:
| > I had already started pulling some infra back from
| Cloudflare after their last appearance in the tech news
| cycle.
|
| What triggered your reaction? That they terminated a
| customer with zero notice?
| tarakat wrote:
| You're looking at it all wrong. From Cloudflare's point of
| view, this kind of blocking is a _feature_. Anyone doing
| legitimate web crawling, or offering alternative web services
| such as Starlink, now needs Cloudflare's permission.
|
| Essentially, for a broad class of web-based businesses, they
| have made themselves gatekeepers. I'm sure they'll find a
| profitable use for this position. Charging outright would
| look bad, but investing in businesses that just happen to not
| run into Cloudflare-based trouble, but whose competitors
| do...
| tomxor wrote:
| I'm familiar with that perspective, and biased towards
| it... Cloudflare is certainly in such a position, but they
| are a relatively young company (for their size and reach)
| and I've seen good things come from them.
|
| I'd guess the intent is unlikely to be anti-competitive or
| monopolistic, just over-aggressive. However regardless of
| intent their position does cause an absence of market
| forces to put pressure on fixing such issues - Similar to
| how it's become acceptable to have downtime when it's on
| AWS, because "everyone is affected".
| O__________O wrote:
| They do have a threat score
|
| https://developers.cloudflare.com/firewall/recipes/block-ip-...
|
| I was surprised to learn Cloudflare was born out of Project
| Honeypot, so I am guessing Cloudflare does share data with
| them:
|
| https://www.projecthoneypot.org/cloudflare_beta.html
| [deleted]
| elcomet wrote:
| FYI you're responding to the cloudflare CTO
| trasz wrote:
| It's naive to assume Cloudflare CTO would not be lying if
| beneficial to him or Cloudflare.
| elcomet wrote:
| I don't assume anything. The previous comment was just
| trying to teach something about cloudflare to its CTO
| nemothekid wrote:
| I wonder if HN posters have ever held a job before. Can
| you explain why it's beneficial for Cloudflare to block
| legitimate users? Why is the simplest explanation
| "Cloudflare just hates this one user in particular?"
| lmm wrote:
| Well, apparently they scared this user into installing
| their browser extension, so it sounds like this incident
| was a win for them.
| Veen wrote:
| It's even more naive to assume Cloudflare's CTO would
| tell lies that can be trivially shown to be untrue.
| trasz wrote:
| How would you show they are untrue? Ask? :-D
| pessimizer wrote:
| Don't use an assumption of someone's superiority solely
| based on their job title as a justification for the
| silencing of the disagreement of others.
| d2wa wrote:
| > That's not what Bandwidth Alliance is at all. It's about
| reducing or eliminating egress fees between a cloud provider
| and Cloudflare. Not sure where the idea that it's about sharing
| IP reputation data comes from.
|
| It comes from the Cloudflare blog.
| https://blog.cloudflare.com/cleaning-up-bad-bots/
|
| There's a support page about it too.
| https://developers.cloudflare.com/bots/get-started/free/
| jgrahamc wrote:
| I need to look into that. Thanks for pointing it out. I had
| totally forgotten about that post.
|
| Edit: team tells me this idea never got off the ground. Did
| talk with some potential partners (which did NOT include
| Google) but didn't happen. So if Google was throwing CAPTCHAs
| it wasn't because of our IP reputation.
| d2wa wrote:
| Dear John. What am I -- as a normal human being/end-user --
| supposed to do in this situation? People can't do anything
| without any information about why they're blocked. Who do
| you contact? Where do you go? What to do? The challenge
| page doesn't help the end user understand why this is
| happening to them. It's okay if you only see it for two
| seconds. But the page stays on screen for over a minute.
| When this happens for every website -- what do you do?
| You'd be furious if this had happened to you. I'm just trying
| to read my online comics and look up some stuff about some
| interests and hobbies. It reduced my quality of life/sanity
| for a week. The last two days, I started worrying that this
| was going to be the new normal. I even looked into swapping
| ISP to get a new IP address.
|
| PS: I love all the innovation and engineering stuff you
| guys regularly share on the Cloudflare blog. It's [almost]
| always an interesting read. Even though I'm no fan of the
| massive centralization your company has caused.
| JohnFen wrote:
| > People can't do anything without any information about
| why they're blocked. Who do you contact? Where do you go?
| What to do?
|
| This is the most serious problem with all of the major
| companies these days. Cloudflare, Google, Apple, etc.
| When you get on their "bad side", you're just screwed.
| You'll never even know what got them mad at you, and
| there's nothing you can do to recover.
|
| The only reasonable way to deal with this is to avoid
| them all to the greatest extent possible. You have no
| control over whether or not you deal with Cloudflare,
| unfortunately, which makes them the worst of the lot.
| adammartinetti wrote:
| > It's okay if you only see it for two seconds. But the
| page stays on screen for over a minute.
|
| That doesn't sound right. You shouldn't see a loading
| page for over a minute. If you're open to providing more
| details privately I'd love to help troubleshoot. You can
| drop me an email at amartinetti @ cloudflare.
| jgrahamc wrote:
| Once upon a time Matthew made us set the IP reputation of
| every Cloudflare office to bad so that we experienced the
| worst case scenario. Helped a lot.
|
| I don't understand why you saw one minute block screens.
| That's not right. Should be seconds.
|
| I'm talking with the team about your other points.
| tinus_hn wrote:
| The main problem of course, and it isn't limited to
| Cloudflare and I won't pretend to have the solution, is
| that if you are caught in this kind of web, you have no
| recourse but go public and hope the spotlight lands on
| you. For every problem we see in an upvoted post there's
| tons that nobody sees.
| northwest65 wrote:
| What about answering his actual question?
| easrng wrote:
| I haven't been getting challenges that last that long,
| but I have noticed that the redesigned "security check"
| challenge pages with the spinner do seem much slower than
| the old design with the loader that was made of 3 orange
| dots.
| d2wa wrote:
| I edited and added a second link to a support page that
| mentions it too.
| jgrahamc wrote:
| Thanks. I'm talking with the team.
|
| Edit: see comment above.
| cvwright wrote:
| You block this guy from the internet for a week --- for no
| apparent reason --- and then you come in here with a nitpick
| about how another related system works?
|
| Really?
| judge2020 wrote:
| The point is that Cloudflare does not beam IP reputation data
| to Google. If Google and CF are blocking this IP separately,
| what's the chance there's some malicious device or hacked IoT
| device on the network, participating in DDOS attacks or
| unauthorized vulnerability scanning of random websites?
| zinekeller wrote:
| Yeah, if for example Spamhaus (which both Cloudflare and
| Google consult) has detected that a subnet is bad then that
| could be the cause.
|
| Still, it doesn't excuse Cloudflare that there's no redress
| if you are caught on a block or even a clue on what you can
| do to reduce it (especially that Spamhaus do have redress
| procedures).
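|
| For readers who want to check a DNS-based blocklist themselves,
| the conventional lookup (RFC 5782) reverses the IPv4 octets and
| appends the list's zone. The sketch below only builds the query
| name; `zen.spamhaus.org` is used as an example zone, and whether
| any given provider consults it is not something this code shows.

```python
import ipaddress


def dnsbl_query_name(ip, zone="zen.spamhaus.org"):
    """Build the DNS name to query for an IPv4 address in a DNSBL zone."""
    address = ipaddress.ip_address(ip)
    if address.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    # DNSBLs expect the octets in reverse order, as in reverse DNS.
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return f"{reversed_octets}.{zone}"
```

| Resolving the resulting name with any DNS tool gives the answer:
| by convention an address in 127.0.0.0/8 means "listed", and a
| name that does not exist means "not listed".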
| cvwright wrote:
| Fair point
| pessimizer wrote:
| According to another comment, it's a wrong point:
| https://blog.cloudflare.com/cleaning-up-bad-bots/
|
| > Once enabled, when we detect a bad bot, we will do three
| things: (1) we're going to disincentivize the bot maker
| economically by tarpitting them, including requiring them
| to solve a computationally intensive challenge that will
| require more of their bot's CPU; (2) for Bandwidth Alliance
| partners, we're going to hand the IP of the bot to the
| partner and get the bot kicked offline; and (3) we're going
| to plant trees to make up for the bot's carbon cost.
| judge2020 wrote:
| I'm pretty sure this was for a situation like
| Digitalocean themselves hosting a bot, but such IP
| sharing very well might be currently (ab)used by
| partners, if it's happening here.
| jgrahamc wrote:
| Yeah. I'm looking into that.
| stefan_ wrote:
| A wrong nitpick, even! Way to look like the asshole.
| noasaservice wrote:
| Given their business model is "Protect DDoS'ers (booters) so
| they can DDoS sites so Cloudflare can sell DDoS-prevention
| services", I wouldn't trust them one whit in doing the right
| thing.
|
| And frankly, if you want to dig deeper, just look at who they
| have no problem having as their free clientele.
| tshtf wrote:
| Not sure why this is getting downvoted, it's completely
| factual.
| cma wrote:
| It's like finding the worst videos on youtube and saying
| that's their business model.
| acdha wrote:
| It makes a very broad claim which makes it sound like an
| extortion racket but doesn't have anything to back it up.
| I would bet that if it included some evidence it would
| fare much better. For example, they have a ton of large
| organizations which are customers. The very first
| question the average reader is going to have is whether
| it's really the case that these sites are predominantly
| attacked by booter services which use Cloudflare for
| hosting? That seems unlikely and as a general rule here the
| broader the claim the more people are going to expect you
| to show that you did your homework first.
| [deleted]
| gusgus01 wrote:
| The claim was discussed in this post:
| https://news.ycombinator.com/item?id=32709329
|
| Basically DDOS booters use Cloudflare to protect their
| websites from competitors, since Cloudflare is one of the
| best. The same people Cloudflare is protecting (and
| claims to do so on an ethical neutrality basis) are
| furthering the need for Cloudflare to exist.
| acdha wrote:
| Note that I'm not saying whether or not this is true,
| only that a comment which links to something like that
| will generally fare better than one which begs the
| question.
| kevingadd wrote:
| I'm used to getting assaulted by Cloudflare's browser check
| interstitials along with random Cloudflare and Google CAPTCHAs
| because (presumably) I run Firefox and an ad-blocker instead of
| vanilla Google Chrome. It's already tremendously inconvenient to
| wait multiple seconds on many page loads and click 20 bicycles, I
| can only imagine how infuriating it would be if every page load
| started taking 60 seconds because your IP ended up on some random
| algorithmic blacklist....
| 20after4 wrote:
| I use firefox and an ad blocker and I don't see these CAPTCHAs
| ( except for a few rare instances that I can recall). Something
| else must be going on to get you flagged.
| leonfs wrote:
| If you haven't done anything, someone else might have. Check your
| router logs for strange devices and activity in your network,
| also check your machine/s for malware.
| d2wa wrote:
| (Author here.) Plenty of logging of outgoing connections and
| DNS. Nothing out of the ordinary.
| Jamie9912 wrote:
| Is your IP address listed on https://www.abuseipdb.com/ or
| any other spam blocklists?
| NelsonMinar wrote:
| Cloudflare is a regular problem for Starlink users. We're on
| CGNAT so users share IPv4 addresses. I see CAPTCHAs when using
| Starlink ten times as often as on my other ISP. I don't think it
| actually breaks things the way this article describes, it seems
| like a gentler behavior, but it's annoying.
|
| A few months ago I got on Akamai's naughty list (with my other
| ISP) for some very light automated website downloading. That was
| a straight block with HTTP errors and I had to use a proxy to
| access the Web. It cleared up after a few days.
|
| The lack of any user feedback or support for this situation is
| really annoying. Reminds you how much power the CDNs have. It'd
| be really bad if loading websites got as difficult as sending
| email through all the layers of spam filtering.
| Syonyk wrote:
| > _Cloudflare is a regular problem for Starlink users. We're
| on CGNAT so users share IPv4 addresses. I see CAPTCHAs when
| using Starlink ten times as often as on my other ISP. I don't
| think it actually breaks things the way this article describes,
| it seems like a gentler behavior, but it's annoying._
|
| I've been noticing this too, and it's why Starlink remains my
| secondary ISP/bulk transfer connection. If I had to drop one
| connection, I'd drop Starlink for this reason alone.
|
| There are some sites that I simply can't browse, and it's not
| Cloudflare errors, either. Lowes, in particular, simply returns
| error pages for anything but the main landing page on a regular
| enough basis. Of course, my observed public IP changes so it's
| not consistent, but it's genuinely annoying.
| somedude895 wrote:
| > If I had to drop one connection, I'd drop Starlink for this
| reason alone.
|
| Why are you using Starlink at all if you have other options?
| Syonyk wrote:
| Because my other connection is a 25/3 WISP link that mostly
| doesn't. I generally see about 5/1 in the evenings, if
| that.
|
| I've had several area WISP connections, as there's no wired
| infrastructure to my area, and they vary in quality. I work
| full time remote, so I need two connections as a general
| habit - I can work with one, but when that one is down for
| a week straight, I have problems. I like being able to fail
| over.
|
| I typically keep one connection for "interactive" traffic,
| and one for "bulk transfer/failover" - things like my local
| Ubuntu repo mirror, offsite backup traffic, etc. And I can
| fail to it if needed, which I do often enough.
|
| On a good day, Starlink is far better than my WISP
| connection, and I have some machines routed out it
| persistently. On a bad day, I can't hit much from it,
| because that particular public IP has been blocked from
| large parts of the internet. It's very hit and miss, and
| overall bandwidth has definitely dropped from the early
| days, though reliability of getting packets where they need
| to go is drastically improved.
| cma wrote:
| > I've been noticing this too, and it's why Starlink remains
| my secondary ISP/bulk transfer connection. If I had to drop
| one connection, I'd drop Starlink for this reason alone.
|
| Could cloudflare legally charge them a bribe to captcha their
| users less? It isn't good to have a company in this position
| of power if so.
| diebeforei485 wrote:
| Cloudflare said they're working on this-
| https://blog.cloudflare.com/eliminating-captchas-on-iphones-...
| ThatPlayer wrote:
| I feel like Starlink could at least partially mitigate this by
| supporting IPv6. T-mobile US supports IPv6, and I hardly notice
| this as an issue on my phone. Or the time my work ran the
| business over a 4G mobile while waiting for ISP install.
| causi wrote:
| What archival tool were you using? I've been looking for a
| replacement for HTTRACK forever.
| NelsonMinar wrote:
| A combination of shotscraper and metascraper; really more web
| previews than archives. And in a single thread, to different
| hostnames, maybe one every 10 seconds? Honestly surprised
| Akamai or anything even noticed. I fake my user agent now,
| lesson learned.
| justoreply wrote:
| But any automated tool won't work. I have a similar problem
| with my self hosted feed reader, my vps hosting ip doesn't have
| 100% reputation with Cloudflare and I can't download some feeds
|
| Edit: spelling
| btdmaster wrote:
| "The data subject shall have the right not to be subject to a
| decision based solely on automated processing, including
| profiling, which produces legal effects concerning him or her or
| similarly significantly affects him or her."
|
| However, this does not apply if:
|
| "is necessary for entering into, or performance of, a contract
| between the data subject and a data controller;"
|
| Cloudflare would therefore perhaps claim that this is
| "necessary".
| grishka wrote:
| Here's a handy list of correct uses for IP addresses:
|
| 1. Packet routing
|
| In other words, I wish services like Cloudflare were made
| illegal.
| scarface74 wrote:
| Notice that he suspects that some of the problems with podcast
| rss feeds and assets that can't be captcha confirmed may be
| caused by websites who are on the free tier and that don't have
| the ability to specify that some subdomains shouldn't be blocked
| by captchas.
|
| I have absolutely no sympathy for website owners who are
| depending on a free service.
| ritcgab wrote:
| What is Cloudflare? The answer is simple - the biggest MITM on
| your Internet traffic.
| joshfraser wrote:
| If this happened to me, the first thing I would do is switch to
| using a VPN. In my experience, Google is far more likely to throw
| up CAPTCHA challenges to VPN users. I wonder if this is what
| happened to the OP.
| superkuh wrote:
| Daniel Aleksandersen of ctrl.blog has absolutely no leg to stand
| on here. He is a proponent of this kind of algorithmic blocking
| for weird browsers and even implemented it on his own site and
| argued _for_ it. https://www.ctrl.blog/entry/detect-non-browser-
| form-submissi...
|
| It's only after it happened to _him_ that now he's suddenly
| against it. Until he removes the same type of blocks from his own
| website I have absolutely no sympathy for him.
| bergwerf wrote:
| From the link you mentioned:
|
| > Bots often mimic the User-Agent of a common browser, but the
| version numbers used in the bots rarely change. Over time they
| drift farther and farther behind until a point (maybe two-year-
| old versions) where you can safely block them without
| inconveniencing legitimate users.
|
| This supports the idea that browsers are subject to constant
| change and everyone should be forced to come along (rather than
| respecting and supporting standards). I have a Chromebook that
| stopped receiving updates some years ago (thank you for your
| very safe and sustainable product Google!), his heuristic would
| literally block me.
| ReptileMan wrote:
| Doesn't your Chrome app update? Never used a Chromebook. Just
| asking.
| phreack wrote:
| Even if that were the case (which we can debate), him being
| wrong before does not prevent him from being right now. Being
| de facto banned from the common internet due to centralization
| is absolutely scary.
| ranger_danger wrote:
| superkuh wrote:
| I completely agree. I am against Cloudflare and the
| centralization it implies 100%. I never use it for sites I
| develop.
|
| I just have no sympathy for Daniel since up until just now he
| was trying to get everyone to do this.
| scarface74 wrote:
| CloudFlare allows website hosts to have much finer-grained
| control that would have solved many of these problems - _if
| they pay for it_. I see no problem with this.
| dmix wrote:
| The hosts aren't blocking him though, it's Cloudflare.
|
| > Just about every website I visited from my home
| internet connection would result in a challenge page.
| scarface74 wrote:
| Cloudflare is blocking him because the hosts didn't
| configure Cloudflare to not use captcha for sub domains
| that host non browser traffic like podcast RSS feeds.
| That was his theory.
|
| That capability is only available for paid CloudFlare
| plans.
| bashinator wrote:
| It's almost as though sufficiently large communications
| providers should be regulated as utilities.
| daenney wrote:
| Burn the witch!
|
| Let's read through that page for a second though:
| Drop support for obsolete HTTP versions
|
| Doesn't seem like that's going to cause much issue for any
| legitimate client from the past 10-20 years. He only recommends
| blocking HTTP 0.9/1.0, which fair enough Append
| a #hash to the form's action URL
|
| Hah. Clever man. I don't see how this is going to stop any
| legitimate user from loading your website or submitting the
| form, but I can see how it might frustrate bots.
| Include a hidden prefilled form field
|
| This is just standard practice to mitigate CSRF.
| Verify the Host and Origin request headers
|
| Yes. You should be doing that. Set a test
| cookie and verify it gets included in the submission
|
| Another CSRF trick. Swap the name attributes in
| the name and email fields
|
| This one's a little user hostile to folks who use assistive
| devices like screen readers. But still won't prevent you from
| accessing the site in the first place. Verify
| the POST/Redirect/GET (PRG) chain
|
| As noted by the author, might cause some issues but again,
| won't stop anyone from loading your website.
| Block ancient versions of common browsers
|
| Alright please just don't do this. UA blocking is gross and
| might prevent access through specialist software. But he also
| calls this out himself. I strongly discourage
| you from blocking or discriminating against unknown or uncommon
| browser User-Agent request headers
|
| All in all, with the exception of UA blocking I don't see how
| any of these mitigations would result in users not being able
| to access said website, or having their loading times
| drastically increased.
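|
| As a rough illustration of three of the checks discussed above
| (the hidden prefilled field, the Host/Origin comparison, and the
| swapped name attributes), a minimal Python sketch might look
| like the following. The HMAC-derived token, the field name
| "token", the SECRET value, and the looks_like_bot helper are all
| assumptions made for this example; the article does not say how
| its own checks are implemented.
|

```python
import hashlib
import hmac
from urllib.parse import urlsplit

SECRET = b"replace-me"  # hypothetical server-side secret


def make_token(form_id: str) -> str:
    """Value for the hidden prefilled field, derived from the secret."""
    return hmac.new(SECRET, form_id.encode(), hashlib.sha256).hexdigest()


def looks_like_bot(headers: dict, fields: dict, form_id: str = "comment") -> bool:
    """Return True if a form submission fails any of the cheap checks."""
    # 1. The hidden prefilled field must come back unchanged.
    token = fields.get("token", "")
    if not hmac.compare_digest(token.encode(), make_token(form_id).encode()):
        return True
    # 2. Origin, when present, must match Host. A missing Origin is
    #    tolerated because some older browsers never send it.
    origin = headers.get("Origin")
    if origin is not None and urlsplit(origin).netloc != headers.get("Host"):
        return True
    # 3. The name attributes are swapped: the field *named* "email" holds
    #    the visitor's name, so an "@" there suggests a bot that trusted
    #    the attribute name instead of the visible label.
    if "@" in fields.get("email", ""):
        return True
    return False
```

| None of these checks touch the User-Agent header, so an unusual
| but real browser that submits the form as rendered would pass.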
| d2wa wrote:
| >> Verify the Host and Origin request headers
| >
| > Yes. You should be doing that.
|
| (Author here.) If I remember correctly, his browser of choice
| predates the Origin header.
| daenney wrote:
| Alright well fair enough. Looks like that's only been
| supported since Fx 70 released somewhere in 2019. So maybe
| don't do that depending on what you intend to block. But
| then again it's been 3 years also.
|
| In general though the whole tone of parent of "I am owed
| access to someone else's computer system on my terms and my
| terms alone" just doesn't jibe with me. It's also not remotely
| comparable to Cloudflare's approach of sitting in the
| middle and then appropriating end-user compute resources
| without their consent to fuel their business.
| ceejayoz wrote:
| > This one's a little user hostile to folks who use assistive
| devices like screen readers.
|
| As long as you're using a <label> or aria-label attribute,
| that shouldn't be an issue.
| d2wa wrote:
| (Author here.) I am. There's plenty of accessibility labels
| in place. It's literally just the name attributes. No user
| ever sees this, whether they're using assistive
| technologies or not. It only confuses bots that assume
| that the field named email is for the email address.
| nijave wrote:
| All that stuff is easily defeated by automated browsers
| anyway (i.e. selenium)
| mh- wrote:
| Yes, but those automated browsers are much more expensive
| to operate than simple HTTP clients _pretending_ to be
| browsers.
|
| It's an arms race/defense-in-depth situation. If someone
| truly wants to automate your site in a _targeted_ fashion,
| and it's profitable for them to do so, you'll have to
| invest a lot more in stopping it (and decide how much of it
| is _worth_ stopping).
| Aperocky wrote:
| Even YouTube fails, with yt-dlp going as far as including an
| internal Python file that parses JavaScript and executes it.
| LinuxBender wrote:
| He's quite tame compared to me I suppose. I block anything
| that is not HTTP/2.0 which _currently_ knocks out all the
| bots and all crawlers except Bing. But I just have hobby
| sites these days. Nobody would notice or care if my sites
| went offline.
|
| Using NGinx as an example:
| if ($server_protocol != HTTP/2.0) { return 403 'Nope'; }
|
| Another thing I have found useful to drop some bots is to
| become invisible to them. Many of the poorly written scanning
| tools do not properly set MSS for reasons I still don't
| understand. I use this to my advantage.
|
| Using IPTables as an example:
| /sbin/iptables -t raw -I PREROUTING -i eth0 -p tcp -m tcp
| --tcp-flags FIN,SYN,RST,ACK SYN -m tcpmss ! --mss 420:16384
| -j DROP
|
| Any TCP packets setting a very low or high MSS or missing MSS
| will be silently dropped. I drop about 35K packets per host
| per day on average. This also drops hping3 floods.
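|
| To make the effect of that rule concrete, here is a small Python
| model of the decision it makes for each incoming SYN. It is only
| a sketch of the matching logic, not a packet filter; the
| function name should_drop_syn and its parameters are made up for
| this example, and a missing MSS option is represented as None.
|

```python
from typing import Optional


def should_drop_syn(mss: Optional[int], low: int = 420, high: int = 16384) -> bool:
    """Model of `-m tcpmss ! --mss 420:16384 -j DROP` for one SYN packet.

    Returns True when the packet would be dropped.
    """
    # No MSS option at all: the inverted match applies, so the SYN is dropped.
    if mss is None:
        return True
    # The range is inclusive on both ends; anything outside it is dropped.
    return not (low <= mss <= high)
```

| For instance, should_drop_syn(1460) is False for a typical
| Ethernet client, while should_drop_syn(None) is True for a
| scanner that never sets MSS.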
| toast0 wrote:
| > Many of the poorly written scanning tools do not properly
| set MSS for reasons I still don't understand.
|
| MSS issues attract me like a moth to flame [1], so let me
| ask some questions.
|
| It looks like this is dropping syns with MSS over 16384???
| That is indeed a pretty crazy high number. 9000ish seems
| reasonable for someone on a jumbo network without a mss
| clamping router, but above that is someone weird for sure.
|
| Under 420 seems unlikely too, but technically acceptable,
| but sure, I'd drop it. In theory, a proper OS will send
| several SYNs with MSS, then assume your server doesn't
| support TCP options and send you a SYN with no options.
| Going to take a while, but if someone legitimately has a
| mss less than 536, their internet is probably pretty junky
| anyway, so ok, seems fine.
|
| [1] I just built a browser based pmtud test site,
| http://pmtud.enslaves.us/
| LinuxBender wrote:
| You are right. I just happen to use a very safe range. If
| I didn't care about anyone using jumbo frames I could set
| the range to 1220:1536 and nearly all legit traffic would
| pass just fine. 1220 (to 13xx) for the people using VPNs
| and ip6-ip4 gateways. I just try to give really
| conservative examples so that it is less likely I break
| someone's unusual setup. Anything just over 9k is fine for
| most jumbo-frame setups.
|
| All of this said, I could set the range to 1:65535 and it
| would still drop most bots as they don't even bother to
| set MSS at all in their scans. I'm not sure which tool
| they are using.
| judge2020 wrote:
| IMO blocking bots isn't too big of a concern; the problem
| is when a dedicated attacker realizes you serve valuable
| data (in your HTML). Next thing you know, they're running
| Puppeteer or a similar remote-controlled browser to scrape
| your site. That is undesirable in itself, and the scraper
| might also overload your site/database by scraping with
| no internal parallel request limit. If you're not a startup
| with an unlimited early cloud budget, it can be costly if
| you want to handle both bot usage (including official API-
| based or scraping bots) and regular users.
| ceejayoz wrote:
| The techniques described in that article are pretty reasonable
| and shouldn't significantly impact users - swapping name/email
| fields' names won't do a thing to you. There's also a
| difference between "this one website doesn't work for me" and
| "I've been blocked from half the Internet".
| IshKebab wrote:
| To be fair there's a difference between doing it for one site,
| and doing it for a significant portion of the internet.
| phantom_of_cato wrote:
| Throwing an ad hominem is not cool.
| [deleted]
| jamespo wrote:
| None of those techniques affect normal browsing
| ufmace wrote:
| I just read it, and I don't see any contradiction here. IMO,
| he's recommending simple and direct anti-bot methods to web
| admins specifically because it's better than relying solely on
| Cloudflare etc for all bot blocking. He never recommends making
| un-appealable access control decisions based on third-party
| lists, and specifically recommends caution on methods that
| might potentially impact innocent users. Seems perfectly
| consistent to me.
| ldoughty wrote:
| I don't know the author or his reputation, but his suggestions
| that you linked are (in my opinion) standard actions for any
| dev/server admin getting spammed by their forms... And the
| suggestions really only impact malicious actors accessing your
| website from a script... Virtually none of those would be an
| issue for any browser made in the last 15-20 years, or headless
| browsers, but would break rudimentary scripts like entry level
| hackers/spammers might use.
|
| He also specifically called out CAPTCHA as user-hostile.
| superkuh wrote:
| I guess like ctrl.blog you can't grasp the significance of
| the issue until it happens to you. My Firefox fork is
| definitely blocked by his algorithmic "bot" detector. Just
| because your browser isn't doesn't mean it only blocks bots.
|
| False positives happen. They happen a lot more than you
| think. And they are a serious problem. Even more serious when
| it's Cloudflare, but arguing for everyone to implement these
| algorithmic blocks "that won't inconvenience users"
| individually, taken to its logical end, does the same.
| [deleted]
| ldoughty wrote:
| I don't see the reason for the personal attack.
|
| The blog post also calls out that you should not block
| based on user agent.
|
| If a form post doesn't respect the action property having a
| #, or the fact that the name/email HTML names might be
| reversed (while the type is correct, and the user-displayed
| values are correct), or doesn't include hidden HTML form
| fields, which have been standard since ~97, back when I made
| my first few websites? I certainly would agree that those
| are likely bots.
|
| Again, apparently this person has some hateful following,
| but I don't appreciate you lumping me into this hatred for
| agreeing with his statements on this one particular issue.
| superkuh wrote:
| You said, "And the suggestions really only impact
| malicious actors accessing your website from a script."
| and that was false. Since you didn't have experience
| being blocked you couldn't know. Not till it happens to
| you. I don't think pointing this out is a personal
| attack. It's just the way people work. People don't
| believe things are a problem until they become a problem
| for them.
|
| You and others can keep quoting the legit and clever ways
| to mitigate bot spam but if you ignore the false
| positives the other checks create it kind of defeats the
| point.
| scarface74 wrote:
| > I strongly discourage you from blocking or discriminating
| against unknown or uncommon browser User-Agent request headers.
| The web is weird and we as developers shouldn't discourage it.
| [deleted]
| Melatonic wrote:
| As much as I like Cloudflare now, this is why long-term monopolies
| (not saying they are one now) are bad.
| bastardoperator wrote:
| Has he tried unplugging the router for 15 minutes and plugging it
| back in? I jest but I know Comcast and Spectrum will both issue a
| new IP address in that timeframe.
| d2wa wrote:
| (Author.) My ISP only rotates IPs when they reboot their
| central equipment. Not enough to do it on my end.
| bornfreddy wrote:
| With some ISPs, they will issue a new IP if you change the
| router's (WAN) MAC address. Might be worth a try next time
| (crossing fingers you don't need it).
| bastardoperator wrote:
| This is what I've always seen too. I've never seen a
| residential ISP that allocates static DHCP addresses; they
| typically allocate leases in days, which is why many people
| can maintain a leased address for months on end. Once you go
| offline though, all bets are off. Every ISP can determine
| if the subscriber is disconnected and if they are, they're
| going to reallocate your address. To your point, once the
| MAC address is changed, they have to issue a new IP address
| because using the logic posted above, the other address is
| allocated to a different MAC.
| dmix wrote:
| IP bans by modern services like CF can't be solved that easily
| in my experience.
| bastardoperator wrote:
| Clearly CF has a crystal ball /s.
|
| Once the IP address I don't own is released and assigned to
| some other router how do you think CF determines the new IP
| address for the individual/home? Unless this person is
| running the CF Dynamic DNS service which gives CF the IP
| address, I'm not sure CF would have any reasonable validation
| techniques to determine who is what given the size of
| residential networks.
| aendruk wrote:
| Cookie on their validation page? Browser fingerprint
| hopping IPs in the same block?
| dmix wrote:
| Bingo
| bastardoperator wrote:
| So I've turned cookies off and switched to my iPad to
| browse the internet for the evening, they have no
| fingerprint, and no cookie... now what?
| dmix wrote:
| Are you on a different IP block? ISPs sometimes just
| switch the last number.
|
| I had to use a VPN (a whole new IP) and a clean Chrome
| install to bypass one of those "IP blocks" which was
| combined with fingerprinting.
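|
| The "same block" point can be checked with Python's ipaddress
| module. This is only a sketch: the same_block helper is made up
| for this example, and the /24 prefix is an assumption standing
| in for "only the last number changed"; how Cloudflare or any
| other service actually groups addresses is not known here.
|

```python
import ipaddress


def same_block(old_ip: str, new_ip: str, prefix: int = 24) -> bool:
    """Return True if new_ip falls inside the /prefix network of old_ip."""
    # strict=False lets us build the network from a host address.
    old_net = ipaddress.ip_network(f"{old_ip}/{prefix}", strict=False)
    return ipaddress.ip_address(new_ip) in old_net
```

| With documentation addresses, same_block("203.0.113.7",
| "203.0.113.200") is True, so a reputation keyed on the block
| would follow the user, while an address from a VPN in
| 198.51.100.0/24 would not match.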
| bbu wrote:
| I think cloudflare updated their bot detection algorithms because
| we had multiple customers who complained that they get
| challenged. I verified that they got a bot score of 1. As usual,
| CF support is not that helpful...
| synthetigram wrote:
| Reputation systems should be based on /abuse/, not on automation.
| I also ended up on the naughty list for running an archival
| scraping program. Trying to preserve part of the Internet is
| apparently against the rules. It's really a shame because my code
| honors rate limits, doesn't spam, and is completely docile.
| socialismisok wrote:
| Is it plausible some ISP shared some IP address that was on
| Cloudflare's list of suspicious IPs, or that some IoT device on
| this person's network created a burst of suspicious traffic?
|
| I get that this sucks for the end user, but I wonder how much we
| should blame Cloudflare vs the wider systemic challenges of
| managing DDoS protection on the web.
| laxis96 wrote:
| I believe that might happen, but then I also believe it's the
| ISP's responsibility to ensure that its IP addresses are kept
| clean
| socialismisok wrote:
| For sure, the point I'm making is that there's a multi party
| transaction here, with systemic complexity. Makes it hard to
| pin responsibility on just Cloudflare (or just the user or
| just the ISP, etc).
| yjftsjthsd-h wrote:
| Cloudflare is the one blocking a user based on things that
| aren't their fault; I'm happy to blame them.
| socialismisok wrote:
| That's fine, but you are ignoring the broader picture if
| you do. You've correctly identified a detail, but haven't
| placed that detail in context.
| yjftsjthsd-h wrote:
| I'm not ignoring the context, I'm saying that it's
| irrelevant. Cloudflare made the choice to block real
| people based on factors outside of their control, and
| then to market that product as a panacea; they don't get
| to pass the buck, doubly so when they don't expose enough
| information to let other people fix the things they
| broke.
| kazinator wrote:
| > _For whatever reason, I must have done something that angered
| Cloudflare_
|
| I'm guessing: having an IP address close to (or outright reused
| from and thus identical to) someone malicious, whom you know
| nothing about.
___________________________________________________________________
(page generated 2022-09-20 23:01 UTC)