[HN Gopher] Arc - A P2P CDN that runs in the browser
___________________________________________________________________
Arc - A P2P CDN that runs in the browser
Author : sansui12
Score : 88 points
Date : 2021-09-05 10:26 UTC (12 hours ago)
(HTM) web link (arc.io)
| agentdrtran wrote:
| anyone got an invite code? looks interesting
| grun wrote:
| Of course! Shoot me an email at ansgar@arc. =]
| zimbatm wrote:
| How stable is the CDN when each node's lifetime is 5 minutes?
| The likelihood of an ongoing request being cut off is fairly
| high. Nodes are also constantly in syncing state I assume.
| grun wrote:
| Very! The lifetime of nodes is actually quite a bit longer than
| 5 minutes as the tab doesn't need to be foregrounded (which
| Google Analytics and others report as the end of the
| 'session'). That, and when a node comes online, it joins the
| network with the cache contents it had filled previously. So it
| joins the network warm, not cold and needing to be filled.
| zimbatm wrote:
| Interesting.
|
| Is the cache shared between all the websites that install
| this product?
|
| What happens to an HTTP request to a browser whenever they
| close their tab, is it retried on another browser?
| caseyross wrote:
| My first reaction was "yikes, that's super unethical to use
| users' bandwidth, storage, and processing power without their
| knowledge". But after thinking about it a lot, I have to say I've
| come around.
|
| As users, we all pay for upload bandwidth, and 99% of the time
| it's sitting there idly and providing literally no value to us or
| anyone else. That's because gosh darn it, we paid for it, so
| there's no way we're gonna let anyone else use it!
|
| But, if we stop thinking in binary terms of "my property" or
| "your property", and open our minds for a minute, what's so bad,
| really, about each of us donating a few resources to make the
| shared Internet run faster for everyone? If everyone does the
| same, we're all better off, and no one is worse off since the
| bandwidth was just going to waste anyway. Furthermore, the
| storage needed is already in use, storing the exact same content,
| just for your own exclusive use instead of everyone's, and the
| additional CPU cycles needed are surely much fewer than it takes
| to run the worthless ads your browser downloads everyday.
|
| I think of it as sort of the same logic as vaccination, where we
| ask people to pay a small private cost in a way that produces
| outsized benefits for society as a whole. In particular, each
| person gets an individual benefit much larger than the cost of
| the private resources they paid.
|
| Now, if the private resources aren't practically free like
| bandwidth or ROM, but actually expensive like capped data or
| mobile phone battery life, the ethical logic starts to get a bit
| murkier. However, I think it's definitely possible to build an
| effective peer-to-peer CDN without needing to touch those kinds of
| scarce resources.
|
| The other reservation I have about this sort of project is that
| you really need to be able to trust the people running it to not
| abuse their control of the network (which really just means:
| other people's computers). That's, in my view, a much thornier
| issue than anything relating to the consumer end of the network.
|
| Let's hope the team will be able to successfully tackle both the
| technical and the human problems of a P2P CDN, and help push the
| Internet towards a more communal, sharing paradigm. Best of luck
| to them.
| lvass wrote:
| It's scary how things like this can become invisible to the user.
| WebRTC is a great concept that could really improve
| decentralization, but so incredibly easy to misuse that it's
| absurdly irresponsible that browser developers enable it by default.
| Remember how people went to jail for simply having certain p2p
| file sharing software running? All this one takes is opening a
| web page.
| leephillips wrote:
| > Remember how people went to jail for simply having certain
| p2p file sharing software running?
|
| No. Do you have a link?
| lvass wrote:
| https://www.reddit.com/r/torrents/comments/30jlz0/raided_aft.
| ..
|
| Similar cases though they might require downloading
| something, perhaps innocent looking:
|
| https://www.propublica.org/article/prosecutors-dropping-
| chil...
|
| https://forum.emule-project.net/index.php?showtopic=161954
|
| https://www.robertslawteam.com/criminal-defense-
| overview/art...
|
| https://mashable.com/archive/child-porn-download
| leephillips wrote:
| The story is about a raid, not jail, but point taken in any
| case.
|
| EDIT: Later he mentions being "released", so I guess there
| was some jail involved.
| [deleted]
| vmoore wrote:
| > it's absurdly irresponsible browser developers enable it by
| default
|
| And not absurd for people who disable it because it leaks your
| real IP when using a VPN
| vbezhenar wrote:
| What's so irresponsible about it? Any website can waste your
| download/upload/CPU/RAM/disk. WebRTC does not make anything
| worse, but rather allows for plenty of useful applications,
| some of them might help to reduce centralization which is the
| plague of the web IMO.
| lvass wrote:
| >waste your download/upload/CPU/RAM/disk
|
| Those are present here but not the largest threat. This
| service apparently makes you download ~300MB you have no idea
| what is and SHARES it in the background. What if it is child
| abuse content? This isn't even a "think of the children"
| argument, it can have very real consequences to anyone
| cluelessly browsing the web unlike your usual p2p network.
|
| WebRTC has additional insecurities and deanonymization
| potential like VPN leaks. I'm fairly certain the only reason
| so many people have it enabled is a certain large browser
| distributor happens to own some services that use WebRTC to
| reduce server bandwidth costs, and they want to make its
| usage seamless without displaying even a small warning about
| the risks. Perhaps they also consider deanonymization a
| feature.
| azinman2 wrote:
| So arrogant to monetize your website using your users in such a
| way. It's one thing to show me ads, it's another to use me as a
| node in a distributed CDN for which I gain no value, and everyone
| else does. It's not even some kind of p2p situation I as a user
| opt into because I believe in the communal network.
|
| I hate this leechy behavior that has come to plague the internet
| for a while now. When it was all nerds it was glorious. Then the
| money came, and the bullies followed.
| samtheprogram wrote:
| You gain the value of whatever service you're using. It's the
| same approach of monetizing a site with ads, except instead of
| "paying" for the site with your data or eyeballs, you're paying
| with your bandwidth.
|
| I actually quite like this idea. Any site/app/etc that is free
| has to make money, at the very least to pay for infrastructure,
| and I find this a lot less invasive than surveillance, and not
| "arrogant" at all.
| [deleted]
| ROARosen wrote:
| I wonder, can't a simple `display: none` hide Arc's widget?
| grun wrote:
| As a user, yep. But doing so as a website gets your Arc account
| banned and your websites blacklisted.
| mikeiz404 wrote:
| Anyone know what the typical latency is like?
|
| I see mentions of many points of presence and claims of pops
| being close (I assume due to the size and distributed nature of
| the peers) and therefore _could_ be faster but no measurements or
| numbers in the FAQ.
| alex23478 wrote:
| > only runs when a device is connected to Wi-Fi or Ethernet.
| Cellular bandwidth is never used.
|
| This cannot be guaranteed.
|
| They seem to use the NetworkInformation API, which as of now is
| declared as experimental and straight up does not work on Firefox
| or more importantly on Safari (and therefore the whole iOS
| browser ecosystem). Apart from that, they're using IP ranges of
| mobile carriers for detection. [1]
|
| So let's say you're using iOS and a VPN while on the go. This
| will use your mobile data unless you're using an adblocker (since
| their service is only opt-in if you're using an adblocker) [2]
|
| [1] https://github.com/easylist/easylist/issues/7872 [2]
| https://github.com/uBlockOrigin/uAssets/pull/8874#issue-6142...
| sdze wrote:
| wonderful idea. What could possibly go wrong. /s
| pcr910303 wrote:
| It's an interesting idea. I did think it looked pretty scammy
| (felt like a crypto miner), but...
|
| It does feel different in that it doesn't use any user's
| bandwidth if the site doesn't monetize it and it mandates a UI
| with an opt-out button if the site does monetize it and uses the
| user's bandwidth.
|
| But I'm not sure if this is ever gonna work... will a nearby user
| device really serve the content that I need better than a
| datacenter...? There's no mention on the FAQ page of how it prevents
| a fake user from sending malicious scripts across the network as well.
| grun wrote:
| > There's no mention on the FAQ page how it prevents from a
| fake user sending malicious scripts across the network as well.
|
| All content is fragmented, encrypted, and hashed before it's
| distributed across the network. If a peer ever receives a file
| piece from a peer and that piece's hash doesn't match the
| expected hash, it's dropped along with the connection to that
| peer.
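The drop-on-mismatch rule grun describes can be sketched roughly as follows. This is a minimal illustration in Python, not Arc's actual code: the choice of SHA-256, the idea of a trusted list of expected hashes, and the names `verify_piece` and `receive_pieces` are all assumptions made for the example.

```python
import hashlib

def verify_piece(piece: bytes, expected_hex: str) -> bool:
    """Return True only if the piece's SHA-256 digest matches the expected hash."""
    return hashlib.sha256(piece).hexdigest() == expected_hex

def receive_pieces(pieces, expected_hashes):
    """Accept pieces in order and stop at the first hash mismatch.

    Returns (accepted_pieces, keep_connection). A mismatch discards the
    bad piece and signals that the connection to the peer should be dropped.
    """
    accepted = []
    for piece, expected in zip(pieces, expected_hashes):
        if not verify_piece(piece, expected):
            return accepted, False  # drop the piece and the peer
        accepted.append(piece)
    return accepted, True

# Hypothetical data: the expected hashes are assumed to come from a trusted
# source (for example the coordinating server), never from the peer itself.
good = [b"fragment-0", b"fragment-1"]
manifest = [hashlib.sha256(p).hexdigest() for p in good]
```

The check only protects the receiver if the expected hashes arrive over a channel the sending peer cannot tamper with; the thread does not say how Arc distributes them.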
| bityard wrote:
| I wonder how long it will take before arc.io appears on the
| same content blacklists as ads and cryptominers?
| vbezhenar wrote:
| https://github.com/uBlockOrigin/uAssets/issues/5771
| https://github.com/uBlockOrigin/uAssets/pull/8874 this might
| be of interest to you.
| gurchik wrote:
| So uBlock added them to the abuse list for not being opt-
| in, so the developer responds with adding opt-in _only if
| you're using an adblock_. Way to completely miss the
| intent of the rule.
| jopsen wrote:
| How does the performance compare?
| sam0x17 wrote:
| Yeah I have to say I am for this in the vein of reducing
| centralization. Yes in a way it is akin to a crypto-miner, but
| the future internet could literally be distributed in this way,
| so props to them.
| bastawhiz wrote:
| This doesn't actually decentralize anything, it relies on a
| central system to connect peers.
| sam0x17 wrote:
| Ah, well, never mind then
| true_religion wrote:
| I wish I knew how this sort of CDN was created, so I could
| sponsor one explicitly for NSFW sites.
|
| It's a large and existent market, but always one that's ignored
| because of the complexities of dealing with it. However if you
| are already an insider, initial moderation isn't so hard.
| goodpoint wrote:
| > Arc uses only a small portion of spare bandwidth, imperceptible
| CPU, 300MB of browser cache
|
| Sounds very scammy. Is this happening without user consent? What
| about privacy?
| franga2000 wrote:
| This is my first time hearing about it too, but I don't see
| what you see. Where is the scam? _What_ about privacy?
|
| It seems like a good way of monetizing a site by creating
| something with intrinsic value to others (CDN capacity) out of
| something with next to no value to users (spare bandwidth).
| Sure beats the alternatives:
|
| - ads: value=user's money that they manipulate them into
| spending. Additionally, they usually track people relentlessly
| and in turn use that data to manipulate them even better. No
| value is created, just money is transferred with extra steps.
|
| - crypto miners: the value of (PoW) crypto is net negative (see
| environmental concerns), despite being profitable for some.
| They also decrease the user experience by draining batteries
| and slowing down computers. So no real value is created and
| negative value for users.
|
| - micro-donations: they just transfer something of value
| (money) from the user to the operator. Nothing is "created", so
| users are by definition losing money. You can argue that things
| _should_ cost money, but that's a separate discussion - value
| for users is still negative.
|
| Of course I'd prefer this to be a vendor-neutral standard and
| not some private company, but none of the current "distributed
| web" solutions have gotten any serious level of adoption. This
| one actually got some attention since it's also a monetization
| platform - even if webmasters don't care about the distributed
| web idea, they still help get us closer.
| jeroenhd wrote:
| > only runs when a device is connected to Wi-Fi or Ethernet.
| Cellular bandwidth is never used.
|
| That's interesting. How does the website possibly detect that I'm
| on a cellular network when I a) use Firefox (which doesn't
| implement the NetworkInformation API) and b) use a VPN to either
| a data center or my home server? What about iOS devices using
| Apple's pseudo-Tor (because Safari doesn't expose
| NetworkInformation either), or devices using tethered WiFi?
|
| I don't believe this claim at all. This makes me take all their
| other claims with a massive grain of salt as well.
|
| There's also the massive privacy issue: other people will know
| what websites you visit by simply using the P2P system, and the
| entire thing seems to be opt-out unless you use an adblocker. That
| last part shows that the devs know of the privacy issue but have
| decided to take the practical approach of not fixing the issue
| and only doing the bare minimum to remove their website from
| blocklists.
|
| It's only a matter of time before someone will make a tool that
| enumerates all IPs in the Arc network together with the content
| they've been served. This one is going onto my Pihole's
| blocklist...
| grun wrote:
| Great questions!
|
| > How does the website possibly detect that I'm on a cellular
| network when I'm a) use Firefox (which doesn't implement the
| NetworkInformation API) and b) use a VPN to either a data
| center or my home server?
|
| Yep. An IP lookup is also done to see which AS the user is on.
| Eg an AWS IP vs a T-Mobile IP.
|
| We also do this to detect when people tether, as when tethering
| Chrome will report a Wi-Fi connection via the
| NetworkInformation API but it's Wi-Fi on top of an underlying
| cellular connection.
|
| > There's also the massive privacy issue: other people will
| know what websites you visit by simply using the P2P system
|
| All cached data is both fragmented and encrypted. When a node
| sends data to another peer, it's an encrypted fragment of a
| file and the sender doesn't know 1) what data it's sending nor
| 2) which website that data is being sent for.
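The two-signal cellular check grun describes above (the connection type reported by the browser, plus an IP lookup against carrier networks) might look roughly like this. It is a sketch under assumptions, written in Python for brevity: `CELLULAR_NETWORKS` holds placeholder documentation ranges rather than real carrier prefixes, and `may_upload` is a hypothetical name, not Arc's API.

```python
import ipaddress
from typing import Optional

# Placeholder ranges (RFC 5737 documentation networks), standing in for the
# address space of mobile carriers. Real carrier data is NOT shown here.
CELLULAR_NETWORKS = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]

def ip_is_cellular(ip: str) -> bool:
    """True if the public IP falls inside one of the assumed carrier ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in CELLULAR_NETWORKS)

def may_upload(reported_type: Optional[str], public_ip: str) -> bool:
    """Decide whether a node may contribute upload bandwidth.

    reported_type mirrors NetworkInformation.type ("wifi", "ethernet",
    "cellular", ...) and is None when the browser does not expose the API.
    """
    if reported_type == "cellular":
        return False
    if ip_is_cellular(public_ip):
        # Covers tethering: the browser reports Wi-Fi over a carrier IP.
        return False
    return True

print(may_upload("wifi", "198.51.100.7"))  # tethered phone: blocked
print(may_upload(None, "192.0.2.10"))      # no API, non-carrier IP: allowed
```

The last call is the case raised elsewhere in the thread: with no NetworkInformation data and a VPN exit address outside carrier ranges, neither signal can reveal an underlying cellular link.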
| indolering wrote:
| PeerCDN tried this and was acqui-hired by Yahoo in 2013 and Peer5
| did this for video but was acqui-hired by Microsoft. What about
| P2P CDN didn't work before?
| shacharz wrote:
| Internal sources say Peer5 were not acqui-hired ;) But we took
| a very different direction than this or PeerCDN
| grun wrote:
| PeerCDN was too early; the web wasn't ready yet.
|
| For example, to intercept network requests to serve them from
| the peer-to-peer network, a <![CDATA[ tag had to be inserted
| into the <head> of the document to block rendering of the
| subsequent document. Then, once the document had finished
| downloading, the page HTML was manually rendered so all assets
| (eg JS, CSS, images, etc tags) could be loaded via JS instead
| of the browser natively. This was both slow and resulted in
| empty white pages on load. Now? We have the Service Worker API.
| (https://developer.mozilla.org/en-
| US/docs/Web/API/Service_Wor...)
|
| I'm not as familiar with Peer5's tech stack, so I can't speak
| there. But hi Shachar!
| reitanqild wrote:
| Lots of people would go to great lengths to secure a job at
| Microsoft, so I'm not sure if it counts as a failure, just
| not a runaway success.
| dest wrote:
| I wonder what is the time to first byte for a given asset when
| the page loads for the first time. Establishing a webrtc
| connection takes several rtts to the tracker/signalling server
| and will always be longer than a plain CDN request. I think it's
| rather useful on subsequent page navigation then, when p2p
| connections are already established.
|
| What about partnering with browsers like Brave, that would embed
| it in many pages at once?
|
| Disclaimer: I work at Lumen/Streamroot, we do this for video.
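dest's time-to-first-byte argument can be made concrete with a back-of-envelope model. Everything here is an assumption for illustration: the round-trip counts, the millisecond figures, and the function names are hypothetical and are not measurements of Arc or of any CDN.

```python
def p2p_first_byte_ms(signalling_rtt_ms: int, signalling_round_trips: int,
                      peer_rtt_ms: int, peer_setup_round_trips: int) -> int:
    """Estimate first-byte time when a peer connection must be set up first."""
    signalling = signalling_rtt_ms * signalling_round_trips  # offer/answer via server
    peer_setup = peer_rtt_ms * peer_setup_round_trips        # connectivity checks, handshake
    request = peer_rtt_ms                                    # the actual asset request
    return signalling + peer_setup + request

def cdn_first_byte_ms(cdn_rtt_ms: int, handshake_round_trips: int) -> int:
    """Estimate first-byte time for a fresh HTTPS request to a CDN edge."""
    return cdn_rtt_ms * (handshake_round_trips + 1)

# Purely illustrative inputs.
p2p = p2p_first_byte_ms(signalling_rtt_ms=50, signalling_round_trips=3,
                        peer_rtt_ms=10, peer_setup_round_trips=4)
cdn = cdn_first_byte_ms(cdn_rtt_ms=30, handshake_round_trips=2)
print(p2p, cdn)  # 200 90
```

Under these made-up inputs the cold peer path is slower even though the peer itself is closer, and once the connection exists only the final `peer_rtt_ms` term remains, which matches dest's point about subsequent navigation.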
| mishafb wrote:
| Before this is loaded and completely ready, you probably load
| resources with your regular CDN
| dest wrote:
| So that you can seed it later, very probably
| baybal2 wrote:
| Gnutella was painfully slow, and error prone even during its best
| times.
| detaro wrote:
| Show HN 2 days ago: https://news.ycombinator.com/item?id=28394888
| baybal2 wrote:
| Since there is so much interest in turning a browser into a BT
| client basically for CDN purposes, check out this:
| https://github.com/webtorrent/webtorrent
|
| Unwittingly, people who put the WebRTC into the browser turned
| Chrome into world's biggest file sharing network, and now the
| Genie is out of the bottle.
| a_paddy wrote:
| Under EU Directive 2009/136/EC a user's permission is required
| before Arc can store any data in their browser. Is Arc obtaining
| that permission, or just looking for forgiveness by allowing
| users to Opt-Out?
| na85 wrote:
| >Arc uses only a small portion of spare bandwidth, imperceptible
| CPU, 300MB of browser cache, and only runs when a device is
| connected to Wi-Fi or Ethernet. Cellular bandwidth is never used.
|
| Ugh, the arrogance. To think that you are somehow entitled to use
| my bandwidth and CPU power is such a sadly typical mindset today.
|
| Yet another reason to run NoScript.
| BiteCode_dev wrote:
| And that's for each page that uses it I assume, so if you have
| 10 pages using Arc, that's not going to be imperceptible.
| grun wrote:
| Not quite. =] Arc coordinates and synchronizes across tabs
| (via an iframe). So Arc behaves identically whether you have
| one open tab with Arc or 100.
| na85 wrote:
| Aren't tabs supposed to be sandboxed?
| not_really wrote:
| Sounds like the iframe is used to glue sandboxes together
| sudosysgen wrote:
| Well, you're using CPU and bandwidth as long as you're using
| assets delivered with that CDN. Seems perfectly fair to me.
| dhaavi wrote:
| So, if I run this on a server and fill my cache, can I then
| monitor other people browsing arc.io-enabled websites?
|
| I feel the privacy implications of such a technology haven't been
| evaluated.
| grun wrote:
| Great question! All cached data is both fragmented and
| encrypted. When a node sends data to another peer, it doesn't
| know 1) what data it's sending nor 2) which website that data
| is being sent for.
| qeternity wrote:
| I have serious doubts that this is going to be remotely as
| performant as any major CDN.
| grun wrote:
| Hey guys! I'm Ansgar (https://github.com/gruns). I build Arc
| (http://arc.io/cdn).
|
| In the past, I built two of the world's largest YouTube to MP3
| converters. In doing so, I learned the hard way 1) that
| distributing content at scale globally is painful and _expensive_
| and 2) that I hate ads. So I built Arc.
|
| It's a two-sided content exchange. On one side, websites buy a
| faster, 10x cheaper peer-to-peer CDN. On the other side, websites
| make money without ads by contributing bandwidth to the peer-to-
| peer CDN. Arc connects the two, like Airbnb connects guests and
| hosts.
|
| As bandwidth capacity grows around the globe, we find ourselves
| in a world where people can share bandwidth both beneficially and
| imperceptibly. We see glimpses of that already today: Amazon
| shares bandwidth with Amazon Sidewalk, Microsoft shares bandwidth
| with Windows updates, etc. We build Arc for this world -- a post-
| adblock world -- to give sites a better, more ethical way to
| support themselves that doesn't bombard users with ads or suck up
| their personal data, and that preserves their privacy.
|
| A few notes:
|
| - For sites that use Arc's CDN, users do not use upload bandwidth
| and Arc's widget isn't displayed. It's just a faster, 10x cheaper
| CDN in one <script> tag. That's it. (See https://arc.io/faq#do-
| users-upload-content-with-just-the-cdn)
|
| - For sites that monetize with Arc, we mandate that Arc's widget
| remains visible and interactable in the lower left corner of your
| website so users can learn about Arc and, if they so desire, opt
| out. (See https://arc.io/faq#can-i-move-modify-or-hide-arcs-
| widget) Additionally, Arc never activates on cellular
| connections; Wi-Fi and ethernet only.
|
| - If you elect to opt out (two clicks in Arc's widget), you're
| opted out of all sites with Arc.
|
| Please email me if you'd like an invite code to take Arc's CDN
| for a spin: ansgar@arc.io. It's 10x cheaper than Fastly, AWS,
| Google, etc. (See http://arc.io/cdn/). And I'd love to hear your
| thoughts! Feedback is how good products become great.
| catlifeonmars wrote:
| I'm curious what your analysis of the equilibrium states of
| this model looks like. I think it's an interesting approach, but
| I'm not entirely convinced it's sustainable.
|
| Edit: TL;DR; show me the math! :D
___________________________________________________________________
(page generated 2021-09-05 23:01 UTC)