[HN Gopher] Arc - A P2P CDN that runs in the browser
       ___________________________________________________________________
        
       Arc - A P2P CDN that runs in the browser
        
       Author : sansui12
       Score  : 88 points
       Date   : 2021-09-05 10:26 UTC (12 hours ago)
        
 (HTM) web link (arc.io)
        
       | agentdrtran wrote:
       | anyone got an invite code? looks interesting
        
         | grun wrote:
         | Of course! Shoot me an email at ansgar@arc. =]
        
       | zimbatm wrote:
       | How stable is the CDN when each node's lifetime is 5 minutes?
       | The likelihood of an ongoing request being cut off is fairly
       | high. Nodes are also constantly in a syncing state, I assume.
        
         | grun wrote:
         | Very! The lifetime of nodes is actually quite a bit longer than
         | 5 minutes as the tab doesn't need to be foregrounded (which
         | Google Analytics and others report as the end of the
         | 'session'). That, and when a node comes online, it joins the
         | network with the cache contents it had filled previously. So it
         | joins the network warm, not cold and needing to be filled.
        
           | zimbatm wrote:
           | Interesting.
           | 
           | Is the cache shared between all the websites that install
           | this product?
           | 
           | What happens to an HTTP request to a browser when the user
           | closes their tab? Is it retried on another browser?
        
       | caseyross wrote:
       | My first reaction was "yikes, that's super unethical to use
       | users' bandwidth, storage, and processing power without their
       | knowledge". But after thinking about it a lot, I have to say I've
       | come around.
       | 
       | As users, we all pay for upload bandwidth, and 99% of the time
       | it's sitting there idly and providing literally no value to us or
       | anyone else. That's because gosh darn it, we paid for it, so
       | there's no way we're gonna let anyone else use it!
       | 
       | But, if we stop thinking in binary terms of "my property" or
       | "your property", and open our minds for a minute, what's so bad,
       | really, about each of us donating a few resources to make the
       | shared Internet run faster for everyone? If everyone does the
       | same, we're all better off, and no one is worse off since the
       | bandwidth was just going to waste anyway. Furthermore, the
       | storage needed is already in use, storing the exact same content,
       | just for your own exclusive use instead of everyone's, and the
       | additional CPU cycles needed are surely much fewer than it takes
       | to run the worthless ads your browser downloads every day.
       | 
       | I think of it as sort of the same logic as vaccination, where we
       | ask people to pay a small private cost in a way that produces
       | outsized benefits for society as a whole. In particular, each
       | person gets an individual benefit much larger than the cost of
       | the private resources they paid.
       | 
       | Now, if the private resources aren't practically free like
       | bandwidth or ROM, but actually expensive like capped data or
       | mobile phone battery life, the ethical logic starts to get a bit
       | murkier. However, I think it's definitely possible to build an
       | effective peer-to-peer CDN without needing to touch those kinds of
       | scarce resources.
       | 
       | The other reservation I have about this sort of project is that
       | you really need to be able to trust the people running it to not
       | abuse their control of the network (which really just means:
       | other people's computers). That's, in my view, a much thornier
       | issue than anything relating to the consumer end of the network.
       | 
       | Let's hope the team will be able to successfully tackle both the
       | technical and the human problems of a P2P CDN, and help push the
       | Internet towards a more communal, sharing paradigm. Best of luck
       | to them.
        
       | lvass wrote:
       | It's scary how things like this can become invisible to the user.
       | WebRTC is a great concept that could really improve
       | decentralization, but so incredibly easy to misuse that it's
       | absurdly irresponsible browser developers enable it by default.
       | Remember how people went to jail for simply having certain p2p
       | file sharing software running? All this one takes is opening a
       | web page.
        
         | leephillips wrote:
         | > Remember how people went to jail for simply having certain
         | p2p file sharing software running?
         | 
         | No. Do you have a link?
        
           | lvass wrote:
           | https://www.reddit.com/r/torrents/comments/30jlz0/raided_aft...
           | 
           | Similar cases though they might require downloading
           | something, perhaps innocent looking:
           | 
           | https://www.propublica.org/article/prosecutors-dropping-chil...
           | 
           | https://forum.emule-project.net/index.php?showtopic=161954
           | 
           | https://www.robertslawteam.com/criminal-defense-overview/art...
           | 
           | https://mashable.com/archive/child-porn-download
        
             | leephillips wrote:
             | The story is about a raid, not jail, but point taken in any
             | case.
             | 
             | EDIT: Later he mentions being "released", so I guess there
             | was some jail involved.
        
         | vmoore wrote:
         | > it's absurdly irresponsible browser developers enable it by
         | default
         | 
         | And not absurd for people who disable it because it leaks your
         | real IP when using a VPN
        
         | vbezhenar wrote:
         | What's so irresponsible about it? Any website can waste your
         | download/upload/CPU/RAM/disk. WebRTC does not make anything
         | worse, but rather allows for plenty of useful applications,
         | some of which might help to reduce centralization, which is the
         | plague of the web IMO.
        
           | lvass wrote:
           | >waste your download/upload/CPU/RAM/disk
           | 
           | Those are present here but not the largest threat. This
           | service apparently makes you download ~300MB you have no idea
           | what is and SHARES it in the background. What if it is child
           | abuse content? This isn't even a "think of the children"
           | argument, it can have very real consequences to anyone
           | cluelessly browsing the web unlike your usual p2p network.
           | 
           | WebRTC has additional insecurities and deanonymization
           | potential like VPN leaks. I'm fairly certain the only reason
           | so many people have it enabled is a certain large browser
           | distributor happens to own some services that use WebRTC to
           | reduce server bandwidth costs, and they want to make its
           | usage seamless without displaying even a small warning about
           | the risks. Perhaps they also consider deanonymization a
           | feature.
        
       | azinman2 wrote:
       | So arrogant to monetize your website using your users in such a
       | way. It's one thing to show me ads, it's another to use me as a
       | node in a distributed CDN for which I gain no value, and everyone
       | else does. It's not even some kind of p2p situation I as a user
       | opt into because I believe in the communal network.
       | 
       | I hate this leechy behavior that has come to plague the internet
       | for a while now. When it was all nerds it was glorious. Then the
       | money came, and the bullies followed.
        
         | samtheprogram wrote:
         | You gain the value of whatever service you're using. It's the
         | same approach of monetizing a site with ads, except instead of
         | "paying" for the site with your data or eyeballs, you're paying
         | with your bandwidth.
         | 
         | I actually quite like this idea. Any site/app/etc that is free
         | has to make money, at the very least to pay for infrastructure,
         | and I find this a lot less invasive than surveillance, and not
         | "arrogant" at all.
        
       | ROARosen wrote:
       | I wonder can't a simple `display: none` hide Arc's widget?
        
         | grun wrote:
         | As a user, yep. But doing so as a website gets your Arc account
         | banned and your websites blacklisted.
        
       | mikeiz404 wrote:
       | Anyone know what the typical latency is like?
       | 
       | I see mentions of many points of presence and claims of pops
       | being close (I assume due to the size and distributed nature of
       | the peers) and therefore _could_ be faster but no measurements or
       | numbers in the FAQ.
        
       | alex23478 wrote:
       | > only runs when a device is connected to Wi-Fi or Ethernet.
       | Cellular bandwidth is never used.
       | 
       | This cannot be guaranteed.
       | 
       | They seem to use the NetworkInformation API, which as of now is
       | declared as experimental and straight up does not work on Firefox
       | or more importantly on Safari (and therefore the whole iOS
       | browser ecosystem). Apart from that, they're using IP ranges of
       | mobile carriers for detection. [1]
       | 
       | So let's say you're using iOS and a VPN while on the go. This
       | will use your mobile data unless you're using an adblocker (since
       | their service is only opt-in if you're using an adblocker) [2]
       | 
       | [1] https://github.com/easylist/easylist/issues/7872
       | [2] https://github.com/uBlockOrigin/uAssets/pull/8874#issue-6142...
        
       | sdze wrote:
       | wonderful idea. What could possibly go wrong. /s
        
       | pcr910303 wrote:
       | It's an interesting idea. I did think it looked pretty scammy
       | (felt like a crypto miner), but...
       | 
       | It does feel different in that it doesn't use any user's
       | bandwidth if the site doesn't monetize it and it mandates a UI
       | with an opt-out button if the site does monetize it and uses the
       | user's bandwidth.
       | 
       | But I'm not sure if this is ever gonna work... will a nearby user
       | device really deliver the content I need faster than a datacenter...?
       | There's no mention on the FAQ page how it prevents from a fake
       | user sending malicious scripts across the network as well.
        
         | grun wrote:
         | > There's no mention on the FAQ page how it prevents from a
         | fake user sending malicious scripts across the network as well.
         | 
         | All content is fragmented, encrypted, and hashed before it's
         | distributed across the network. If a peer ever receives a file
         | piece from a peer and that piece's hash doesn't match the
         | expected hash, it's dropped along with the connection to that
         | peer.
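A minimal sketch of the piece check described in the reply above. The thread does not say which hash function Arc uses or where the expected hashes come from; SHA-256 and a trusted list of expected digests are assumptions here, and `verify_piece` and `receive_pieces` are hypothetical names, not Arc's API.

```python
import hashlib
import hmac


def verify_piece(piece: bytes, expected_sha256_hex: str) -> bool:
    """Return True only if the piece hashes to the expected digest."""
    actual = hashlib.sha256(piece).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, expected_sha256_hex.lower())


def receive_pieces(pieces, expected_hashes):
    """Accept pieces from one peer until the first hash mismatch.

    Returns (accepted_pieces, keep_connection). On a mismatch the bad piece
    is dropped and keep_connection is False, mirroring "dropped along with
    the connection to that peer".
    """
    accepted = []
    for piece, expected in zip(pieces, expected_hashes):
        if not verify_piece(piece, expected):
            return accepted, False
        accepted.append(piece)
    return accepted, True
```

The check only helps if the expected digests reach the receiver over a channel the sending peer cannot tamper with; how Arc distributes them is not stated in the thread.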
        
         | bityard wrote:
         | I wonder how long it will take before arc.io appears on the
         | same content blacklists as ads and cryptominers?
        
           | vbezhenar wrote:
           | https://github.com/uBlockOrigin/uAssets/issues/5771
           | https://github.com/uBlockOrigin/uAssets/pull/8874 this might
           | be of interest to you.
        
             | gurchik wrote:
             | So uBlock added them to the abuse list for not being opt-in,
             | so the developer responds by adding opt-in _only if you're
             | using an adblocker_. Way to completely miss the intent of the
             | rule.
        
       | jopsen wrote:
       | How does the performance compare?
        
       | sam0x17 wrote:
       | Yeah I have to say I am for this in the vein of reducing
       | centralization. Yes in a way it is akin to a crypto-miner, but
       | the future internet could literally be distributed in this way,
       | so props to them.
        
         | bastawhiz wrote:
         | This doesn't actually decentralize anything, it relies on a
         | central system to connect peers.
        
           | sam0x17 wrote:
           | Ah, well, never mind then
        
       | true_religion wrote:
       | I wish I knew how this sort of CDN was created, so I could
       | sponsor one explicitly for NSFW sites.
       | 
       | It's a large and existing market, but always one that's ignored
       | because of the complexities of dealing with it. However if you
       | are already an insider, initial moderation isn't so hard.
        
       | goodpoint wrote:
       | > Arc uses only a small portion of spare bandwidth, imperceptible
       | CPU, 300MB of browser cache
       | 
       | Sounds very scammy. Is this happening without user consent? What
       | about privacy?
        
         | franga2000 wrote:
         | This is my first time hearing about it too, but I don't see
         | what you see. Where is the scam? _What_ about privacy?
         | 
         | It seems like a good way of monetizing a site by creating
         | something with intrinsic value to others (CDN capacity) out of
         | something with next to no value to users (spare bandwidth).
         | Sure beats the alternatives:
         | 
         | - ads: value=user's money that they manipulate them into
         | spending. Additionally, they usually track people relentlessly
         | and in turn use that data to manipulate them even better. No
         | value is created, just money is transferred with extra steps.
         | 
         | - crypto miners: the value of (PoW) crypto is net negative (see
         | environmental concerns), despite being profitable for some.
         | They also decrease the user experience by draining batteries
         | and slowing down computers. So no real value is created and
         | negative value for users.
         | 
         | - micro-donations: they just transfer something of value
         | (money) from the user to the operator. Nothing is "created", so
         | users are by definition losing money. You can argue that things
         | _should_ cost money, but that's a separate discussion - value
         | for users is still negative.
         | 
         | Of course I'd prefer this to be a vendor-neutral standard and
         | not some private company, but none of the current "distributed
         | web" solutions have gotten any serious level of adoption. This
         | one actually got some attention since it's also a monetization
         | platform - even if webmasters don't care about the distributed
         | web idea, they still help get us closer.
        
       | jeroenhd wrote:
       | > only runs when a device is connected to Wi-Fi or Ethernet.
       | Cellular bandwidth is never used.
       | 
       | That's interesting. How does the website possibly detect that I'm
       | on a cellular network when I'm a) use Firefox (which doesn't
       | implement the NetworkInformation API) and b) use a VPN to either
       | a data center or my home server? What about iOS devices using
       | Apple's pseudo-Tor (because Safari doesn't expose
       | NetworkInformation either), or devices using tethered WiFi?
       | 
       | I don't believe this claim at all. This makes me take all their
       | other claims they make with a massive grain of salt as well.
       | 
       | There's also the massive privacy issue: other people will know
       | what websites you visit by simply using the P2P system, and the
       | entire thing seems to be opt-out unless you use an adblocker. That
       | last part shows that the devs know of the privacy issue but have
       | decided to take the practical approach of not fixing the issue
       | and only doing the bare minimum to remove their website from
       | blocklists.
       | 
       | It's only a matter of time before someone will make a tool that
       | enumerates all IPs in the Arc network together with the content
       | they've been served. This one is going onto my Pihole's
       | blocklist...
        
         | grun wrote:
         | Great questions!
         | 
         | > How does the website possibly detect that I'm on a cellular
         | network when I'm a) use Firefox (which doesn't implement the
         | NetworkInformation API) and b) use a VPN to either a data
         | center or my home server?
         | 
         | Yep. An IP lookup is also done to see which AS the user is on.
         | Eg an AWS IP vs a T-Mobile IP.
         | 
         | We also do this to detect when people tether, as when tethering
         | Chrome will report a Wi-Fi connection via the
         | NetworkInformation API but it's Wi-Fi on top of an underlying
         | cellular connection.
         | 
         | > There's also the massive privacy issue: other people will
         | know what websites you visit by simply using the P2P system
         | 
         | All cached data is both fragmented and encrypted. When a node
         | sends data to another peer, it's an encrypted fragment of a
         | file and the sender doesn't know 1) what data it's sending nor
         | 2) which website that data is being sent for.
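A rough sketch of how the two signals mentioned in the reply above (the browser-reported connection type and an AS lookup on the client's IP) could be combined. This is not Arc's actual logic: the category names, the `may_use_upload` function, and the fallback rule for browsers without the NetworkInformation API are all assumptions made for illustration.

```python
from typing import Optional

# Hypothetical categories; a real system would derive these from an IP-to-AS
# database, which is not shown here.
CELLULAR_AS = "cellular"
OTHER_AS = "other"
UNKNOWN_AS = "unknown"

ALLOWED_TYPES = {"wifi", "ethernet"}


def may_use_upload(reported_type: Optional[str], as_category: str) -> bool:
    """Decide whether a node may upload.

    reported_type: value of navigator.connection.type as relayed by the
        browser, or None where the API is unavailable (e.g. Firefox, Safari).
    as_category: classification of the AS that owns the client's public IP.
    """
    # A cellular AS wins over whatever the browser reports, which covers
    # tethering: the browser says "wifi" but the uplink is a mobile carrier.
    if as_category == CELLULAR_AS:
        return False
    if reported_type is not None:
        return reported_type in ALLOWED_TYPES
    # No browser signal: the conservative choice (an assumption of this
    # sketch) is to upload only when the AS is known to be non-cellular.
    return as_category == OTHER_AS
```

Under these assumptions, a phone on cellular data behind a VPN, in a browser without the API, arrives as `(None, "other")` and passes the check, which is the gap alex23478 and jeroenhd describe.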
        
       | indolering wrote:
       | PeerCDN tried this and was acqui-hired by Yahoo in 2013 and Peer5
       | did this for video but was acqui-hired by Microsoft. What about
       | P2P CDN didn't work before?
        
         | shacharz wrote:
         | Internal sources say Peer5 were not acqui-hired ;) But we took
         | a very different direction than this or PeerCDN
        
         | grun wrote:
         | PeerCDN was too early; the web wasn't ready yet.
         | 
         | For example, to intercept network requests to serve them from
         | the peer-to-peer network, a <![CDATA[ tag had to be inserted
         | into the <head> of the document to block rendering of the
         | subsequent document. Then, once the document had finished
         | downloading, the page HTML was manually rendered so all assets
         | (eg JS, CSS, images, etc tags) could be loaded via JS instead
         | of the browser natively. This was both slow and resulted in
         | empty white pages on load. Now? We have the Service Worker API.
         | (https://developer.mozilla.org/en-US/docs/Web/API/Service_Wor...)
         | 
         | I'm not as familiar with Peer5's tech stack, so I can't speak
         | there. But hi Shachar!
        
         | reitanqild wrote:
         | Lots of people would go to great lengths to secure a job at
         | Microsoft, so I'm not sure if it counts as a failure, just
         | not a runaway success.
        
       | dest wrote:
       | I wonder what is the time to first byte for a given asset when
       | the page loads for the first time. Establishing a webrtc
       | connection takes several rtts to the tracker/signalling server
       | and will always be longer than a plain CDN request. I think it's
       | rather useful on subsequent page navigation then, when p2p
       | connections are already established.
       | 
       | What about partnering with browsers like Brave, that would embed
       | it in many pages at once?
       | 
       | Disclaimer: I work at Lumen/Streamroot, we do this for video.
        
         | mishafb wrote:
         | Before this is loaded and completely ready, you probably load
         | resources with your regular CDN
        
           | dest wrote:
           | So that you can seed it later, very probably
        
       | baybal2 wrote:
       | Gnutella was painfully slow, and error prone even during its best
       | times.
        
       | detaro wrote:
       | Show HN 2 days ago: https://news.ycombinator.com/item?id=28394888
        
       | baybal2 wrote:
       | Since there is so much interest in turning a browser into a BT
       | client basically for CDN purposes, check out this:
       | https://github.com/webtorrent/webtorrent
       | 
       | Unwittingly, the people who put WebRTC into the browser turned
       | Chrome into the world's biggest file sharing network, and now the
       | genie is out of the bottle.
        
       | a_paddy wrote:
       | Under EU Directive 2009/136/EC a user's permission is required
       | before Arc can store any data in their browser. Is Arc obtaining
       | that permission, or just looking for forgiveness by allowing
       | users to Opt-Out?
        
       | na85 wrote:
       | >Arc uses only a small portion of spare bandwidth, imperceptible
       | CPU, 300MB of browser cache, and only runs when a device is
       | connected to Wi-Fi or Ethernet. Cellular bandwidth is never used.
       | 
       | Ugh, the arrogance. To think that you are somehow entitled to use
       | my bandwidth and CPU power is such a sadly typical mindset today.
       | 
       | Yet another reason to run NoScript.
        
         | BiteCode_dev wrote:
         | And that's for each page that uses it I assume, so if you have
         | 10 pages using Arc, that's not going to be imperceptible.
        
           | grun wrote:
           | Not quite. =] Arc coordinates and synchronizes across tabs
           | (via an iframe). So Arc behaves identically whether you have
           | one open tab with Arc or 100.
        
             | na85 wrote:
             | Aren't tabs supposed to be sandboxed?
        
               | not_really wrote:
               | Sounds like the iframe is used to glue sandboxes together
        
         | sudosysgen wrote:
         | Well, you're using CPU and bandwidth as long as you're using
         | assets delivered with that CDN. Seems perfectly fair to me.
        
       | dhaavi wrote:
       | So, if I run this on a server and fill my cache, can I then
       | monitor other people browsing arc.io-enabled websites?
       | 
       | I feel the privacy implications of such a technology haven't been
       | evaluated.
        
         | grun wrote:
         | Great question! All cached data is both fragmented and
         | encrypted. When a node sends data to another peer, it doesn't
         | know 1) what data it's sending nor 2) which website that data
         | is being sent for.
        
       | qeternity wrote:
       | I have serious doubts that this is going to be remotely as
       | performant as any major CDN.
        
       | grun wrote:
       | Hey guys! I'm Ansgar (https://github.com/gruns). I build Arc
       | (http://arc.io/cdn).
       | 
       | In the past, I built two of the world's largest YouTube to MP3
       | converters. In doing so, I learned the hard way 1) that
       | distributing content at scale globally is painful and _expensive_
       | and 2) that I hate ads. So I built Arc.
       | 
       | It's a two-sided content exchange. On one side, websites buy a
       | faster, 10x cheaper peer-to-peer CDN. On the other side, websites
       | make money without ads by contributing bandwidth to the peer-to-
       | peer CDN. Arc connects the two, like Airbnb connects guests and
       | hosts.
       | 
       | As bandwidth capacity grows around the globe, we find ourselves
       | in a world where people can share bandwidth both beneficially and
       | imperceptibly. We see glimpses of that already today: Amazon
       | shares bandwidth with Amazon Sidewalk, Microsoft shares bandwidth
       | with Windows updates, etc. We build Arc for this world -- a post-
       | adblock world -- to give sites a better, more ethical way to
       | support themselves that doesn't bombard users with ads or suck up
       | their personal data, and that preserves their privacy.
       | 
       | A few notes:
       | 
       | - For sites that use Arc's CDN, users do not use upload bandwidth
       | and Arc's widget isn't displayed. It's just a faster, 10x cheaper
       | CDN in one <script> tag. That's it. (See
       | https://arc.io/faq#do-users-upload-content-with-just-the-cdn)
       | 
       | - For sites that monetize with Arc, we mandate that Arc's widget
       | remains visible and interactable in the lower left corner of your
       | website so users can learn about Arc and, if they so desire, opt
       | out. (See
       | https://arc.io/faq#can-i-move-modify-or-hide-arcs-widget)
       | Additionally, Arc never activates on cellular connections; Wi-Fi
       | and Ethernet only.
       | 
       | - If you elect to opt out (two clicks in Arc's widget), you're
       | opted out of all sites with Arc.
       | 
       | Please email me if you'd like an invite code to take Arc's CDN
       | for a spin: ansgar@arc.io. It's 10x cheaper than Fastly, AWS,
       | Google, etc. (See http://arc.io/cdn/). And I'd love to hear your
       | thoughts! Feedback is how good products become great.
        
         | catlifeonmars wrote:
         | I'm curious what your analysis of the equilibrium states of
         | this model looks like. I think it's an interesting approach, but
         | I'm not entirely convinced it's sustainable.
         | 
         | Edit: TL;DR; show me the math! :D
        
       ___________________________________________________________________
       (page generated 2021-09-05 23:01 UTC)