[HN Gopher] Why Prefetch Is Broken
       ___________________________________________________________________
        
       Why Prefetch Is Broken
        
       Author : astdb
       Score  : 230 points
       Date   : 2021-06-02 06:35 UTC (16 hours ago)
        
 (HTM) web link (www.jefftk.com)
 (TXT) w3m dump (www.jefftk.com)
        
       | failwhaleshark wrote:
        | Isn't it a best practice to show a white screen / "loading"
        | state until everything important is loaded?
        | 
        | How does one detect when everything is loaded? I've seen some
        | websites break when UI interaction occurs _before_ all the JS
        | is loaded.
        
       | torgard wrote:
       | This is quite interesting. I didn't know this was a thing.
       | 
       | A primary issue, as I see it, is caching of third-party assets
       | (as dmkil posted elsewhere, think jQuery, Google Fonts, Facebook
       | Pixel, etc).
       | 
        | Could this not be solved using the Cache-Control header, or
        | maybe some HTML attribute variation of it? Maybe something like:
        | 
        |     <!-- Use a site-specific cache for the stylesheet,
        |          the default behavior -->
        |     <link rel=stylesheet href=index.css cache-key="private">
        | 
        |     <!-- Use a global cache for jQuery -->
        |     <script
        |       src="https://code.jquery.com/jquery-3.6.0.slim.min.js"
        |       integrity="sha256-u7e5khyithlIdTpu22PHhENmPcRdFiHRjhAuHcs05RI="
        |       crossorigin="anonymous"
        |       cache-key="public"
        |     ></script>
        
         | ktpsns wrote:
          | Your cache-key idea undermines the whole point by
          | reintroducing third-party access. As a site owner who wants to
          | place ads, I would just use your script notation to include
          | the ad and then leak data again...
        
           | torgard wrote:
            | Ah yes, of course. I was thinking from the perspective of a
            | user who has full control of the site, since they are also
            | its owner.
           | 
           | The cache-key idea would only work if the user themselves
           | could specify it for every resource.
        
       | ZiiS wrote:
        | Isn't the actual problem with Chrome's behavior and as=document
        | that it still leaks? If a.test preloads
        | b.test/i_have_visited_a.html, it gets added to b.test's cache.
        
         | londons_explore wrote:
         | Correct. as=document reintroduces a data leak between different
         | domains, and therefore isn't viable.
         | 
          | The only solution is not to allow cross-origin document
          | preloads, which is lame because the impact on user experience
          | is reasonably substantial.
        
           | jefftk wrote:
           | How is as=document more of a leak than ordinary cross-site
           | navigation?
        
             | kevincox wrote:
              | Because you can do this without navigating the user. You
              | are correct that this can be accomplished by redirecting
              | to the tracker site and back, but that is 1) detrimental
              | to the user experience, so not frequently done; 2) easier
              | to detect and block, as the behaviour is suspicious; and
              | 3) means that your site breaks if the tracker site is
              | down, which is especially an issue if you want to
              | "register" the session with multiple trackers.
             | 
             | With preload you can do this in the background very
             | efficiently.
        
               | jefftk wrote:
               | Hmm, I think you're right, there may be a problem here.
               | Unlike any other request you can trigger on a page in a
               | browser that blocks third-party cookies, if a.test has
               | <link rel=prefetch as=document href=b.test> this will
               | send b.test's first-party cookies. This allows cross-site
               | tracking through link decoration without having to
               | navigate the top-level page.
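                | 
                | For example (hypothetical URL, following the article's
                | notation), something like this would carry b.test's
                | first-party cookies while smuggling an a.test identity
                | in the query string:
                | 
                |     <link rel=prefetch as=document
                |           href="https://b.test/sync?a_id=123">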
        
       | xallarap wrote:
        | That aside, it's a terrible name because it reads like
        | "preftech". If it's just caching, then why not keep the ass
        | tight?
        
       | slver wrote:
        | If the cache is per domain, does that mean CDN-served
        | dependencies like jQuery and React are in fact... useless in
        | terms of cache reuse?
        
         | eli wrote:
         | Honestly cache reuse was never as high as anyone hoped. There
         | were so many versions of jQuery and so many different CDNs that
         | very few first time visitors already had the one you wanted.
        
         | r1ch wrote:
         | Using a library-provided CDN can be performance-negative now,
         | since you usually need several RTTs for DNS, TCP, TLS before
         | you even get to HTTP. Serving it from your own domain / CDN
         | allows it to be part of an existing HTTP connection.
        
         | Cthulhu_ wrote:
          | And that's fine; CDN benefits are minimal, given how many
          | versions of dependencies are out in the wild, and how much the
          | total payload of a website can be reduced by clever packaging
          | and modern data transfer mechanisms. The web platform has also
          | improved since then: query selectors, the Fetch API, and CSS
          | animations now cover much of what jQuery used to provide.
          | 
          | I'd argue there are few "shared" dependencies on websites
          | nowadays.
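          | 
          | A rough sketch of those native replacements (selectors and
          | URL illustrative):
          | 
          |     <script>
          |       // $('.item').hide()
          |       document.querySelectorAll('.item')
          |         .forEach(el => { el.style.display = 'none'; });
          | 
          |       // $.getJSON('/api/data', cb)
          |       fetch('/api/data')
          |         .then(r => r.json())
          |         .then(data => console.log(data));
          |     </script>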
        
         | move-on-by wrote:
          | As others have already said: Yes, it is useless in terms of
          | cache.
          | 
          | Besides being useless in terms of cache, it also incurs other
          | overhead: another DNS request, another TCP handshake, another
          | TLS connection. With HTTP/1.1 this might still make sense,
          | because you don't get multiplexed requests, but with HTTP/2
          | the extra overhead is simply extra overhead. With HTTP/3 it
          | becomes even less useful to have the domains sharded.
          | Generally speaking, the most efficient approach on the modern
          | web is to serve everything from the same domain.
        
           | anticristi wrote:
           | I kind of like how protocol improvements have made the domain
           | an authority, almost like a frontend security boundary.
        
         | floo wrote:
         | Yes.
        
         | WayToDoor wrote:
         | Yes. CDNs only offer limited benefits such as lower latency and
         | higher bandwidth.
        
           | SemiNormal wrote:
           | So still useful for images and other large files. Not so much
           | with scripts.
        
         | pornel wrote:
          | Yes. Browsers and protocols have changed, and a lot of past
          | performance best practices have become actively harmful to
          | performance.
         | 
         | A couple of related techniques are also useless: domain
         | sharding and cookieless domains. HTTP/2 multiplexing and header
         | compression made them obsolete, and now they're just an
         | overhead for DNS+TLS, and often break HTTP/2 prioritization.
         | 
         | You should be careful with prefetch too. Thanks to preload
         | scanners and HTTP/2 prioritization there are few situations
         | where it is really beneficial. But there are many ways to screw
         | it up and cause unnecessary or double downloads.
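          | 
          | The classic footgun, as a minimal sketch (path illustrative):
          | a preload whose request mode doesn't match the eventual
          | request is fetched twice. Fonts, for instance, need
          | crossorigin even when served same-origin:
          | 
          |     <link rel="preload" href="/fonts/site.woff2"
          |           as="font" type="font/woff2" crossorigin>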
        
         | jefftk wrote:
          | Yes. They didn't use to be, but they are now.
        
           | bobmaxup wrote:
           | In Safari, since ~2013.
        
             | jefftk wrote:
             | That's true, but in the other major browsers it's much more
             | recent: 2020 or even early 2021.
        
         | kevingadd wrote:
         | Yeah, the security changes here made CDNs go from good to
         | actively bad.
        
         | bmn__ wrote:
          | To make caching efficient again:
         | 
         | https://developer.mozilla.org/docs/Web/Security/Subresource_...
         | 
         | https://localcdn.org/
        
       | blixt wrote:
        | The cache segregation is a bit of a nuisance when building sites
        | that use iframes on different domains to sandbox user content.
        | For example, Framer (where I work) sandboxes each user project
        | on a unique subdomain of framercanvas.com, which is on the
        | public suffix list so that different projects can't share
        | cookies, localStorage, etc. But all resources loaded on these
        | subdomains are always loaded fresh for new projects because of
        | cache segregation, even if it's the same old files being loaded
        | over and over. I wish there were a way to establish a shared
        | cache between two domains. Since we could manually implement a
        | content tunnel between framer.com and the sandbox that sends
        | down cached content with postMessage, optionally sharing a cache
        | doesn't seem like an additional tracking issue if opted into
        | explicitly.
        
         | atirip wrote:
          | Load all resources with fetch(), store them in a cache
          | yourself, take the response, clone it, and transfer it as
          | anything that can be sent with postMessage().
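          | 
          | Roughly (a sketch; the URL, cache name, and target origin are
          | illustrative):
          | 
          |     <script>
          |       async function shareWithSandbox(iframe, url) {
          |         // Fetch once, keeping a copy in the Cache API
          |         const cache = await caches.open('shared-assets');
          |         let res = await cache.match(url);
          |         if (!res) {
          |           res = await fetch(url);
          |           await cache.put(url, res.clone());
          |         }
          |         // Copy into a transferable buffer and hand it to
          |         // the sandboxed frame
          |         const buf = await res.arrayBuffer();
          |         iframe.contentWindow.postMessage(
          |           { url, buf },
          |           'https://project1.framercanvas.com',
          |           [buf]);
          |       }
          |     </script>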
        
           | mrblampo wrote:
           | Clever! It would be better for the intended behavior to be
           | supported, though, right?
        
         | torgard wrote:
          | That's interesting.
          | 
          | Just so I understand it correctly:
          | 
          |   * The iframe loads resources, e.g. /static/bundle.js and
          |     /public/index.css (does this include user-defined
          |     resources?).
          | 
          |   * But due to the iframe being embedded on a per-project
          |     subdomain of framercanvas.com, the cache key includes the
          |     subdomain. So all resources are fetched again for all
          |     projects?
        
         | finnthehuman wrote:
         | >Framer (where I work)
         | 
          | Your website runs terribly on Firefox. Multiple
          | hundred-millisecond periods where the viewport went blank.
        
         | jefftk wrote:
          | So your setup is:
          | 
          |     outer.example
          |       userN.inner.example
          |         shared.example/resource
         | 
         | Since, as you say, outer.example could load the shared
         | resources and postMessage them into userN.inner.example, it
         | does seem to me like there should be a way for outer.example
         | and userN.inner.example to opt into letting userN.inner.example
         | use the outer.example cache partition.
         | 
         | Have you considered raising a spec issue?
        
           | derefr wrote:
            | I would think that the original intent of the origin cache
            | sandboxing here is to disallow different source origins from
            | sharing a target origin, even if that target origin _wants_
            | to be shared. Think "the target origin is an ad-tracking
            | provider domain."
           | 
           | I don't see any good way of enabling a target origin to opt
           | into allowing source origins to share caches with it, that
           | wouldn't also reintroduce the privacy leaks. (As, after all,
           | even if the only things malicious-site-X can see in your
           | cache are ad-tech providers' origins that opted into allowing
           | anyone to interface with them, that's likely still enough to
           | fingerprint you.)
        
             | jefftk wrote:
             | It would still be sharded by top-level domain, though,
             | which would be enough to prevent cross-site tracking. This
             | is specifically for the case where a page looks like:
             | [common top level origin] -> [iframed different origin].
        
       | andrejguran wrote:
       | I don't understand the problem here! Prefetch loads additional
       | data while you're not doing anything. The whole argument is that
       | prefetch doesn't respect caching??? Those are two different
       | concepts. While I am looking at slide N I don't care if slide N+1
       | image is loaded fresh or from a cache. Am I missing something
       | here?
        
         | skymt wrote:
          | You are missing something, but I wish people hadn't downvoted
          | you for asking an honest question.
          | 
          | The problem described in the blog post is that prefetch loads
          | the resource into the cache, which, combined with per-site
          | cache segmentation, means it's ambiguous _which cache_ a
          | resource should be loaded into when it's prefetched across
          | sites.
        
       | TazeTSchnitzel wrote:
       | Are "a.test" and "b.test" meant to represent different domains?
       | The actual syntax used would just be different file paths.
        
         | kevincox wrote:
          | Or .example, which is a TLD reserved for examples. Different
          | file paths wouldn't cut it, because they would be same-origin
          | and wouldn't trigger this issue.
        
         | slver wrote:
         | Yes, different domains.
        
         | runxel wrote:
         | Really hard to understand. I think this should be rewritten to
         | make it more clear.
        
           | ksec wrote:
            | Yes. It should have been example.com or something like that,
            | where we'd instantly know it was a domain name.
        
       | danjordan wrote:
       | It was a surprise to me that browsers partition their cache now.
       | I think Safari has done it since 2013!
       | 
        | When I found out, I wrote a blog post about HTTP cache
        | partitioning and hosting jQuery, or any library, from a CDN.
       | 
       | https://www.danjordan.dev/posts/should-i-host-jquery-from-a-...
        
       | matsemann wrote:
       | I don't really understand the issue? If I want to prefetch an
       | image, I'm on the same origin the whole time and this cache
       | segregation doesn't matter.
        
         | Aeolun wrote:
         | It's not an issue most of the time, but I do agree that it
         | would be nice to have a fix.
        
         | adrr wrote:
          | It would matter for full-document prefetches, which would be
          | useful for sites like Reddit or Google News, and also for
          | things like Okta's application list page.
        
           | [deleted]
        
         | bastawhiz wrote:
          | Yeah, his slideshow example doesn't show the problem. Unless
          | each slide was on its own domain, this isn't a problem. It
          | matters for things like Google Fonts, but very few folks have
          | multiple domains that share enough of the same assets for this
          | to matter in practice.
        
           | dannyw wrote:
           | Question: Does Google use Google Fonts to track users across
           | the web?
           | 
            | Google's FAQ [1] says that it only collects the information
            | needed to serve fonts, but also that the generic Google
            | privacy policy applies. The Google Privacy Policy allows
            | Google to use any information it collects for advertising
            | purposes.
           | 
            | While Google also states that requests do not contain
            | cookies, Google Chrome will automatically send a
            | high-entropy [3], persistent identifier on all requests to
            | Google properties, and this cannot be disabled
            | (X-Client-Data) [2]. Google can use this X-Client-Data,
            | combined with the user agent's IP address, to uniquely
            | identify each Chrome user, without cookies.
           | 
           | So, perhaps the privacy statement is more of a sneakily
           | worded non-denial?
           | 
           | [1]: https://developers.google.com/fonts/faq?hl=en#what_does_
           | usin...
           | 
           | [2]: https://github.com/w3ctag/design-
           | reviews/issues/467#issuecom...
           | 
           | [3]: A sample: `X-client-data: CIS2yQEIprbJAZjBtskBCKmdygEI8J
           | /KAQjLrsoBCL2wygEI97TKAQiVtcoBCO21ygEYq6TKARjWscoB` - looks
           | very high entropy to me!
        
             | jefftk wrote:
             | _> Google Chrome will automatically send a high-entropy
             | [3], persistent identifier on all requests to Google
             | properties, and this cannot be disabled (X-client-data)
             | [2]._
             | 
             | X-Client-Data indicates which experiment variations are
             | active in Chrome:
             | 
             |  _Additionally, a subset of low entropy variations are
             | included in network requests sent to Google. The combined
             | state of these variations is non-identifying, since it is
             | based on a 13-bit low entropy value (see above). These are
             | transmitted using the "X-Client-Data" HTTP header, which
             | contains a list of active variations. On Android, this
             | header may include a limited set of external server-side
             | experiments, which may affect the Chrome installation. This
             | header is used to evaluate the effect on Google servers -
             | for example, a networking change may affect YouTube video
             | load speed or an Omnibox ranking update may result in more
             | helpful Google Search results._ -- https://www.google.com/c
             | hrome/privacy/whitepaper.html#variat...
             | 
              | Google doesn't use fingerprinting for ad targeting,
              | though, as with IP, UA, etc, it receives the information
              | it would need if it were going to. I don't see a way
              | Google could demonstrate this publicly, though, except via
              | an audit (which would show that X-Client-Data is only used
              | for the evaluation of Chrome variations).
             | 
             | (Disclosure: I work on ads at Google, speaking only for
             | myself)
        
               | dannyw wrote:
                | Thanks for the informative answer. I still have trust in
                | engineers and assume truth and good faith, so that is
                | comforting to know.
        
             | londons_explore wrote:
             | You could always ask someone who works on Google Fonts. I
             | did just that. The answer is they don't use the logs for
             | much apart from counting how many people use each font to
             | draw pretty graphs.
             | 
             | Doesn't mean that won't change in the future though. But
             | log retention is only a matter of days, so they can't
             | retrospectively change what they do to invade your privacy.
        
               | amluto wrote:
               | I find myself wondering whether Google's front end
               | implements a fully generic tracker: collect source
               | address and headers and forward it to an analytics
               | system. The developers involved in each individual Google
               | property behind the front end might not even know it's
               | there. Correlating the headers with the set of URLs hit
               | and their timing might give quite a lot of information
               | about the pages being visited.
               | 
               | I hope Google doesn't do this, but I would not be
               | entirely surprised if they did.
        
               | londons_explore wrote:
               | If the frontend had a fully generic tracker, teams
               | wouldn't need to set up their own logging and stats
               | systems... Which they do...
        
               | lars wrote:
               | I think they would in any case. My impression is that
               | data is siloed internally at Google, and that data
               | sharing between departments would be way more complex
               | than just setting up some (possibly redundant) logging.
        
               | Filligree wrote:
               | I spent ten seconds thinking about the logistics of
               | adding logging to the frontends, and...
               | 
               | Well, obviously I can't say for sure they don't have any.
               | I didn't look it up, and if I had I wouldn't be able to
               | tell you. But since I didn't, I can tell you that the
               | concept seems completely infeasible. There's too much
               | traffic, and nowhere to put them.
               | 
               | Besides that, not everything is legal to log. The
               | frontends don't know what they're seeing, though; they're
               | generic reverse proxies. So...
        
               | nacs wrote:
               | > completely infeasible. There's too much traffic, and
               | nowhere to put them
               | 
               | If there's one company in the world for whom bandwidth
               | and storage are not an issue, it's Google.
        
               | phh wrote:
            | It sounds so easy to build, yet so useful, that I can't see
            | how they wouldn't do it. Deontology was thrown out of
            | Google's window a long time ago.
        
               | fogihujy wrote:
                | Unless it's regularly verified by a trusted third party,
                | such as a government agency, I wouldn't trust them not
                | to. After all, we're talking about a corporation that
                | lives off the data it gathers about people using its
                | services and products.
        
               | alpaca128 wrote:
               | I just went for the easy solution and disabled web fonts.
               | Comes with the drawback that many site UIs are now at
               | least partially broken (especially since some developers
               | had the bright idea to use fonts for UI icons), though
               | flashier sites tend to come with less interesting content
               | anyway.
               | 
               | But as it stands I don't want to trust Google, Facebook
               | etc. more than absolutely necessary. They have lost every
               | right to that a long time ago and are incentivized by
               | their business model to not change anything.
        
           | hypertele-Xii wrote:
           | So download your fonts off Google and serve them from your
           | own domain.
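            | 
            | A minimal sketch (font name and file path illustrative):
            | 
            |     <style>
            |       @font-face {
            |         font-family: 'Roboto';
            |         src: url('/fonts/roboto-regular.woff2')
            |              format('woff2');
            |         font-weight: 400;
            |         font-display: swap;
            |       }
            |     </style>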
        
         | cxr wrote:
          | Indeed. As the author puts it, "sometimes you know something
          | is very likely to be needed". Let's have a look:
          | 
          |     <link rel=prefetch href=url>
          | 
          | What's going on here, for the test case given? It's
          | introducing tight coupling (or it already exists, and you're
          | trying to capture a description of it to serve to the browser)
          | to an external resource. It's not that prefetch is broken;
          | it's that the desire to gesture at the existence of a resource
          | outside your organization's control, while insisting that it's
          | so important as to be timing-critical, is like trying to have
          | your cake and eat it, too.
         | 
         | As mentioned in similar comments, the observed behavior for
         | this particular test case is potentially a problem if you are
         | building Modern Web Apps by following the received wisdom of
         | how you're supposed to do that. There are lots of unstated
         | assumptions in the article in this vein. One such assumption is
         | that you're going to do things that way. Another assumption is
         | that the arguments for doing things that way and the plight of
         | the tech professionals doing the doing are universally
         | recognized and accepted.
         | 
         | From the Web-theoretic perspective--that is, following the
         | original use cases that the Web was created to address--if that
         | resource is so important to your organization, then you can
         | mint your own identifier for it under your own authority.
         | 
         | Ultimately, I don't have a lot of sympathy for the plight
         | described in the article. It's fair to say that the instances
         | where this sort of thing shows up involve abusing the
         | fundamental mechanism of the Web to do things that are,
         | although widely accepted by contemporaries as standard
         | practice, totally counter to its spirit.
        
           | marcosdumay wrote:
            | Hum... The resource is not that important to me. I'm just
            | the author of the current page, and am letting the browser
            | know that users are very likely to want that resource.
            | 
            | You are attributing a lot of intention to a mechanism. You
            | don't know if it's a 3rd party tracker or the news link in a
            | discussion page.
            | 
            | The proposal in the article is actually quite good, since I
            | should always know very well whether it will load into a
            | frame or as a link.
        
             | vlovich123 wrote:
             | I could be wrong but it seems to me that any cross-domain
             | prefetch that uses the "document" option from the article
             | is potentially privacy-violating and can reintroduce the
             | same leaks that necessitated the original segregation.
             | 
             | A.test prefetches b.test/visited_a.js, b.test/unique_id.js,
             | and log(n) URLs that bisect unique_id.js so that you can
             | search the cache for the unique id.
             | 
              | We have to be careful to balance performance and "this is
              | useful to me" with abuse prevention at scale. It's also
              | important to tread carefully with browser features that
              | seem useful, as the graveyard of deprecated features that
              | didn't survive privacy attacks is quite large.
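              | 
              | For a sense of how such a cache probe might look, a sketch
              | (the URL is illustrative; 'only-if-cached' never touches
              | the network, so b.test can test its own partition):
              | 
              |     <script>
              |       async function isCached(url) {
              |         try {
              |           const res = await fetch(url, {
              |             cache: 'only-if-cached',
              |             mode: 'same-origin' });
              |           return res.ok;
              |         } catch (e) {
              |           return false;  // network error: not cached
              |         }
              |       }
              |     </script>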
        
               | marcosdumay wrote:
                | Well, if so, it's a problem, because the article says
                | that Chrome and Safari handle prefetch exactly that way.
                | 
                | But I don't see that problem. In this case the a.test
                | domain cannot see what is in the cache; only b.test sees
                | it. (At least by what I understood.)
        
           | horsawlarway wrote:
           | I understand what you're saying, but I fundamentally disagree
           | with you.
           | 
           | The issue is that you're immediately deciding that domain X
           | and domain Y are different entities.
           | 
           | In practice, I find that there are a HUGE number of use cases
           | where two domains are actually the same organization, or two
           | organizations that are collaborating.
           | 
           | There is basically no way to say to the browser - I am "X"
           | and my friends are "Y" and "Z", they should have permissions
           | to do things that I don't allow "A", "B", and "C" to do.
           | 
           | ---
           | 
            | We actually have a functioning standard for this on mobile:
            | both iOS and Android support /.well-known paths for
            | manifests that allow apps to couple more tightly with sites
            | that list them as "well-known" (aka friends).
            | 
            | The browser support for this kind of thing is basically non-
            | existent, though, and it's maddening. SameSite would have
            | been a _PERFECT_ use case. We're already doing shit like
            | preflight requests, so why not just fetch a standard
            | manifest and treat whitelisted sites as "SameSite=None" and
            | everything else as "SameSite=Lax"? Instead orgs were forced
            | into a binary choice of None (forfeiting the new CSRF
            | protections) or Lax (forfeiting cross-site sharing).
        
             | hypertele-Xii wrote:
             | > The issue is that you're immediately deciding that domain
             | X and domain Y are different entities.
             | 
             | That's... literally what domain means.
             | 
             | "A domain name is an identification string that defines a
             | realm of administrative autonomy, authority or control
             | within the Internet." - Wikipedia
             | 
             | The entire security policy of the Internet is built on this
             | _definition_. It 's not an assumption. It's a core
             | mechanism.
        
               | horsawlarway wrote:
                | Wrong again... It's very clearly "_a_ realm". Not "the
                | realm". Certainly not "the _only_ realm".
                | 
                | Entities are allowed to control many assets.
                | 
                | Simplest possible case in the wild for you, since you're
                | being obtuse:
                | 
                | I'm company "A". I just bought company "B". Now fucking
                | what?
        
             | cxr wrote:
             | > but I fundamentally disagree with you ["We actually have
             | a functioning standard for this on mobile"]
             | 
             | I wouldn't expect any less.
             | 
             | https://quoteinvestigator.com/2017/11/30/salary/
             | 
             | > In practice [...]
             | 
             | There's your problem. Try fixing that.
        
               | horsawlarway wrote:
               | Your take is wrong.
               | 
               | Having a domain is identical to having a driver's
               | license: This org says I am "X".
               | 
               | It is fundamentally different from uniquely identifying
               | me.
               | 
               | I am still the same person if I give you my library card
               | - a different ID from a different org that says I am "Y".
        
               | cxr wrote:
               | Nah, the problem is definitely on your end.
               | 
               | > Having a domain is identical to having a driver's
               | license: This org says I am "X".
               | 
               | Nope. You just described two different documents.
               | 
               | http://horsawlarway.example/drivers-license.html
               | 
               | http://horsawlarway.example/library-card.rdf
               | 
                | The position you're taking here is very odd, given the
                | sentiment in your other tirade about being digital
                | sharecroppers to Facebook and Google
                | <https://news.ycombinator.com/item?id=27369652>. Your
                | proposed solution is dumping more fuel into their
                | engines--which is why it's the kind of solution they
                | prefer themselves--and completely at odds with e.g.
                | Solid and other attempts to do things that would
                | _actually_ empower and protect individual users. I'm
                | interacting with your digital homestead; why are you so
                | adamant about leaking my activity to another domain?
        
               | Spivak wrote:
                | Moving everything under a single domain makes no sense
                | for the use-case of cross-organization sharing, though.
                | Domain as the root identity on the web is just broken,
                | and there's no way to make it work.
                | 
                |     mysharedhosting.com/customer1
                |     mysharedhosting.com/customer2
                | 
                | Two separate identities. Is it even possible to let the
                | browser know that they should be segmented? Nope.
                | 
                |     customer1.mysharedhosting.com
                |     customer2.mysharedhosting.com
                | 
                | How about this? No again, not without getting yourself
                | on the suffix list, which is just a hack.
                | 
                |     mycompany.com
                |     joinmycompany.com
                | 
                | Can you tell the browser that these are actually the
                | same and shouldn't be segmented? Nope!
        
             | Welteam wrote:
              | If you actually search a bit on the internet, you'll find
              | that Google actually made an RFC for a standard allowing
              | websites to list other domains that should be considered
              | as same origin. Look for the comments on it by the IETF
              | and you'll understand why this is a terrible idea.
        
               | rsj_hn wrote:
               | > RFC for a standard allowing websites to list other
               | domains that should be considered as same origin
               | 
               | No, they allowed an origin to list other origins whose
               | cookies would be sent back to the serving origin
               | correctly even if they were iframes loaded in the parent
               | origin DOM.
               | 
                | I.e. this was the expected behavior for iframes until
                | Safari decided that there was such a thing as "third
                | party" origins whose web semantics could be broken in
                | their war against advertising.
               | 
               | Google is trying to (partially) restore the expected
               | behavior of iframes so that named origins get their own
               | cookies sent to them, which is how things worked for the
               | first two decades of the web.
        
               | horsawlarway wrote:
               | Why don't you search a bit and come back with a link?
               | 
               | Because I can't comment on an RFC I haven't seen, and a
               | quick google search of my own based on your comment turns
               | up nada.
               | 
               | That said - I'm fully aware of the downsides of this
               | approach, but I want my browser to be (to put it crudely)
               | MY FUCKING USER AGENT. I want to be able to allow sharing
               | by default in most cases, and I want a little dropdown
               | menu that shows me the domains a site has listed as
               | friendly/same-entity, and I want a checkbox I can uncheck
               | for each of them.
               | 
               | Then I want an extension API to allow someone else to do
               | the unchecking for me, based on whether the domain is
               | highly correlated with tracking (Google analytics,
               | Segment, Heap, Braze, etc)
               | 
               | -------------
               | 
                | The way I see it, the road to hell is paved with good
                | intentions. If the web were developed in our current
                | climate of security/privacy focus, how likely is it that
                | even a fucking <a href=[3rd party]> would be allowed?
                | Because I see us driving to a spot where this is
                | verboten. Which also happens to be the final nail in the
                | coffin for any sort of real open platform.
               | 
               | Welcome to the world where the web is literally
               | subdomains of facebook/google. What a fucking trash place
               | to be.
        
               | will4274 wrote:
               | As somebody directly involved in this space, that's a
               | pretty bad summary. The spec for first party sets (not an
               | RFC, just a draft) isn't in a great state. Google is
               | going to implement it anyway, Microsoft is supporting,
               | Apple basically said the spec was crap but they might be
               | interested in a good version of it, and Firefox said they
               | didn't like it.
               | 
                | Speaking as an engineer... the Firefox folks don't
                | really get it. You can't just break what sites like
                | StackOverflow and Wikipedia have been doing for years
                | (and in some cases decades) and then say "you were doing
                | the wrong thing." Some version of FPS will ship in
                | browsers, probably in the next 2 years.
               | 
               | Quoting Apple's position directly "[...] Given these
               | issues, I don't think we'd implement the proposal in its
               | current state. That said, we're very interested in this
               | area, and indeed, John Wilander [Safari lead] proposed a
               | form of this idea before Mike West's [Google] later re-
               | proposal. If these issues were addressed in a
               | satisfactory way, I think we'd be very interested. [...]"
               | 
               | Also it was a W3C TAG review. The W3C and IETF are
               | different organizations.
        
         | dmkii wrote:
         | Most, or at least a lot, of the prefetching is for third party
         | libraries (think jQuery, Google Fonts, Facebook Pixel, etc).
         | There's a general speed advantage for users caching commonly
         | used libraries and fonts across sites. Nonetheless I believe
         | prefetch will still have a speed advantage even when the cache
         | is segregated.
        
         | jefftk wrote:
         | Yes, sorry, an image is a bad example. The main issue is with
         | HTML documents. You might open one at the top level, by
         | clicking on it and navigating to it, or you might open one as a
         | child of the current page, by putting it in an iframe. Since
         | they can be opened in both contexts, prefetch doesn't know what
         | to do.
        
           | bo1024 wrote:
           | But is it right that the issue is fetching resources from a
           | different domain than the current one? As a user, just
           | because I've connected to domain A, it doesn't mean I
           | necessarily want my computer to connect to any domain B that
           | A links to. Also, I'd rather developers focus on making small
           | pages that are easier to fetch on demand, and am worried that
           | they'll use prefetch to justify bloated pages. If a page is
           | large enough to need prefetch, then I might not want to spend
           | the data pre-fetching it especially if the click probability
           | is not very high. Between all of these, I'm not convinced of
           | the need for cross-domain prefetch.
           | 
           | Apologies that I'm not a front-end person so this may be
           | naive, but it would be great to hear your thoughts!
        
             | jefftk wrote:
             | Yes, this is only an issue for cross domain prefetch.
             | 
             | With HTML resources, the goal of prefetch is typically not
             | to get a head start on loading enormous amounts of data,
             | but instead to knock a link off of the critical path. The
             | HTML typically references many different resources (JS,
             | CSS, images, etc) and, if the HTML was successfully
             | prefetched, when the browser starts trying to load the page
             | for real it then can kick off the requests for those
             | resources immediately.
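              | 
              | Concretely (URL illustrative), a page expecting the user
              | to click through might include:
              | 
              |     <link rel="prefetch"
              |           href="https://b.example/next.html">
              | 
              | so that on navigation the browser already has the HTML
              | and can request its JS, CSS, and images right away.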
        
               | bo1024 wrote:
               | Makes sense, thanks for the reply!
        
       | labster wrote:
       | Prefetching is a cool idea but a lot of us can't actually use it.
       | I tried implementing prefetch in a personal project only to find
       | out uBlock Origin is disabling it. Apparently prefetched
       | resources aren't filtered by extensions, which kind of defeats
       | tracking protection. So I can't even use it for my own project,
       | as I'd rather avoid the trackers. I assume many people are using
       | the same default setting here.
        
         | kevincox wrote:
         | > I tried implementing prefetch in a personal project only to
         | find out uBlock Origin is disabling it.
         | 
         | That seems fine to me. Implement it and if the users don't want
         | it then it doesn't occur. You should still code as if it works.
         | 
         | > Apparently prefetched resources aren't filtered by extensions
         | 
         | This sounds like a browser bug. It should probably be raised
         | against the browsers.
         | 
         | > as I'd rather avoid the trackers.
         | 
         | Again, this is just a result of the browser bug. I see no
         | reason to throw away a nice declarative prefetch simply because
         | browsers forgot to allow filtering.
        
           | willis936 wrote:
           | >That seems fine to me. Implement it and if the users don't
           | want it then it doesn't occur. You should still code as if it
           | works.
           | 
           | Please correct me if I'm misinterpreting this statement. Are
           | you saying it is acceptable if the code breaks if prefetch
           | fails?
        
             | ninjapenguin54 wrote:
             | How would the absence of a prefetch break something?
             | 
             | Isn't this just a performance optimization?
             | 
                | I take the original statement to mean that the
                | worst-case scenario is extra time to load.
        
               | kevincox wrote:
                | Yes, that is what I meant. You may as well include the
                | prefetch, and if the browser (or the user) doesn't want
                | the prefetch they just get a slower load. If the user
                | enables them, they get the snappier experience.
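                | 
                | That is, a declarative hint like this (path
                | illustrative) costs nothing if it's dropped; the image
                | just loads on demand instead:
                | 
                |     <link rel="prefetch" href="/slides/slide2.jpg">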
        
       | shuringai wrote:
        | I would rather wait even seconds for a page to load than have
        | my cache brute-forced.
        
       ___________________________________________________________________
       (page generated 2021-06-02 23:02 UTC)