[HN Gopher] We rendered a million web pages to find out what mak...
       ___________________________________________________________________
        
       We rendered a million web pages to find out what makes the web slow
        
       Author : simonpure
       Score  : 156 points
       Date   : 2021-01-26 16:43 UTC (6 hours ago)
        
 (HTM) web link (catchjs.com)
 (TXT) w3m dump (catchjs.com)
        
       | ibraheemdev wrote:
       | Previous discussion:
       | https://news.ycombinator.com/item?id=25517628
        
       | jzer0cool wrote:
        | How about a standard in the HTML spec for declaring how heavy
        | a site is (and other meta details) before it loads? That way,
        | a user could opt out of continuing to browse based on, say, a
        | short response header.
        | 
        | Is something like this already proposed anywhere, or is it
        | not really a solution to the problem -- thoughts?
        
       | matylla wrote:
        | This reminds me of an experiment [1] we ran a couple of
        | months back. We crawled the top 100 Alexa websites and
        | checked the bloat in the images served to billions of users.
       | 
       | [1]: https://optidash.ai/blog/optimizing-images-on-the-worlds-
       | top...
        
         | PeterWall wrote:
         | Thank you for sharing this.
         | 
          | It's interesting to see that even these big websites
          | potentially still have a lot of room for improvement in
          | their loading times.
        
         | tyingq wrote:
         | That's really cool. It would be interesting to see what that
         | 32% savings adds up to both for typical market bandwidth and
         | AWS like egress rates.
         | 
         | Curious, did your comparison ensure that none of the images
         | lost any detail, etc? Or how much "lossiness" did you introduce
         | to get the 32%?
        
           | matylla wrote:
           | Thanks for chiming in.
           | 
           | It's a perceptually lossless optimization and recompression.
           | 
           | We use saliency detection (trained on an eye-tracker) which
           | tells us where the human vision system would look in the
           | images and optimise those fragments (heatmaps) using our own
           | comparison metrics.
           | 
           | If you're interested in the details shoot me an email to
           | przemek [at] optidash [dot] ai
        
       | joosters wrote:
       | The article talks about a million web _pages_ , but it's actually
       | the top million _domains_. This is an important difference
       | because the top sites could be completely different to the web
       | pages that we view.
       | 
       | For example, Facebook is probably high up the list. But _which_
       | Facebook page do they measure? The  'front page' of a non logged
       | in user, presumably? That would be vastly different from the
       | majority of FB users. Likewise, Wikipedia is also an extremely
       | popular site. But surely most users are looking at specific pages
       | of interest, and not just the front page?
        
       | dvfjsdhgfv wrote:
       | ...and we discovered it's JavaScript. Most of it being just
       | tracking code.
        
       | jiveturkey wrote:
       | a million? is that a lot?
        
       | Santosh83 wrote:
        | Ads are usually the number one reason why simple "news"
        | pages sometimes take dozens of seconds to finish loading
        | (and in some egregiously engineered designs, the page is
        | unusable until every last script/ad and font has loaded),
        | spawn several processes, consume hundreds of megabytes of
        | memory, and utilise 50% of your i7/i9 3000MHz CPU, all for
        | displaying a news page or an article.
       | 
       | The JS for actual essential site functionality often pales in
       | comparison to the assets and scripts activated by ads, which
       | simultaneously track you.
        
         | underwater wrote:
          | It's not just the ads. Tag managers are the scourge of
          | performance. Marketing folk just want to stuff them full
          | of dozens of analytics services and third-party
          | integrations. And they want to do that without involving
          | pesky engineers.
          | 
          | Some, like Google Optimize, tell you to clobber
          | performance by hiding all content on your page using CSS
          | until their code has loaded and run.
        
         | mjevans wrote:
         | Also the accursed videos every site wants to cram down your
         | throat.
         | 
         | No no no.
         | 
         | Mostly text, MAYBE some images.
        
           | EForEndeavour wrote:
            | Anyone who directs their web team to implement a floating
            | picture-in-picture autoplay video whose pause and close
            | buttons either render late or don't work at all deserves
            | a special place in hell.
        
           | krylon wrote:
           | There used to be an add-on (or was it even built-in?) to
           | block loading videos _and_ images unless explicitly requested
           | by the user. I sometimes still miss that, especially on
           | mobile.
        
       | dubcanada wrote:
       | What is Amazon Publisher Services, is it a tracking script or
       | something? I've never heard of it.
        
         | tyingq wrote:
         | Basically their version of AdSense. Integrate Amazon shopping
         | ads into your content.
        
       | SimeVidas wrote:
       | Off topic: I haven't visited one million websites (yet), but I
       | can already tell you that your sticky header is annoying.
        
         | ben509 wrote:
         | Sticky Ducky is a nice plugin for that.
        
       | Grustaf wrote:
       | Is it "ads"?
        
         | fckthisguy wrote:
         | It is!
        
       | mfontani wrote:
       | And... surprise! It's analytics, tracking, and advertisements!
        
         | Kaotique wrote:
         | Improve the web and the world: remove all garbage third party
         | javascript from your site.
        
           | Mauricebranagh wrote:
            | It's more the copy-pasta of God knows how many JavaScript
            | frameworks.
        
             | underwater wrote:
             | It's really not. I ran the website for a large news org.
             | Even without ads third party JS was loading twice as many
             | bytes as our React and first party code. The YouTube embed,
             | alone, was larger than our entire first party JS.
             | 
             | What are you going to do as an engineer? Tell sales that
             | expensive marketing solution they bought isn't going onto
             | the site because it loads too much JS?
        
           | fckthisguy wrote:
           | I did just that and it's surprisingly heart warming to know
           | my site could run on a calculator if need be.
           | 
           | No JS, no tracking, no nonsense. 100% content.
        
         | georgeecollins wrote:
          | The irony is that Google has initiatives like AMP to speed
          | up page loading -- supposedly. Check out the Texas
          | antitrust filing for other possible reasons for AMP.
        
           | eMGm4D0zgUAVXc7 wrote:
           | Could you please give a TLDR for those who don't have the
           | time to read the anti-trust filing? Thanks!
        
         | agumonkey wrote:
          | And bloat. A few years ago a news website was 95% side
          | content; the article in question was basically a tweet-long
          | sentence. People want to attract attention and fill the
          | page with more stuff than "necessary".
        
           | partiallypro wrote:
           | Even with bloat you can trim 2-3 seconds off load times by
           | removing trackers. Trackers that do dynamic swaps like
           | CallRail along with LiveChat are the worst offenders.
        
         | jayd16 wrote:
         | That's the most common stuff but it doesn't seem to be the
         | slowest.
        
         | [deleted]
        
         | Mauricebranagh wrote:
          | Not really -- it's render-blocking JavaScript/CSS, lack of
          | preloading, and stupidly large images.
          | 
          | BTW, I do this for a living for some big brands; I just
          | dropped FCP from 2.6 to 2.4 and CLS from 0.478 to 0.385
          | with some tweaks to the preloads.
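Preload tweaks of this kind are declared with <link rel="preload"> hints in the document head. A generic sketch (the file URLs below are placeholders, not taken from the comment above):

```html
<!-- Preload the hero image and a critical webfont so the browser
     fetches them before the parser would otherwise discover them.
     Per spec, font preloads must carry crossorigin even when the
     font is same-origin. -->
<link rel="preload" as="image" href="/hero.jpg">
<link rel="preload" as="font" type="font/woff2"
      href="/fonts/body.woff2" crossorigin>
```

Getting the hero image fetched early is a common way to improve FCP, since it is otherwise discovered late, after CSS has been parsed.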
        
       | shadowgovt wrote:
       | The speed correlation with HTTP 0.9 and HTTP 1.0 is interesting.
       | While it is probably more the case that newer protocols are
       | serving newer content which is slower for myriad reasons, I find
       | myself wondering if there are interesting correlations regarding
        | what's being served by older protocols. Is it the case that
        | the content on the older protocols is mission-critical to
        | somebody (so it has never gone away through simple lack of
        | maintenance), and either sufficiently functional as-is that
        | nobody has seen a need to upgrade its infrastructure, or too
        | deeply tied into a consumer that assumes quirks of the
        | protocol for an upgrade to be tractable?
       | 
       | It would be interesting to get a drill down on what is being
       | served on those protocols.
        
         | giantrobot wrote:
         | I'd be willing to bet (a very small sum) that the HTTP 0.9 and
         | 1.0 servers encountered are app servers/frameworks. They're
         | simpler to implement than HTTP 1.1 and don't set expectations
         | on the client they can't meet.
         | 
         | When you've got a fleet of machines behind load balancers you
         | don't need things like a Host field to support vhosts since
         | it's one site to a host. You also don't need pipelining because
         | each connection is a one and done operation.
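The difference shows up in the wire format itself: an HTTP/1.0 request carries no Host field, while HTTP/1.1 requires one so a single address can serve many vhosts (example.com is a placeholder):

```
GET / HTTP/1.0

GET / HTTP/1.1
Host: example.com
```

An HTTP/1.1 server must reject a request without a Host header, which is exactly the kind of expectation a minimal app-server framework may prefer not to take on.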
        
         | joshspankit wrote:
         | I would read that followup article.
        
       | superkuh wrote:
       | It's in the domain name: catchjs. The reason that the web is slow
       | now is javascript. That is, treating a webpage like an
       | application instead of a document. This analysis just assumes all
       | websites are JS-application monsters and only differentiates
       | between them. It misses the point.
       | 
       | What makes it even slower is the practice of serially chain
       | loading JS from multiple domains. Once you're 3 deep in that it
       | doesn't matter how 'fast' your webserver is or how fast and clean
       | your JS code is, the serial loading will slow things to a crawl.
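The compounding effect of chained loads can be sketched with simple arithmetic. The 200 ms round-trip figure and both helper functions below are made-up assumptions for illustration, not measurements from the article:

```javascript
// Sketch: why serially chained script loads compound latency.
// RTT_MS is a hypothetical per-request round-trip time.
const RTT_MS = 200;

// A serial chain pays one full round trip per link, because each
// script is only discovered after the previous one has executed.
function serialLoadMs(chainDepth) {
  return chainDepth * RTT_MS;
}

// Scripts declared up front can be fetched in parallel, so the
// slowest single request dominates.
function parallelLoadMs(scriptCount) {
  return RTT_MS;
}

console.log(serialLoadMs(3));   // 600 -- three domains deep
console.log(parallelLoadMs(3)); // 200
```

However fast each individual server is, the serial chain's total is bounded below by depth times round-trip time, which is why three-deep chains feel slow even on fast connections.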
        
         | frompdx wrote:
         | Absolutely. Last night I had the misfortune of experiencing a
          | Comcast outage. Although I live in the city, my
          | neighborhood has horrible cellular coverage. Comcast serves
          | their outage data through their account management portal,
          | which happens to be a massive, bloated, slow JavaScript
          | app. It took minutes to load on my phone to find out if
          | there was an outage.
         | 
         | All of this for something that should be no more complicated
         | than entering a zip code and getting back a small page with
         | information about whether or not there is an outage.
        
           | nitrogen wrote:
            | I ran into the same problem last week. Customer service
            | portals are rarely given priority when resources are
            | allocated, but it's pretty astonishing just how slowly
            | such a critical service portal loads over mobile
            | tethering, given that that is going to be a primary use
            | case. I'm practically counting down the days until this
            | summer, when a new fiber ISP rollout is supposed to reach
            | me.
        
         | jakelazaroff wrote:
         | _> That is, treating a webpage like an application instead of a
         | document._
         | 
         | That's not why. It's ad networks and tracking scripts. Notice
         | that single page application frameworks show up exactly zero
         | times on the list of worst offenders.
        
         | neya wrote:
          | I miss the 90s, when HTML pages contained just super
          | informative text and a bunch of links to other super
          | informative text. Flash was fun for a while, with some nice
          | animations and interactivity. Then the JS themes took over.
          | They were double-edged swords: they enabled us to do things
          | in the browser that previously would have needed separate
          | installable desktop applications (Email, Powerpoint, etc.),
          | but they also enabled every newbie hipster to push these
          | tools for everyday pages. The value of informative plain
          | text was eroded in just under a decade and replaced with
          | subscription popups and shallowly titled articles like "4
          | ways you can do X... by the way, give me your email so I
          | can spam you".
          | 
          | I miss the marquee era, truly.
        
         | dwd wrote:
          | Every site is a JS-application monster because of the 3rd
          | party add-ons people feel they need to have.
          | 
          | Running Google or Facebook ads? You need analytics, pixels
          | and event trackers to know if your ads are working
          | optimally.
          | 
          | ReCaptcha v3 is a good way to slow down your site, as are
          | hero banner sliders (5% of sites were running GreenSock +
          | Waypoint), also some flavour of interactive chat. Some 3rd
          | party plugins simply don't work with async/defer or still
          | use document.write().
        
         | skohan wrote:
         | Is there a good solution to rendering something like a react
         | app to a static page? I feel like there are a lot of pages that
         | don't really need client-side html rendering, but they have it
         | because react is a good solution for modular web content.
        
           | andai wrote:
           | Probably not what you're after, but load the page in Chrome,
           | open the Inspector, and copy the entire HTML out. (Which will
           | be a different HTML than the empty stub that "View Source"
           | gives you.)
           | 
           | You could automate it with Selenium (headless Chrome). I
           | think Googlebot does something similar?
        
           | k__ wrote:
           | As far as I know, all major frontend frameworks can render to
           | a static HTML document.
        
             | skohan wrote:
             | How would you do that? I guess through webpack?
        
               | k__ wrote:
               | https://nextjs.org/docs/advanced-features/static-html-
               | export
        
               | jayphen wrote:
                | This outputs static HTML that gets hydrated on the
                | client (as opposed to server-rendering the HTML and
                | then hydrating it on the client), which I don't think
                | is what was being asked.
               | 
               | I think what the OP was asking was more along the lines
               | of partial hydration (where only parts of the DOM are
               | hydrated by React/other framework) or no hydration (no
               | JavaScript is loaded at all).
               | 
               | 11ty does the latter: https://www.11ty.dev
               | 
               | The React team are working on partial hydration and
               | announced it in December. Vercel did a write up on it
               | here: https://vercel.com/blog/everything-about-react-
               | server-compon...
        
           | _puk wrote:
           | There's the likes of Gatsby [0], which is generally well
           | supported and pairs well with Netlify and a headless CMS such
           | as contentful.
           | 
           | 0: https://www.gatsbyjs.com/
        
           | csomar wrote:
           | react-static (https://github.com/react-static/react-static)
           | is both good and enough. You don't need Gatsby/Next or
           | anything else.
        
           | Rumudiez wrote:
           | You can look into the Gatsby and Next JS frameworks for this,
           | among others
        
         | k__ wrote:
          | I don't know what kind of web you are using, but the sites
          | I'm visiting on a daily basis are, in fact, applications
          | and not documents.
         | 
         | Slack, Asana, GitHub, Gravit Designer, Google
         | Docs/Sheets/Mail...
        
           | smadge wrote:
           | GitHub fits the document model pretty well in my opinion.
        
             | k__ wrote:
             | And the one that doesn't require JS in that list.
             | 
             | Bad example, I give you that.
        
               | superkuh wrote:
                | Gmail also has a perfectly working non-JavaScript
                | interface. Or you can just use POP3 or IMAP with a
                | real client and avoid the web crap altogether.
        
               | rootusrootus wrote:
               | > Or you can just use pop3 or imap
               | 
               | This is my choice, except that periodically I have to go
               | back to the web interface because searching over IMAP
               | barely works at all. Maybe it's not even implemented and
               | I'm only really searching downloaded mail, I'm not sure.
        
               | postalrat wrote:
               | I'd argue that gmail without js is an application made to
               | act like a document when it just isn't.
        
           | gjs278 wrote:
           | and all of those are slow as hell
        
           | phreeza wrote:
           | The sad reality is that many of these actual applications
           | load faster than pages that could really just be pages but
           | still load vast amounts of JS...
        
             | jug wrote:
             | Static website generators in contexts where they make sense
             | (heaps of the web) are still criminally underused. :-(
        
             | k__ wrote:
             | Yes.
             | 
             | Probably because people who create "real apps" and not just
             | "docs on steroids" know what they're in for right from the
             | start.
             | 
             | If you start with a doc, you can get the feeling that there
             | is much more headroom than there actually is.
        
           | vbezhenar wrote:
            | Everyone lives in their own bubble. Among the websites
            | that I routinely visit, only Youtube could be counted as
            | a web application, but I'd argue that its main function
            | could be simplified to a <video> tag.
        
           | superkuh wrote:
           | Of those I only go to github. And for now it still works
           | without javascript execution.
           | 
            | Sites that want or have to serve everyone do not use the
            | web as an application because they know it doesn't work.
            | Gov.uk and amazon.com, for example, went out of their way
            | to work for all the people of the world. And Gov.uk has
            | found that approximately 1 in 96 actual people do not
            | have JS turned on[1].
            | 
            | For fancy businesses with other businesses as their end
            | users, you can get away with not supporting everyone. It
            | doesn't affect their income, so it doesn't matter. But
            | the reality is that JS-application sites fail 1% of
            | users, and for actually serving everyone that 1% matters.
           | 
           | [1] https://gds.blog.gov.uk/2013/10/21/how-many-people-are-
           | missi...
        
             | nitrogen wrote:
             | _And Gov.uk has found approximately 1 in 96 actual people
             | do not have JS turned on[1]._
             | 
             | It's challenging to convince a PM that people who block
             | stats don't show up in their stats, but do still have money
              | to spend. Often, a simple try/catch in the onclick
              | handler that fires analytics events, and a quick
              | happy-path test with uBlock on, is all it would take to
              | fix a site. Well worth the 1% of extra revenue for a
              | few minutes' effort.
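A minimal sketch of that try/catch pattern; safeTrack and the names below are hypothetical stand-ins, not a real analytics API:

```javascript
// Sketch: best-effort analytics. If the tracker is blocked or
// broken, the click's real action still runs.
function safeTrack(fireAnalytics, action) {
  try {
    fireAnalytics(); // may throw when a blocker has stubbed it out
  } catch (e) {
    // Swallow the error: tracking must never break the navigation.
  }
  return action();
}

// Simulated blocked tracker: the user still gets where they were
// going, so the blocked-stats visitor can still spend money.
const outcome = safeTrack(
  () => { throw new Error('analytics blocked'); },
  () => 'navigated'
);
console.log(outcome); // "navigated"
```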
        
               | jandrese wrote:
               | I suspect that 1 in 96 figure is mostly people who have
               | uMatrix or NoScript installed, not people who have
               | disabled JavaScript entirely. A little judicious blocking
               | can greatly improve browsing experience at the cost of
               | having to set up rules for sites when you first visit
               | them sometimes. It can be a real hassle when someone has
               | embedded a video you want to watch and there are like 8
               | layers of embedded scripts from different domains
               | necessary before you get to the actual video content.
        
               | edoceo wrote:
                | The math is a $500 expense today for the test code,
                | $0.01 each time the test runs, and +$ for each client
                | acquired. At $20/mo MRR it's 3 clients/year. Easily
                | pays for itself.
        
             | capitainenemo wrote:
              | Gmail works without JS too. I use it routinely with
              | w3m.
        
         | newsbinator wrote:
         | If everybody would learn to voice-control their smart phones
         | and receive 3x speed audio feedback, battery life would shoot
         | through the roof, since illuminating the screen is so costly,
         | and you'd never need to do it.
         | 
         | Blind people use smartphones this way, so it's not as though
         | the device can't support it.
         | 
         | And it would encourage all websites to be accessible, content-
         | only, no-image sites, and they'd be no-javascript as well.
         | Problem solved.
        
           | chrisco255 wrote:
           | How is this going to solve the chief bandwidth problem on the
           | internet of watching funny cat videos and perusing dank
           | memes?
        
             | [deleted]
        
         | dimitrios1 wrote:
         | I am sorry to say but this is an overly simplistic answer.
         | We've had Javascript since 1995. We've been building great
          | applications with it as early as the 2000s, Yahoo being at
          | the forefront. We had complex charting applications and
          | dashboards
         | in an era where JavaScript was still interpreted by the latest
         | cutting edge browser: Internet Explorer 6. This was also a time
         | of much slower internet speeds across the board.
         | 
         |  _Bad_ software development slows the web down. The web, for
         | what it is and does, is an incredibly fast network. The problem
         | is people no longer care about performance, nor do you have the
          | engineering talent to write performant web applications.
          | Yes, some of this can be boiled down to companies not
          | prioritizing performance, and product managers pushing for
          | more features and tracking, but we've always had these
          | constraints, and tight deadlines, and seemed to deliver
          | just fine. It was part of the expectations.
          | 
          | Our craft is degrading, I hate to say. We've allowed the
          | vocal minority who were shouting "gatekeepers!" to water
          | down the discipline of software development to the point
          | where, yes, you install a single npm dependency so you can
          | reverse your array.
        
           | MaxBarraclough wrote:
           | > The problem is people no longer care about performance, nor
           | do you have the engineering talent to write performant web
           | applications.
           | 
           | I don't think that's the whole story. I don't have hard
           | numbers but Electron-based applications seem to reliably be
           | far more bloated than applications using ordinary GUI
           | frameworks.
           | 
           | I agree that deprioritising performance is part of the
           | problem, but as far as I can tell the web is a very poorly
           | performing GUI framework, when it's used as such.
        
           | commandlinefan wrote:
           | > the engineering talent to write performant web applications
           | 
           | Which, for the most part, is celebrated. Nearly nobody
           | actually understands what they're doing anymore - I challenge
           | anybody to point to the Javascript in their web app and
           | explain what the purpose of each file is. Just the purpose,
           | just the top level files. I doubt the majority of web devs
           | can (if they could, they'd realize they don't need at least
           | half of them).
        
           | joosters wrote:
           | _Bad software development slows the web down_
           | 
           | This is an empty statement and another overly simplistic
           | answer. _Of course_ bad software development slows the web
           | down, because you 're defining slow web sites as being badly
           | developed.
        
             | dimitrios1 wrote:
              | Touché -- I am attempting to make a distinction between
              | the general statement "Javascript slows the web down"
              | and a more specific one: a certain type of JavaScript
              | slows the web down, namely lazy and careless
              | JavaScript. We reach for the dependency _first_ instead
              | of carefully analyzing our requirements and making an
              | engineering decision. I've seen it countless times, and
              | fallen victim to it a few times myself. It's really
              | easy to do. No one stops to ask, "perhaps we do not
              | need a hard dependency on that 15kb library, but rather
              | just a few functions from it?" or "perhaps a simple
              | algorithm would indeed suffice." We mindlessly reach
              | for complex regex engines to solve simple parsing
              | problems, or try to garner "functional programming
              | cred" by reaching for a series of collection functions
              | from our favorite 50kb utility package when a simple
              | c-style for loop would have not only sufficed, but
              | indeed got the job done much faster.
              | 
              | All this results in bloated, unoptimized JavaScript
              | bundles. To top it off, we under-cache, aggressively
              | cache-bust, and ship unminified bundles with oodles of
              | source maps because of "developer experience". I can't
              | tell you how many times I see, to this day, a Fortune
              | 500 company bundling an entire unminified JavaScript
              | framework. Wirth's law indeed.
              | 
              | Hope this was a little less empty. These are my
              | observations, having done this for a few decades now
              | and managing developers these days.
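The "few functions instead of a 50kb package" point can be sketched like this; sumBy and the bundle data are hypothetical stand-ins for what would otherwise be a utility-library import:

```javascript
// Sketch: a hand-written helper replacing a utility-package import.
// A plain c-style loop is all that's needed to sum one field.
function sumBy(items, key) {
  let total = 0;
  for (let i = 0; i < items.length; i++) {
    total += items[i][key];
  }
  return total;
}

// Hypothetical bundle sizes, in kilobytes.
const bundles = [
  { name: 'vendor.js', kb: 300 },
  { name: 'app.js', kb: 120 },
];
console.log(sumBy(bundles, 'kb')); // 420
```

Ten lines of first-party code versus shipping an entire utility package to every visitor is exactly the trade-off being described.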
        
               | joosters wrote:
               | I agree with all you are saying, but I wonder how much
               | this actually comes down to software developers. I
               | suspect (but have absolutely no data!) that for many
               | sites, the pressure to add extra stats, tracking,
               | advertising, monitoring, etc etc, comes from 'above' and
               | it's not the developers' choice to add these mountains of
               | javascript libraries to their sites. Perhaps they should
               | push back, but what leverage do the developers have?
        
               | trhway wrote:
                | Pretty much like enterprise software. The similar
                | situation allows for a conclusion that is obvious to
                | many: there is no money in performance, or more
                | exactly, the opportunity cost of performance is
                | higher than the benefit.
        
             | hansel_der wrote:
                | > because you're defining slow web sites as being
                | badly developed
                | 
                | What's wrong with that?
                | 
                | Having lived through computers, the internet and the
                | web for 25 years, I can attest that there is truth in
                | Wirth's law, if only because folks are lazy.
        
               | vlovich123 wrote:
                | OP is pointing out that the argument is tautological
                | and doesn't add anything insightful to the
                | discussion. All tautological statements are true.
               | 
               | You could similarly write:
               | 
               | "Lazy software development is why websites are slow
               | because slow websites are caused by lazy development."
               | 
               | "Greedy business practices are why websites are slow
               | because advertising slows down the web and advertising is
               | the result of greedy business practices".
               | 
               | These are feel good statements that don't actually
               | provide any insight on the market forces at play nor
               | provide any path to improving things.
        
               | dimitrios1 wrote:
               | It's a misreading of the point I was making. I was
               | refuting the claim "javascript slows the web down" by
               | qualifying it with "bad javascript slows the web down"
        
               | nightski wrote:
               | Right, but then you conclude that it is just incompetent
               | engineering. That may be true in some cases, but in a lot
               | of cases it's because it's not worth it to the business
               | in terms of time or resources to do good engineering.
               | Great engineering is not just about having great
               | engineers, but a huge budget because it's very, very
               | expensive.
        
               | dimitrios1 wrote:
                | Whatever the causes may be, the end result is poor
                | engineering. I didn't say incompetent, but rather
                | lazy and lower quality. Reading further into my
                | point, I mention that often this carelessness comes
                | as a result of businesses pressing ahead without
                | allocating proper resources. But nonetheless, all of
                | a business's choices -- hiring decisions, budget
                | decisions, scheduling decisions, feature
                | prioritizations -- have the same result: the root
                | cause of poor engineering.
        
               | Xevi wrote:
               | > "Whatever the causes may be, the end result is poor
               | engineering. I didn't say incompetent, but rather lazy
               | and lower quality."
               | 
               | The engineering might be perfectly fine if you take into
               | account the deadlines, budgets and requirements from the
               | clients/managers.
               | 
               | If you pay someone to build a house on a short deadline
               | with a small budget, then you'll obviously get a crappy
               | house. But the skill that went into building that house
               | in such a short time, and on such a low budget, might be
               | extremely high.
               | 
               | Good engineering doesn't mean a perfect product. It just
               | means you managed to deliver the best possible one out of
               | the situation you were in. It's up to the owner of the
               | product to decide if it's good enough.
        
               | chriswarbo wrote:
               | No _true_ scotsman would slow the Web down
        
               | dimitrios1 wrote:
               | argumentum ad logicam. And around and around we go.
        
               | hansel_der wrote:
               | i see. thx for explaining and sry for the noise
        
             | TimPC wrote:
             | I'll try and offer the more controversial point.
             | 
             | Good software development can slow the web down. People are
             | often optimizing for code reusability, code quality, ease
             | of debugging and other advantages of some heavyweight
             | frameworks rather than raw load time. Often this is the
              | right call, as engineering is a cost centre. Where it goes
              | wrong is when things get so slow and bloated that people
              | actually abandon the site before it loads, but most of the
              | major frameworks, used properly, don't force that extreme a
              | tradeoff. I also have no idea what % of these sites have
              | reached that level.
        
         | [deleted]
        
         | gfiorav wrote:
         | Agreed with this. I dabbled in React for a while, but then I
         | realized how flawed the concept is (it's like downloading a
         | full .dll every time you run a program).
         | 
         | I always ask myself if something can be static now, and if so,
         | I make it static.
         | 
         | I recently worked on a side project to transition [0] to a
          | fully static site. Needless to say, the speed went up.
         | 
         | [0]- https://beachboyslegacy.com/
        
         | 1vuio0pswjnm7 wrote:
         | James Mickens on Javascript:
         | 
         | https://vimeo.com/111122950
         | 
          | Here's a stupid, simple Vimeo downloader instead of using the
          | massive, slow-starting youtube-dl:
          | 
          |     #!/bin/sh
          |     # usage: curl https://vimeo.com/123456789 | $0
          |     x=$(curl -s `grep -m1 -o 'https://player.vimeo.com/video/[^"?]*' | sed 's>$>/config>'` | grep -o 'https://[^"]*mp4' | sed -n '$p')
          |     y=${x%/*}; y=${y##*/}
          |     exec curl -so "$y.mp4" "$x"
         | 
         | More from Mickens:
         | 
         | https://www.usenix.org/legacy/events/webapps10/tech/full_pap...
         | 
         | https://www.usenix.org/system/files/1403_02-08_mickens.pdf
         | 
         | Unlike Mickens, I cannot save the world, and I am not telling
         | anyone else what to do or not to do, but I made the web fast
         | for myself. Hence I am very skeptical of claims that "the web
         | is slow". Web servers, the network and computers are plenty
         | fast and still getting faster. I do not define "the web" as
          | certain popular browsers, CSS, Javascript, etc., or whatever
          | web developers tell me it is. Those are someone else's follies. I
         | define it as hyperlinks (thus, a "web") and backwards
         | compatible HTML. Stuff that is reliable and always works. To
         | "make the web fast", I follow some simple rules. I only load
         | resources from one domain, I forgo graphics, and I do not use
         | big, complex, graphical, "modern" web browsers to make HTTP
         | requests.
         | 
         | I do not even use wget or curl (only in examples on HN). I
         | generate the HTTP myself using software I wrote in C for that
         | purpose and send using TCP clients others have written over the
         | years. There are so many of them. With a "modern" SSL-enabled
         | forward proxy or stunnel, they all work with today's websites.
         | "Small programs that do one thing well", as the meme goes.
         | 
         | Obviously, I still need the ever-changing, privacy-leaking,
         | security risk-creating, eye-straining, power-consuming, time-
         | wasting, bloated, omnibus browsers for any sort of serious
          | transaction done with the web, e.g., commerce, financial, etc.
         | However that is a small fraction of web use in my case.
         | 
         | For me, using the web primarily comprises searching, reading
         | and downloading. I never need Javascript for those tasks. I can
          | do them faster without the popular browsers than I can
         | with them. The less interaction the better. I use automation
         | where I can because IMO that is what computers were made for.
         | "The right tool for the job", as the meme goes.
         | 
          | To think how much time and energy (kWh) has been devoted to
         | trying to make Javascript faster as a way to make websites
         | faster is, well, I won't think about it. Those working in the
         | "software industry" and now "tech" are highly adept at creating
          | the problems they are trying to solve. Unfortunately today as we
         | try to rely on software and the web for important things, we
         | all have to suffer through that process with them.
         | 
         | By not using the popular browsers for a majority of web use, I
         | have minimised the suffering of one user: me. The web is fast.
        
           | 1vuio0pswjnm7 wrote:
           | The title of this blog post refers to "the web" but it mainly
           | discusses "rendering". IMHO, those are two different things.
            | The latter is concerned with graphical Javascript- and CSS-
           | enabled browsers. The former is concerned with web servers.
        
         | dgb23 wrote:
         | > It's in the domain name: catchjs. The reason that the web is
         | slow now is javascript. That is, treating a webpage like an
         | application instead of a document. This analysis just assumes
         | all websites are JS-application monsters and only
         | differentiates between them. It misses the point.
         | 
          | A large part of the web is static documents, and they should be
         | developed as such.
         | 
         | But I 100% disagree that this is how every website should be.
         | We're given these amazing technologies, the internet,
         | computers, libraries, tools and creativity and now we should
         | just stick with sending and looking at formatted text?
         | 
         | I don't think so!
         | 
         | https://ciechanow.ski/gears/
         | 
         | https://www.maria.cloud/
         | 
         | https://nextjournal.com/
         | 
         | > What makes it even slower is the practice of serially chain
         | loading JS from multiple domains. Once you're 3 deep in that it
         | doesn't matter how 'fast' your webserver is or how fast and
         | clean your JS code is, the serial loading will slow things to a
         | crawl.
         | 
          | Bundling, code splitting, tree shaking, asynchronous loading,
          | and other pre-optimizations help here too.
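          | For example (the script URLs here are hypothetical), marking
          | independent scripts async or defer lets the browser fetch them
          | in parallel instead of chaining them serially:

```html
<!-- async: fetch in parallel, execute whenever ready (order not guaranteed) -->
<script async src="https://cdn.example.com/analytics.js"></script>
<!-- defer: fetch in parallel, execute in order after the document is parsed -->
<script defer src="https://cdn.example.com/app.js"></script>
```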
        
       | btbuildem wrote:
       | My guess would've been 1) ads / tracking and 2) bloated JS
       | frameworks.
        
       | ffpip wrote:
       | https://ublockorigin.com
       | 
       | Install this wonderful extension to save hours of your time and
       | gigabytes of your bandwidth.
        
         | kowlo wrote:
         | Any good alternative for Safari?
         | 
         | I knew I would get alternative browser recommendations because
         | of my phrasing... but I was also looking forward to them!
        
           | ben509 wrote:
           | I run Wipr on Safari. It's quite cheap, and since I don't use
           | Safari as my main browser, the fact that there's nothing to
           | configure is a bonus.
           | 
           | It was also easier to install it on my parents' computers
           | than to convince them to change browsers.
        
           | alpaca128 wrote:
           | Not sure how applicable this is to Macs but I found that
           | blocklists based on the hosts file cover almost all ads. The
           | only exceptions I encountered were YouTube ads as well as ads
           | hosted directly on the website, which is pretty rare
           | nowadays.
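            | Concretely (hostnames here are made up), such blocklists
            | append entries like these to the hosts file, so lookups for
            | ad and tracker domains resolve to an unroutable address and
            | the requests fail immediately:

```
# /etc/hosts entries (example hostnames only)
0.0.0.0 ads.example.com
0.0.0.0 tracker.example.net
```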
        
           | [deleted]
        
           | dev_tty01 wrote:
           | As far as Mac Safari recommendations, I've tried most of
           | them. _Better_ seems the leanest and most effective for me. I
           | have no relationship with the developer.
           | 
           | https://better.fyi
           | 
           | From their website: "Better is hand-crafted by Small
           | Technology Foundation, a tiny two-person-and-one-husky not-
           | for-profit striving for social justice in the digital network
           | age. We curate our own unique blocking rules for Better based
           | on the principles of Ethical Design."
           | 
           | Nice app.
           | 
           | Be aware, Better actually allows most compact, low compute,
           | non-tracking ads. Anyone who wants to serve me respectful ads
            | that don't abuse my privacy or my compute resources is
           | absolutely welcome on my system. Happy to help. Non-
           | respectful ads are not welcome.
           | 
           | While I'm promoting small Indie browser extension makers, I
           | also like the StopTheMadness extension. This kills lots of
           | rude click/function hijacking that is done by many obnoxious
           | web pages. It also stops a lot of tracking code. Again, I
           | have no relationship with the developer.
           | 
           | https://underpassapp.com/StopTheMadness/
           | 
           | Between the two of these, browsing becomes much less user
           | hostile.
        
           | Xavdidtheshadow wrote:
           | AdGuard is a free extension that, as far as I can tell, is
           | on-par with uBlock Origin. It's been great so far.
        
             | benbristow wrote:
             | I've flip-flopped between uBlock and AdGuard for a while
              | now. I generally find AdGuard to be slightly better, and it
              | has a much nicer UI.
             | 
             | Also has a neat broken site reporting system in it which
             | automatically generates a GitHub issue to fix the filter
             | lists from a simplified form. Automatically prioritises
             | sites via an algorithm, probably their Alexa ranking or
             | something similar.
             | 
             | https://github.com/AdguardTeam/AdguardFilters/issues
             | 
             | I've found the issues get fixed pretty quickly too.
             | 
             | They've got an iOS app as well which integrates with the
             | Safari content blocking system.
        
               | beagle3 wrote:
               | uBlock or uBlock Origin? They are not the same thing.
        
               | benbristow wrote:
               | Sorry, uBlock Origin
        
           | xtracto wrote:
           | /etc/hosts
           | 
           | https://github.com/Ultimate-Hosts-
           | Blacklist/Ultimate.Hosts.B...
        
           | whycombagator wrote:
            | For iOS: download Firefox Focus and set it as Safari's
            | content blocker.
           | 
           | For desktop & mobile: use something like nextdns/adguard or
            | pihole & consider installing it on your home network's router.
        
             | r00fus wrote:
             | Wow, never knew FF Focus could be a content blocker on iOS.
             | I'll try it over Wipr.
        
             | beagle3 wrote:
             | I installed Magic Lasso and Firefox independently, and also
             | together, and together they were basically perfect.
             | 
             | (It's been 2 years or so, for all I know, they have both
             | advanced to the point of being sufficient alone - but I did
             | not experiment, since everything just works so well)
        
             | [deleted]
        
           | viktorcode wrote:
           | I've developed my own blocker for Safari, mostly because I
           | was tired of other blockers breaking too many sites for me.
           | So, it is designed to be less aggressive in filtering.
           | 
           | Give it a try if you want. It's free. https://ads-
           | free.app/Ads-Free!%20Desktop/
        
           | ksec wrote:
           | Just use NextDNS. DNS blocking is just so much more
            | efficient. And it gets the job done 98% of the time.
        
           | hetspookjee wrote:
            | Firefox. But kidding aside, from what I've found, ad blockers
            | for Safari require payment, and I believe they are quite
            | inferior to uBlock in what they block.
        
             | S_A_P wrote:
              | Have to agree here. I actually quite like Safari, but can't
              | use it due to no uBlock Origin, which is unfortunate. uBlock
             | and Facebook Container have made all the difference for me.
        
             | jws wrote:
             | If you are in the "I wish Safari blocked ads" boat but
             | don't know where to go, let me recommend an out...
             | 
             | There are a lot of reasonable choices, I've been using Wipr
             | for years. It does a good job of blocking ads and costs
             | about $2. There are macOS and iOS versions, they update
              | their lists, and because of the architecture it has no access
              | to your browsing, so it can never be tempted to start farming
              | you.
             | 
             | (No affiliation, I haven't done an extensive comparison,
             | but this one works and isn't expensive and you can stop
             | worrying about what to do if you do this now.)
        
           | rndomsrmn wrote:
           | https://github.com/notracking/hosts-blocklists
           | 
           | Use this for network wide blocking of all sorts of virtual
           | garbage. Not only for safari, but all your locally connected
           | devices.
        
           | stirner wrote:
           | I use AdGuard, a content blocker that supports the major ABP
           | lists like EasyList, Fanboy's, etc. I wish there was a
            | content blocker that supported automatic translation of ABP
           | lists into content blocker rules--I've been working on one
           | but haven't found the time to finish it.
        
             | dawnerd wrote:
             | To expand on adguard, you can self host their dns blocker
              | via docker. It works really well and is IMO a much nicer
             | experience than trying to get pihole perfect.
        
           | hpoe wrote:
           | I find Firefox to be a great alternative for Safari.
           | 
            | I joke, but from what I know uBlock Origin is only available
            | for Firefox, and it either will soon stop working on Chrome or
            | does not work there already.
        
             | nawgz wrote:
              | uBlock Origin works fine on Chrome today as it has for
             | years. With their new "no third party cookie" BS we might
             | see it break but until then I don't think they can be so
             | user hostile
        
           | alcasa wrote:
            | AdGuard; it's also usable for free and doesn't have shady
           | deals with ad companies.
        
         | catacombs wrote:
         | It's 2021. Who isn't using an ad blocker?
        
           | paxys wrote:
           | 75% of internet users
        
           | vbezhenar wrote:
           | I don't use uBlock. But I learned how to write chrome
           | extensions and it turned out extremely easy to insert my own
            | CSS and JS snippets to the selected pages. So I just added a
            | few URL filters to remove the most obnoxious tracking and ads, I
           | added very few CSS edits to the selected websites to remove
           | popups and I added some JS to youtube to remove its ads. Web
           | is pretty fast and usable for me. I did not cut every ad, but
           | I don't often browse new websites and I'm okay with some ads
           | as long as they're not very bad.
           | 
           | The reason I don't use uBlock is because I think that it's
           | overkill for me to run thousands of filters for every website
           | in the world. And also I like the fact that I'm in control of
            | my user agent. For example, recently I turned off a feature on
            | some website which paused the video when I switched to another
            | tab. I did not like that feature, so I disabled the
            | corresponding JS handler, simple as that.
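            | A minimal sketch of that setup (all names hypothetical): in a
            | Manifest V2 extension, injecting your own CSS/JS into
            | selected pages takes little more than a manifest pointing at
            | the snippet files:

```json
{
  "manifest_version": 2,
  "name": "my-tweaks",
  "version": "1.0",
  "content_scripts": [{
    "matches": ["*://*.example.com/*"],
    "css": ["tweaks.css"],
    "js": ["tweaks.js"],
    "run_at": "document_end"
  }]
}
```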
        
             | eMGm4D0zgUAVXc7 wrote:
             | > The reason I don't use uBlock is because I think that
             | it's overkill for me to run thousands of filters for every
             | website in the world.
             | 
             | In return you get thousands of lines of tracking and
             | advertisement JavaScript running on your machine for almost
             | every website in the world.
             | 
             | Is that better? ;)
        
           | commandlinefan wrote:
           | I don't use an ad blocker - I feel bad since ad revenue is
           | the only thing most of these sites have (OTOH, I don't run
           | ads on my own blog because I don't like what the ad-supported
           | internet has become).
        
           | hansel_der wrote:
            | I have been pressing this for around 15 years; meanwhile the
            | percentage of adblock users actually got smaller.
            | 
            | I wonder if the fact that the most popular ad-tech company
            | also produces the most popular mobile OS has something to do
            | with it.
        
             | Leherenn wrote:
              | Does this number include mobile browsers? Given that the big
              | mobile browsers do not have adblockers (or even extensions
              | at all?), that would explain it.
        
               | jamesgeck0 wrote:
               | Mobile Safari has had content blockers for years.
        
               | beagle3 wrote:
               | And yet, surprisingly, almost no one is aware of it.
               | 
               | I have informed many of my technophile, uBO/ABP friends
               | and colleagues that it's possible, and not one of them
               | was aware before I told them.
        
       | sneak wrote:
        | > _There's a handful of scripts that are linked on a large
       | portion of web sites. This means we can expect these resources to
       | be in cache, right? Not any more: Since Chrome 86, resources
       | requested from different domains will not share a cache. Firefox
       | is planning to implement the same. Safari has been splitting its
       | cache like this for years._
       | 
       | [most of the top common resources are Google (and FB) ad
       | tracking]
       | 
       | I read this as Google is willing to spend millions upon millions
       | to move huge amounts of additional unnecessary network traffic to
       | make sure _only they_ can reliably track most people across the
       | whole of the web.
        
         | damagednoob wrote:
         | I don't use any externally hosted scripts on sites I develop.
         | 
         | Security is part of the reason but the bigger problem I find is
         | tying uptime to somebody else whom I have no control over.
        
         | minitoar wrote:
         | Gotta spend money to make money
        
         | ivanbakel wrote:
         | Even if Google didn't have a stake in targeted advertising,
         | shared caches lead to easy identification - browsers have a
         | duty to close off any avenue by which users can be tracked.
        
       | tyingq wrote:
       | JQuery being more prevalent than Google Analytics is a surprise
       | to me.
        
         | yurishimo wrote:
         | jQuery is a default dependency for many WordPress blogs. Not
         | all of those are run by folks who want/care about analytics.
        
           | joshspankit wrote:
           | I had hoped at some point in the breakdown that they
           | differentiated between Wordpress and.. everything else I
           | guess.
           | 
           | Wordpress itself probably skewed the numbers significantly.
        
           | tyingq wrote:
           | Sure. I was assuming that culling to the "top million" would
           | skew things in favor of GA. Clearly it didn't, but I was
           | surprised. And you're probably right. All those Wordpress
            | instances do drive a lot of the stats.
        
       | wruza wrote:
        | Skimming through the article didn't give a meaningful tl;dr -
        | it's complicated and there is no prevalent answer. Personally I
        | got two (naive) things:
        | 
        |     - latency is bad, do a single request at start
        |     - don't use jquery
        |     - ui frameworks are not part of the issue(?)
        
         | nearbuy wrote:
         | They found jQuery is correlated with longer times to
         | interactivity. It doesn't necessarily cause it.
         | 
         | The latest jQuery is 85 KB (~30 KB gzipped), and doesn't do
         | anything time consuming on load. My guess is the types of sites
         | that don't use jQuery happen to be the types of sites that are
         | faster.
        
         | franga2000 wrote:
         | jQuery really isn't a problem - it's just a light-ish wrapper
         | around the native JS APIs. The problem is JS itself or rather
         | the fact that it's used where it really shouldn't be.
        
           | mjevans wrote:
           | jQuery isn't the problem as much as jQuery "needing" to exist
           | is the problem.
           | 
           | Developers still have to make webpages that work with
           | customers that are stuck on old phones which will never see
           | another firmware update, corporate desktops that have some
           | forsaken frozen copy of IE6 that gets used for some legacy
           | corporate platform (and then end users either don't or
           | actually aren't allowed by IT/security to use any other
           | browser), people with PCs that 'still work' (and if you're
           | lucky have Windows XP with whatever it came with)... etc.
           | 
           | Generally we can't have nice things because that legacy cruft
           | is going to be around until all of that legacy falls over and
           | dies so hard that even laypeople consider it laughable to
           | assume it still works. It's got to be like those old blenders
           | or fans... wait they still work; we're cursed forever.
           | 
            | Want the Internet to be fast? jQuery is a de facto standard
           | that needs to be made native in browsers and then when called
           | for not loaded as script but already existing as native code
           | and management models.
        
             | franga2000 wrote:
             | Are those legacy-supporting sites really the slow ones
             | though? Targeting old software usually also means targeting
             | old hardware, so those sites tend to perform reasonably
             | well (and in general, old browsers didn't give developers
             | many options to slow things down).
             | 
             | The problem really starts once developers can start
             | ignoring old platforms and (ab)using the full power of a
             | modern browser. The problem isn't any specific library, but
             | the mindset developers use to develop web sites/apps.
             | 
              | Bundling 20 dependencies and 10k lines of "compiled" JSX
              | into one giant 5MB uncachable blob that needs to be parsed
              | in its entirety, which then fires off some requests that
              | need to complete and be parsed, which then compiles some
              | HTML and CSS that in turn need to be parsed by the browser,
              | until finally the browser's rendering engine gets to start
              | laying things out and drawing the first pixels - that is
              | the problem.
             | 
             | The browser is a document viewer, not an application
             | runtime, so you should be giving it documents, not
             | programs. Yes, JavaScript is necessary, but just like you
             | shouldn't be using HTML for layout and CSS for content and
             | graphics (although CSS-only art is impressive!), you
             | shouldn't be using JavaScript for content and styling.
        
           | wruza wrote:
            | It's understandable that "keep it simple" is a good technical
            | rule. But it is also interesting for app developers who have
            | to use some kind of in-place DOM modifications. Personally, I
            | don't (yet) see how an in-browser HTML parser could be much
            | faster than createElement from e.g. hyperscript driven by a
            | pojo. It is basically the same recursive process, with the
            | exception that small updates generate small modifications in
            | the latter.
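            | As a runnable sketch of that idea (h() and render() are
            | illustrative names, not any real library; rendering to an
            | HTML string stands in for document.createElement so the
            | example runs anywhere):

```javascript
// Hyperscript-style builder: plain objects (pojos) describe the tree.
function h(tag, attrs = {}, ...children) {
  return { tag, attrs, children };
}

// Recursive renderer; in a browser this walk would call
// document.createElement/appendChild instead of building a string.
function render(node) {
  if (typeof node === 'string') return node;
  const attrs = Object.entries(node.attrs)
    .map(([k, v]) => ` ${k}="${v}"`)
    .join('');
  const inner = node.children.map(render).join('');
  return `<${node.tag}${attrs}>${inner}</${node.tag}>`;
}

// A small update to the pojo tree implies a correspondingly small
// modification on the next render.
const tree = h('ul', { class: 'nav' },
  h('li', {}, 'Home'),
  h('li', {}, 'About'));
// render(tree) → '<ul class="nav"><li>Home</li><li>About</li></ul>'
```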
        
             | giantrobot wrote:
             | With jQuery (as an example) when you change elements it
             | does so on the live DOM. This often causes the browser to
             | re-layout, re-render, and re-draw content. If you create a
             | thousand elements in a loop that's tons of extra work for
             | the browser and blocks interactivity. You can do batch
             | updates or non-display elements but it's not (or at least
             | wasn't) built into jQuery. It's faster but more
             | complicated.
             | 
             | With React and the like they operate on a copy of the DOM
             | and automatically batch updates to it. All of the changes
             | are applied at once so there's fewer layout, render, and
             | draw events.
             | 
             | Part of the problem with jQuery is there's a cohort of web
             | "developers" that grew up with it. Instead of learning the
             | actual web stack they learned jQuery. Everything is put
             | together (slowly) with jQuery. Even if they've been
              | promoted out of junior positions, they're now PMs or
             | marketing drones requiring slow scripts because they're
             | written in a style they understand.
        
       | spankalee wrote:
       | dominteractive isn't a good metric for this. Many sites are
       | perceivably slow because of rendering work they're doing after
        | dominteractive. They should be looking at more user-centric
       | metrics like Largest Contentful Paint: https://web.dev/lcp/
        
       | eric_trackjs wrote:
       | Off-topic but I'd never heard of CatchJS before. As the founder
       | of TrackJS[1] I can't help but feel they were heavily inspired by
       | our product... almost _too_ inspired considering their logo and
       | marketing copy.
       | 
       | (We've been around since 2014, so we pre-date them by 4 years)
       | 
       | [1] https://trackjs.com
        
         | morgosmaci wrote:
          | Wait, a company logo about javascript that uses {}? Maybe thrown
         | into a single color circle? Mild Shock.
         | 
         | Edit: I will give you that their web page header and pricing
         | looks very similar.
        
         | trevor-e wrote:
         | You both have an incredibly generic logo/header/copy template
         | so I'm not really sure what you're trying to imply. Your site
         | copies the format of Stripe and they were around in 2010, pre-
         | dating you by 4 years.
        
       | Kaotique wrote:
       | It really shows why big tech own the keys to the internet. We
       | hand them all our browsing history.
        
         | wruza wrote:
         | The sites we visit do. I didn't tell spammers my phone number
         | either, but every org that requires that field to be filled
          | thinks it has an obligation to sell it to them. Skipping a few
          | steps, it is a trust problem. We don't trust each other, thus
          | some money has to be spent on being the most trusted and most
          | wanted on open markets/SERPs. We have to trust big tech because
          | staking money is the only way to earn that trust, and the
          | biggest money concentrates at big tech. What should be done?
        
       | randompwd wrote:
        | It's a blog article, yet it has no date on it. Why are these
        | sites so obnoxious and secretive? Let us know when the blog was
        | written.
       | 
       | 34 days ago:
       | 
       | https://news.ycombinator.com/item?id=25517628 (115 comments)
       | 
       | https://itnext.io/we-rendered-a-million-web-pages-to-find-ou...
       | (Dec 22, 2020 - repost from CatchJS)
       | 
       | WRITTEN BY
       | 
       | Lars Eidnes Chief error catcher at https://catchjs.com
        
       ___________________________________________________________________
       (page generated 2021-01-26 23:01 UTC)