[HN Gopher] Speedometer 3.0: A shared browser benchmark for web ...
___________________________________________________________________
Speedometer 3.0: A shared browser benchmark for web application
responsiveness
Author : cpeterso
Score : 128 points
Date : 2024-03-11 16:17 UTC (6 hours ago)
(HTM) web link (browserbench.org)
(TXT) w3m dump (browserbench.org)
| Vinnl wrote:
| Pretty neat:
|
| > This is the first time the Speedometer benchmark, or any major
| browser benchmark, has been developed through a cross-industry
| collaboration supported by each major browser engine: Blink/V8,
| Gecko/SpiderMonkey, and WebKit/JavaScriptCore.
| kossTKR wrote:
| Hopefully at some point actual click latency will be fixed in
| general after some dark decades.
|
| Still incredible that a gameboy or an 80's computer with a CRT
| feels more responsive than most devices these days.
|
| Bring back tactility. I'm convinced the choppiness and weird
| waits are actually psychologically stressing us out. That's why
| good keyboards + old low latency OS'es or typewriters are so
| soothing to use.
| coldblues wrote:
| Latency will always be an issue as long as developers use web
| technologies. Nothing beats native.
| CharlesW wrote:
| > _Hopefully at some point actual click latency will be fixed
| in general after some dark decades._
|
| Meaning, the 300ms delay (if the site developer does no
| optimization) with mobile browsers?
| tentacleuno wrote:
| Context for the 300ms delay:
| https://developer.chrome.com/blog/300ms-tap-delay-gone-away/
| CharlesW wrote:
| Some other ways to mitigate:
| https://www.sitepoint.com/5-ways-prevent-300ms-click-delay-m...
| olliej wrote:
| What is the latency being referred to here?
|
| I don't see noticeable lag on pressing/tapping buttons or other
| ui components in day to day browsing, even on my quite old
| iPhone.
|
| There are obviously ways to make delays in web content anyway
| (user action->synchronous network request being the canonical
| one), but assuming there's nothing silly like that lag isn't an
| issue I've noticed.
|
| Actual execution latency is something I worked on for many
| years in JSC, and so there are a lot of engine optimizations to
| reduce that latency as much as possible (the interpreter
| itself, the interpreter performance, byte code caches,
| hilarious amounts of lazy parsing and source skipping, etc) so
| even the first time you have a ui element trigger code there
| shouldn't be any significant delay.
|
| Obviously if a developer makes poor choices there's only so
| much you can do, but by and large there aren't that many bad
| things a web developer can do that a native dev can't also do
| (and devs in both environments frequently do :-/).
| awesomekling wrote:
| This is fantastic! Speedometer 1.0 was a breath of fresh air, and
| 2.0 was a much-needed refresh, but it's really been showing its
| age in recent years. 3.0 looks like a solid upgrade with many new
| kinds of sub-tests, contemporary frameworks, etc.
|
| I'm looking forward to sucking at this, and then slowly and
| systematically improving. :^)
| aeyes wrote:
| I'd love some videos seeing you try to improve the score or
| trying to get it to run :^)
| riquito wrote:
| Link to the actual test
|
| https://www.browserbench.org/Speedometer3.0/
| ornornor wrote:
| On Firefox iOS: open in a new window because you won't be able
| to press back to come back to this discussion easily.
| ChrisArchitect wrote:
| Related announcement post:
|
| _Improving Performance in Firefox and Across the Web with
| Speedometer 3_
|
| https://hacks.mozilla.org/2024/03/improving-performance-in-f...
| om2 wrote:
| Another related announcement, with a bit more detail on
| specifics of the benchmark changes (and some history of
| Speedometer):
| https://webkit.org/blog/15131/speedometer-3-0-the-best-way-y...
| murat124 wrote:
| Got "Infinity" after testing my Firefox Dev Edition 123b9. Is
| this because of my FF config, with my browser perhaps blocking
| something (e.g. canvas, fingerprint, etc), or is any result
| north of 140 considered infinity?
| julienwaj wrote:
| Do you see anything (errors or something else) in the web
| console?
| julienwaj wrote:
| Please file a bug there if necessary =>
| https://github.com/WebKit/Speedometer/issues/new :-)
| CharlesW wrote:
| Try it here: https://browserbench.org/Speedometer3.0/
|
| Very unscientific results using a Mac Studio - Chrome: 20.4,
| Safari: 17.9, Firefox: 20.1.
|
| Safari on an iPhone 13 Pro Max - 16.5.
| om2 wrote:
| What Safari version are you using? For me, with 17.4, Safari is
| ahead of Chrome and Firefox, though it is close if you use dev
| channel.
| CharlesW wrote:
| macOS 14.4 for the Mac Studio tests, and iOS 17.4 for the
| Safari-on-iOS test.
| lapcat wrote:
| This may be a dumb question, but what do the scores even mean? Is
| this explained anywhere? Neither
| https://browserbench.org/Speedometer3.0/about.html nor
| https://browserbench.org/Speedometer3.0/instructions.html appear
| to explain it. Are lower scores better, or higher scores?
| tux3 wrote:
| Higher is better. The analogy is speed. You want more speed.
|
| It's not a physical speed, just a benchmark number. Think of it
| as arbitrary units, which allows you to compare different
| versions of browsers on the same machine.
| julienwaj wrote:
| If the analogy isn't working for you, you can see the actual
| durations when you click on "Details".
| lapcat wrote:
| > You want more speed.
|
| On the other hand, premature optimization is the root of all
| evil.
|
| > Think of it as arbitrary units, which allows you to compare
| different versions of browsers on the same machine.
|
| That's precisely the problem. It's arbitrary, meaningless.
| Without any physical units, I don't know what's good or bad,
| fast or slow. And why do the scores go from 0 to 140 when the
| web browsers are all getting approximately 20?
| julienwaj wrote:
| The score goes from 0 to 140 so that there's some room for
| when the computers will get faster. When we started working
| on this, all browsers were maxed at 140, so the computation
| got changed.
| jeffbee wrote:
| I thought the front page goes to 140 just because it is
| modeled after actual GM dashboard speedometers produced
| ~1960-1990, sometimes having range 0-85 MPH, or 0-140km/h
| in metric markets.
| om2 wrote:
| Nothing actually stops the score from going higher than
| 140, it will just max out the visual dashboard at that
| point. On Speedometer 2, Safari on M3 Macs ended up over
| 500. At scores that high it's harder to have intuition,
| thus the changed scale of the new test.
| IainIreland wrote:
| Yes.
|
| The speedometer graphic was inherited from Speedometer 2.
| When Speedometer 2 was released, scores were in a
| reasonable car-speed range. The combination of hardware
| and software improvements meant that early versions of
| Speedometer 3 (which includes a subset of Speedometer 2
| tests) were consistently scoring above 140, so we
| adjusted the scaling factor (IIRC, by ~20x) to give
| plenty of room for future improvements.
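
A minimal sketch of the point made in this subthread: the score itself is unbounded, and only the dial stops at 140. The names `gauge_reading`, `rescaled`, `GAUGE_MAX` and `RESCALE_FACTOR` are hypothetical and not taken from the Speedometer source; the factor of 20 is only the approximate figure recalled above ("IIRC, by ~20x").

```python
GAUGE_MAX = 140.0       # top of the visual dial, per the comments above
RESCALE_FACTOR = 20.0   # approximate figure from the thread; an assumption


def gauge_reading(score: float) -> float:
    """Clamp a score to the dial's range; the score itself is not capped."""
    return max(0.0, min(score, GAUGE_MAX))


def rescaled(unscaled_score: float) -> float:
    """Shrink a score from the early, unscaled benchmark by the assumed factor."""
    return unscaled_score / RESCALE_FACTOR


# Hypothetical early score of 300: the dial would be pinned at 140 before
# rescaling, and would sit well inside the dial afterwards.
print(gauge_reading(300.0))            # pinned at the dial maximum
print(gauge_reading(rescaled(300.0)))  # inside the dial after rescaling
```

Dividing by a constant does not change which browser is faster or by what ratio; it only moves the numbers back into a range the dial can display.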
| treyd wrote:
| > On the other hand, premature optimization is the root of
| all evil.
|
| The web ecosystem is extremely mature and widely used. The
| workloads are fairly well understood. It is a magic unit,
| but the factors that go into it have a lot of thought from
| real-world scenarios. Bringing up "premature optimization"
| is completely irrelevant because that's not what this is,
| it's about as far as you can get from that.
| lapcat wrote:
| > that's not what this is, it's about as far as you can
| get from that.
|
| I don't know what it is. How exactly does the score
| relate to the experience of the web browser user?
|
| I'm a browser extension developer, and I've occasionally
| had people ask me about Speedometer scores, but I have no
| idea what they're supposed to mean or what to tell these
| people.
| Vinnl wrote:
| They say something about the speed of the browser, so it
| doesn't really make sense to ask extension developers
| about it, I don't think. Possibly that your extension
| might make the browser slower, so you could compare
| scores with and without the extension and see whether it
| negatively affects performance? (Although I'm not sure it
| can necessarily tell you anything about to what _extent_
| it affects performance, only that it does.)
| lapcat wrote:
| > Although I'm not sure it can necessarily tell you
| anything about to what extent it affects performance
|
| Exactly.
| om2 wrote:
| Speedometer measures web app responsiveness. Roughly, it
| simulates a series of user operations on web apps built
| with various frameworks (as well as vanilla JS), and
| measures the time it takes to complete them and paint the
| results to the screen.
|
| The score is a rescaled version of inverse time - if it
| goes up, that implies the browser can handle more user
| operations per second, or alternately, it takes fewer
| milliseconds to complete a user operation in a complex
| web app.
| lapcat wrote:
| > Speedometer measures web app responsiveness.
|
| We know that, but you haven't said anything specific
| about scores other than higher scores are faster, in an
| abstract sense, which has already been established.
| IainIreland wrote:
| "The score is a rescaled version of inverse time" is the
| key here.
|
| If you run all the tests in half the time, your
| Speedometer score will double. If your score improves by
| 1%, it implies that you are 1% faster on the subtests.
|
| (There are probably some subtleties here because we're
| using the geometric mean to avoid putting too much weight
| on any individual subtest, but the rough intuition should
| still hold.)
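
The "rescaled inverse time" intuition above can be sketched as follows. This is an illustration under stated assumptions, not the benchmark's actual scoring code: the `score` function, its `scale` constant, and the subtest durations are all made up, and the real harness may aggregate differently.

```python
import math


def geometric_mean(values):
    """Geometric mean computed through logarithms (all values must be > 0)."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def score(subtest_times_ms, scale=1000.0):
    """Hypothetical score: a constant divided by the geometric-mean time.

    `scale` is an arbitrary stand-in for the benchmark's real scaling factor.
    """
    return scale / geometric_mean(subtest_times_ms)


# Made-up subtest durations in milliseconds, for illustration only.
baseline = [40.0, 90.0, 250.0]
twice_as_fast = [t / 2 for t in baseline]
one_percent_faster = [t / 1.01 for t in baseline]

print(score(twice_as_fast) / score(baseline))       # ratio close to 2
print(score(one_percent_faster) / score(baseline))  # ratio close to 1.01
```

Because the geometric mean multiplies the subtests together, a uniform speedup carries straight through to the score, while a large gain on a single subtest moves the result far less than it would under an arithmetic mean.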
| IainIreland wrote:
| (I work on SpiderMonkey.)
|
| Benchmarking is hard. It is very easy to write a
| benchmark where improving your score does not improve
| real-world performance, and over time even a good
| benchmark will become less useful as the important
| improvements are all made. This V8 blog post about Octane
| is a good description of some of the issues:
| https://v8.dev/blog/retiring-octane
|
| Speedometer 3, in my experience, is the least bad browser
| benchmark. It hits code that we know from independent
| evidence is important for real-world performance. We've
| been targeting our performance work at Speedometer 3 for
| the last year, and we've seen good results. My favourite
| example: a few years ago, we decided that initial
| pageload performance was our performance priority for the
| year, and we spent some time trying to optimize for that.
| Speedometer 3 is not primarily a pageload benchmark.
| Nevertheless, our pageload telemetry improved more from
| targeting Speedometer 3 than it did when we were
| deliberately targeting pageload. (See the pretty graphs
| here: https://hacks.mozilla.org/2023/10/down-and-to-the-right-fire...)
| This is the advantage of having a good
| benchmark; it speeds up the iterative cycle of
| identifying a potential issue, writing a patch, and
| evaluating the results.
| lapcat wrote:
| This doesn't say anything about what the scores mean.
|
| 21 is apparently better than 20, but how much better? You
| could say "1 better", tautologically, but how does that
| relate to the real world?
|
| Driving a car 1 mile per hour faster may be better, in a
| sense, but even if you drove 24 hours straight, it would
| only gain you 24 total miles, which is almost negligible
| on such a long trip. Nobody would be impressed by that
| difference.
| bigfudge wrote:
| I guess that's why it's fairly interesting to see scores
| thrown out in this thread on random hardware. It's
| anecdata, but gives a sense of the spread/variance of
| scores for common platforms. I don't think this is a
| number that is ever going to make much sense for
| consumers to use because without this sort of context
| it's just going to be like the Spinal Tap 'this one goes
| to 11' sort of problem.
| charcircuit wrote:
| It means it is 5% faster. You are overcomplicating it.
| lapcat wrote:
| Percentages are rarely informative without an absolute
| reference.
|
| A 5% raise for someone who makes $20k per year is $1k,
| whereas a 5% raise for someone who makes $200k is $10k,
| which would be a 50% raise for the former.
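
To make the 20 versus 21 exchange concrete, here is a small sketch that assumes, as stated earlier in the thread, that the score is proportional to inverse time. The function names and the 100 ms reference interaction are hypothetical, chosen only to supply the absolute reference being asked for.

```python
def relative_speedup(old_score: float, new_score: float) -> float:
    """Fractional gain in operations per unit time (score is proportional to rate)."""
    return new_score / old_score - 1.0


def time_saved_fraction(old_score: float, new_score: float) -> float:
    """Fraction of time saved per operation, since time is proportional to 1/score."""
    return 1.0 - old_score / new_score


# Hypothetical absolute reference: an interaction that took 100 ms at score 20.
old_ms = 100.0
new_ms = old_ms * (1.0 - time_saved_fraction(20.0, 21.0))

print(relative_speedup(20.0, 21.0))     # about 0.05, i.e. 5% more work per second
print(time_saved_fraction(20.0, 21.0))  # a little under 0.05
print(new_ms)                           # a little over 95 ms
```

Under that assumption, both sides of the exchange hold: the gain is a fixed 5% in rate, but what it is worth in milliseconds depends entirely on how long the interaction took to begin with.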
| iknowstuff wrote:
| Crashes, or gets killed, by Safari on iOS 17.3 on iPhone 15 Pro.
| CharlesW wrote:
| Try it with a Private tab. Mine did the same until I did that,
| after which I got 16.5 on an iPhone 13 Pro Max.
| tdudhhu wrote:
| On Firefox mobile I got a score of 3.
|
| So while Firefox is now super fast the performance of webapps
| might still be very bad on mobile.
|
| And I think this applies to all modern browsers: they are fast at
| rendering very slow webapps and websites.
| nusl wrote:
| Are you on iOS or Android?
| kiwijamo wrote:
| Probably hardware dependent too as I get 8.11 on my fairly old
| Samsung S21. My newer laptop gets 14.9 so I'm not sure whether
| I agree performance is "very bad" on mobile -- it may be less
| performant but that is surely to be expected given the hardware
| constraints on mobile. YMMV.
| vient wrote:
| Got 7.89 on laptop with Core Ultra 155H in battery mode, 17.9
| in AC mode. Zenfone 10 shows 7.75. All in Firefox.
| wswope wrote:
| If you're on iOS, Apple gates Firefox from using JIT JS
| compilation which massively hinders performance.
|
| E: I was wrong/extremely out-of-date - it does have JIT but
| relies on the Safari/WebKit implementation. In ancient versions
| of iOS, the WebView widget that third-party browsers were
| forced to use had JIT disabled, but that's long since changed.
| ornornor wrote:
| For this benchmark I get 12.0 in FF and 13.9 on safari. I'm
| glad it's not so big of a gap as I already pay a penalty for
| not wanting to use safari on iOS (in terms of integration
| with iOS and usability from Apple's artificial limitations on
| third party browsers)
| capitainenemo wrote:
| It's more than that right? They have to use all of webkit. So
| it's pretty much a reskinned Safari in terms of
| layout/rendering/JS.
| om2 wrote:
| That's not accurate. Firefox and all third party WebKit apps
| get the same JIT as Safari.
| capitainenemo wrote:
| Right, but expecting the same behaviour from "Firefox" on
| iOS as on desktop is just not going to happen, since they
| have no control over the core engine. It's why, in general,
| using iOS devices for cross-browser testing is pretty
| useless.
| om2 wrote:
| This is a fair point, though it is possible for app-level
| things that the browsers do to regress performance from
| the baseline pure engine level.
|
| In this case, I think the 3 score must be either very
| old/low-end Android hardware or a measurement error. I
| don't think any iOS browser gets 3.x scores, on even
| remotely modern hardware.
| mattlondon wrote:
| Approx 6.3 and 6.7 in chrome and Firefox respectively on my
| low-end Pixel 6a
| SushiHippie wrote:
| Weird, I have the Google Pixel 8 Pro and get a score of 4.83
| in Fennec and 6.61 in Vanadium (hardened chromium fork).
| jeppesen-io wrote:
| 11.6 on s23
|
| Sounds like old hardware or some other issue
| WarOnPrivacy wrote:
| 10.9 on Win10 2017 Xeon @ 4.3GHz w/ 64GB. This instance of Ffx
| has had ~50 tabs (across 5 containers) open for a couple of
| weeks.
|
| What do these values represent? I can guess the last. Unsure of
| the first 2. 96.84 +-(5.0%) 4.87 ms
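
One plausible reading of the quoted line, offered as an assumption rather than a statement of what the Details view defines: a mean duration, followed by a margin given both as a percentage and in milliseconds. The arithmetic below only checks that the three numbers are consistent with that reading; the function name is hypothetical.

```python
def relative_margin_percent(mean_ms: float, margin_ms: float) -> float:
    """Express an absolute margin as a percentage of the mean."""
    return 100.0 * margin_ms / mean_ms


# Values copied from the comment above; their interpretation is an assumption.
pct = relative_margin_percent(96.84, 4.87)
print(round(pct, 1))  # compare with the 5.0% shown in the quoted line
```

The check shows only that 4.87 is about 5% of 96.84; it does not establish what kind of margin (standard deviation, confidence interval, or something else) is being reported.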
| dvngnt_ wrote:
| 4.7 brave on a pixel 7
| havaloc wrote:
| 13.1 on MBA 15" M2.
| butz wrote:
| We should stop "speed-shaming" browsers and focus on websites,
| which negate all performance improvements made by browser
| developers by adding more useless features.
| ceeam wrote:
| Amen.
| 12345hn6789 wrote:
| Why is the scale 0-140? My modern Windows 10 desktop using
| the latest Firefox gives 15.0/140 with no other programs running
| besides FF and discord. Surely 15 is horrible in that context? I
| have 1 extension, ublock origin, allowed to run on the site by
| default.
|
| I have never felt performance has ever lacked, outside of a few
| outlier sites (youtube, facebook, twitch). But those are tightly
| coupled with their (crappy) implementations.
| kleiba wrote:
| By far the biggest speed complaint I have about Firefox is not in
| everyday use, but whenever I restore a previously saved session -
| it basically stops reacting for a few minutes(!) before
| eventually I can use it again. I suppose it's due to the anti-
| virus interfering with some kind of memory image or whatever but
| whatever it is, it's so annoying.
| WarOnPrivacy wrote:
| > but whenever I restore a previously saved session - it
| basically stops reacting for a few minutes(!)
|
| I haven't run into anything like this; you may be an outlier. I
| interact with ~7 Firefox instances (Win) each week. Each has
| diff configs and plugins.
| emayljames wrote:
| Yeah, I've used it on Linux/Win/Mac, sometimes restoring 20+ tabs,
| and never had that issue.
| Lacusch wrote:
| I've the same problem on Linux with no antivirus installed
| atlas_hugged wrote:
| In iOS, all browsers (at the moment) use Safari under the hood.
| Imagine my surprise to see these noticeable differences in some
| of them.
|
| Vivaldi:12.2
|
| Brave:18.1
|
| Safari:18.2
|
| Chrome:19
|
| Firefox Focus:21
| freediver wrote:
| This is because browsers on iOS do not 'use Safari' but use
| WebKit and there is a huge amount of browser app software built
| on top of it, which will contribute to variance on benchmarks
| (and also to these being very different browsers ultimately).
| CraftThatBlock wrote:
| Some of my results:
|
| Desktop Firefox: 25
|
| Desktop Chrome: 26
|
| Laptop Firefox: 16
|
| Laptop Chrome: 20
|
| Laptop Safari: 21
|
| Phone Firefox: 12
|
| Phone Chrome: 10
|
| ---
|
| Desktop: 5900X, 3090, Linux
|
| Laptop: M1 Pro 14"
|
| Phone: S24 Ultra
|
| Ran all tests in private window to avoid extensions, and gave a
| minute to cool between tests. Laptop/phone was plugged in.
| aPoCoMiLogin wrote:
| chrome 27.4
|
| firefox 26.3
|
| desktop: 7800X3D, 2060, linux
| hedgehog wrote:
| M1 MacBook Air:
|
| Safari: 24.1
|
| Firefox: 21.7
|
| Extensions can really slow things down:
|
| Safari w/ Ghostery: 6.83
|
| Safari w/ AdBlock: 13.0
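
A quick sketch of the with-and-without comparison, using the Safari figures reported just above. The `slowdown_factor` helper is hypothetical, and single runs on one machine are anecdotal, so the ratios are indicative at best.

```python
def slowdown_factor(baseline_score: float, with_extension_score: float) -> float:
    """How many times lower the score is with the extension enabled."""
    return baseline_score / with_extension_score


# Scores as reported in the comment above (M1 MacBook Air, Safari).
baseline = 24.1
reported = {"Ghostery": 6.83, "AdBlock": 13.0}

for name, with_ext in reported.items():
    print(name, round(slowdown_factor(baseline, with_ext), 2))
```

If the score is read as inverse time, a factor of about 3.5 would mean the benchmark's operations took roughly three and a half times as long with that extension enabled.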
| sedatk wrote:
| On my machine, Firefox got 12.3, and Edge (Chromium) got 12.8. I
| don't believe that the performance characteristics of these two
| are that close unless I'm missing something. For example, audio
| players on Edge stutter a lot while Firefox plays them smoothly.
| An example is: https://deepsid.chordian.net/ I believe Edge is
| slower not because Chromium is slow, but because of Microsoft's
| overreaching efforts on energy conservation.
|
| Machine: AMD 5950X, 32GB RAM, 3080 GPU, Windows 11 Pro 23H2
|
| Firefox v123.0.1 Edge v122.0.2365.80
|
| EDIT: Interesting, I tried both in private windows later to
| bypass extensions, and Edge got 10.8 this time, Firefox got 16.9.
| I now have more questions.
| rezonant wrote:
| I ran v3 on my machine while listening to "Voyage" by "Yahel &
| Eyal Barkan" in Chrome and doing a bunch of background stuff.
| The background stuff took up about 20% of my CPU. While
| testing, the music played perfectly without any buffer underrun
| pops.
|
| Ran it in each browser one at a time while the music played in
| Chrome.
|
| Chrome 122.0.6261.112: 21.3 +/- 0.64
|
| Edge 122.0.2365.80: 20.1 +/- 0.78
|
| Firefox 121.0.1: 18.5 +/- 0.75
|
| Machine specs: Intel Core i9 12900k (24 core) / 64GB RAM /
| 3080Ti / Windows 11 Pro 23H2
|
| After finishing the tests, I played that same song on Firefox
| and Edge. Both Firefox and Edge played it perfectly.
|
| > audio players on Edge stutter a lot while Firefox plays them
| smoothly
|
| I'm curious about what could be leading to this inconsistency
| as I use Web Audio for a number of projects, so I have a bit of
| a vested interest. It is notoriously easy to do WebAudio wrong
| or to do just a bit too much computation which leads to buffer
| underruns (pops). It also may have a lot to do with specific
| tracks on DeepSID, could you share some tracks that perform
| inconsistently for you?
| sedatk wrote:
| Any track plays like complete garbage on Edge. The beat skips,
| the sound cuts off. This one for instance:
| https://deepsid.chordian.net/?file=/MUSICIANS/F/Fate/World_R...
|
| I think Edge's problems come from some kind of power
| efficiency setting, not necessarily performance-related.
| (Like a low-granularity JS timer, or something like that)
|
| EDIT: Turning off all efficiency settings on Edge didn't make
| any difference: 11.0
| mdasen wrote:
| Audio wouldn't be going via the DOM or JS, right? I know that
| Firefox has its own codec support and that Safari on Mac uses
| different AV stuff than other browsers.
|
| I don't think that AV stuff would be tested by Speedometer.
| rezonant wrote:
| > don't think that AV stuff would be tested by Speedometer
|
| It probably isn't, but fwiw yes web audio is controlled by
| JavaScript. Doing it right means using web audio worklets,
| which is a special purpose JS context that has no access to
| your main page context.
| freediver wrote:
| This collaboration is pretty exciting. I would expect that the
| teams of all three rendering engines (WebKit, Blink, Gecko) have
| done whatever they could to improve performance for the launch
| and that there won't be any outliers at the beginning with all of
| them having similar performance.
|
| But the title of future performance king is up for grabs! And now
| we have a de-facto standard for browser performance benchmarking.
| jeffbee wrote:
| Posting just because nobody else has posted one this high:
|
| Mac mini M2, macOS 14.4, Chrome 122: 30.2 +- 1.6
|
| Good ol' Apple Silicon.
| mccr8 wrote:
| I got 35.7 +- 2.3 on a MacBook Pro M3, Chrome 122.
| apfsx wrote:
| I tested this with Firefox stable release and Brave stable
| release, 3 runs on each. Same exact extensions across both.
|
| Highest scores across tests: Firefox: 6.34 +- 0.31 Brave: 11.3 +-
| 0.37 on Ryzen 9 7940HS + RTX 3060 mobile
|
| Which really sucks since I highly prefer Firefox but this past
| week I've been trying out Brave and I think it's noticeably faster
| and smoother to me. Even with the reduced speed I'm still swayed
| toward Firefox for the customization factor you can achieve with
| a userChrome.css file.
| denysonique wrote:
| Even if you have the exact same extensions the fact that you
| have an old Firefox profile may be hindering the results. Try
| comparing with a fresh Firefox profile with the same
| extensions.
| apfsx wrote:
| The Firefox profile I'm using is no more than 3 weeks old.
| Fresh install of Windows was done around that time.
| wolverine876 wrote:
| The 'same' extensions on each browser control for your
| experience, but not for the browsers' performance:
|
| The browsers have the same or very similar APIs for the
| extensions but that is just the interface; each browser
| executes the extension's instructions differently (a lot or a
| little - I don't know the browsers' code). The same extension
| will impact Brave's performance differently than it will impact
| Firefox's. In other words, the same extension is not, in this
| sense, the 'same' on each browser.
|
| In this sense, an extension is part of the user experience,
| like a website. The Speedometer test suite doesn't include
| those extensions (I assume) and that is the experience the
| browsers are optimized for.
|
| The parent's test doesn't represent that; it does represent
| their desired experience, of course.
| fddrdplktrew wrote:
| I'm more worried about Firefox' stability these days...
| troupo wrote:
| > The primary goal of Speedometer 3 is to reflect the real-world
| Web as much as possible, so that users benefit when a browser
| improves its score on the benchmark.
|
| As with any other benchmark, its results will be interpreted
| incorrectly and will have little effect on the real world.
|
| Google _already_ has vast amounts of real-world data. The end
| result? "Oh, you should aim for a Largest Contentful Paint of
| 2.5 _seconds_ or lower" (emphasis mine):
| https://blog.chromium.org/2020/05/the-science-behind-web-vit...
| Why? Because in the real world the vast majority of sites are _worse_.
|
| Browsers are already optimised beyond any reasonable expectation.
| Benchmarks like these focus on all the wrong things with little
| to no benefit to the actual performance of real-life web.
|
| Make all benchmarks you want, but then Google's own Youtube will
| load 2.5 MB of CSS and 12 MB of Javascript to display a grid of
| images, and Google's own Lighthouse will scream at you for the
| _hundreds_ of errors and warnings Youtube embed triggers.
|
| Edit:
|
| Optimise all you want, and run any benchmarks you want for the
| "real world", but performance inequality gap will still be there:
| https://infrequently.org/2024/01/performance-inequality-gap-...
|
| Optimise all you want, and run any benchmarks you want for the
| "real world", but Lighthouse will warn you when you have over 800
| DOM nodes, and will show an error for more than 1400 DOM nodes
| (which are laughably small numbers) for a reason:
| https://developer.chrome.com/docs/lighthouse/performance/dom...
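
For a rough sense of the thresholds mentioned above, the sketch below counts element start tags in a static HTML string and classifies the total. The 800 and 1400 figures are simply the ones quoted in the comment, and the `ElementCounter`, `count_elements` and `classify` names are hypothetical. This is not how Lighthouse measures anything: it inspects the live DOM after scripts have run, which static markup cannot show.

```python
from html.parser import HTMLParser

WARN_ABOVE = 800    # figure as quoted in the comment above
ERROR_ABOVE = 1400  # figure as quoted in the comment above


class ElementCounter(HTMLParser):
    """Counts element start tags (including void and self-closing tags)."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        self.count += 1


def count_elements(html: str) -> int:
    parser = ElementCounter()
    parser.feed(html)
    parser.close()
    return parser.count


def classify(count: int) -> str:
    if count > ERROR_ABOVE:
        return "error"
    if count > WARN_ABOVE:
        return "warning"
    return "ok"


# Made-up page: one list holding 900 items, so 901 elements in total.
page = "<ul>" + "<li>x</li>" * 900 + "</ul>"
print(count_elements(page), classify(count_elements(page)))
```

A single long list is already enough to cross the lower threshold, which gives some feel for why the commenter calls these numbers small.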
| RantyDave wrote:
| In descending order....
|
| MacBook Pro, M2 Pro, 16GB, plugged in, external display:
| Safari=31.2 Chrome=29.4
|
| iPhone 12 mini, plugged in: Safari=19.4
|
| HP Z2 mini (i7): Edge=15.9
|
| Panasonic Toughbook CF19 (win 10): Edge=4.7 Chrome=5.6
|
| Galaxy Tab S5e: Chrome=2.2
|
| Oculus Quest 2: browser crashed
|
| Tizen TV: displayed, wouldn't run
|
| Nintendo 2DS: displayed, no css, wouldn't run
___________________________________________________________________
(page generated 2024-03-11 23:00 UTC)