[HN Gopher] An Analysis of the Performance of WebSockets in Vari...
___________________________________________________________________
An Analysis of the Performance of WebSockets in Various Programming
Languages (2021)
Author : max0563
Score : 81 points
Date : 2024-11-23 03:34 UTC (19 hours ago)
(HTM) web link (www.researchgate.net)
(TXT) w3m dump (www.researchgate.net)
| paulgb wrote:
| The SSRN link doesn't have a login-wall:
| https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3778525
| chrisweekly wrote:
| Thanks! Here's the direct link to the ungated PDF:
| https://download.ssrn.com/21/02/03/ssrn_id3778525_code456891...
|
| TLDR; NodeJS is the clear winner, and Python far and away the
| worst of the bunch.
| 5Qn8mNbc2FNCiVV wrote:
| Too bad that uWebsockets was used for Node because a lot of
| higher level libraries are built on top of
| https://www.npmjs.com/package/ws
| windlep wrote:
| I was able to make a uWebsockets adapter for NestJS pretty
| easily. It's a sensitive library to integrate, though: a
| single write after the connection is gone and you get a
| segfault, which means a lot of checking before writing if
| you've yielded since you last checked. This was a few years
| ago; perhaps they've fixed that.
| travisgriggs wrote:
| Thanks for the free access links. I did read through a bit.
|
| The title is misleading because exactly one implementation was
| chosen for each of the tested languages. They conclude "do not
| use Python" because the Python websockets library performs pretty
| poorly.
|
| Each language is scored based on the library chosen. I have to
| believe there are more options for some of these languages.
|
| As someone who is implementing an Elixir LiveView app right now,
| I was particularly curious to see how Elixir performed given
| LiveView's reliance on websockets, but Elixir didn't make the
| cut.
| nelsonic wrote:
| Was also surprised they omitted Elixir/Erlang from the list of
| languages. Crazy considering how many messaging apps use OTP on
| the backend.
| Terretta wrote:
| _> The title is misleading because exactly one implementation
| was chosen for each of the tested languages. They conclude "do
| not use Python" because the Python websockets library performs
| pretty poorly._
|
| On the contrary, they tried autobahn and aiohttp as well:
|
| _For the Python websocket, a generic module is used which is
| simply named "websockets". ... This is most likely a module
| that offers the simplest of websocket functionality. Now, it
| was mentioned that this only partly explains the poor
| performance. While writing this report, it seemed unjust not to
| give Python a fighting chance. So, the websocket server has
| been rebuilt with the more trusted Autobahn library and the
| benchmark test has been rerun. This new server does lead to
| better results ... still unable to finish the benchmark
| test.... [T]he Python server is rebuilt one more time, this
| time with a library by the name of "aiohttp." At last, all 100
| rounds of the benchmark are able to be completed, though not
| very well. Aiohttp still takes longer than Go, and becomes
| substantially unreliable after round 50, dropping anywhere from
| 30-50% of the messages. It can only be concluded that the
| reason for this dreadful performance is Python itself._
| latch wrote:
| Their explanation for why Go performs badly didn't make any sense
| to me. I'm not sure if they don't understand how goroutines work,
| if I don't understand how goroutines work or if I just don't
| understand their explanation.
|
| Also, in the end, they didn't use the JSON payload. It would have
| been interesting if they had just written a static string. I'm
| curious how much of this is really measuring JSON
| [de]serialization performance.
|
| Finally, it's worth pointing out that WebSocket is a standard.
| It's possible that some of these implementations follow the
| standard better than others. For example, WebSocket requires that
| a text message be valid UTF8. Personally, I think that's a dumb
| requirement (and in my own websocket server implementation for
| Zig, I don't enforce this - if the application wants to, it can).
| But it's completely possible that some implementations enforce
| this and others don't, and that (along with every other check)
| could make a difference.
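|
| A minimal sketch of what that optional check could look like, in
| Go. The names checkMessage, errInvalidUTF8 and the enforceUTF8
| switch are hypothetical and not taken from any of the benchmarked
| libraries; only the text opcode value comes from RFC 6455. A real
| server would also have to validate across fragmented messages,
| which this sketch ignores.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// Data-frame opcodes from RFC 6455, section 5.2.
const (
	opText   = 0x1
	opBinary = 0x2
)

// errInvalidUTF8 is a hypothetical error name used only in this sketch.
var errInvalidUTF8 = errors.New("text message is not valid UTF-8")

// checkMessage applies the UTF-8 check to text messages only, and only
// when enforceUTF8 is set. Binary messages always pass through.
func checkMessage(opcode byte, payload []byte, enforceUTF8 bool) error {
	if enforceUTF8 && opcode == opText && !utf8.Valid(payload) {
		return errInvalidUTF8
	}
	return nil
}

func main() {
	bad := []byte{0xff, 0xfe} // not a valid UTF-8 sequence

	fmt.Println(checkMessage(opText, []byte("hello"), true)) // valid text passes
	fmt.Println(checkMessage(opText, bad, true))             // rejected when enforcing
	fmt.Println(checkMessage(opText, bad, false))            // accepted when not enforcing
	fmt.Println(checkMessage(opBinary, bad, true))           // binary is never checked
}
```

| The check is a full pass over every text payload, so a server
| that enforces it does strictly more work per message than one
| that leaves it to the application.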
| vandot wrote:
| They didn't use goroutines, which explains the poor perf.
| https://github.com/matttomasetti/Go-Gorilla_Websocket-Benchm...
|
| Also, this paper is from Feb 2021.
| windlep wrote:
| I was under the impression that the underlying net/http
| library uses a new goroutine for every connection, so each
| websocket gets its own goroutine. Or is there somewhere else
| you were expecting goroutines in addition to the one per
| connection?
| donjoe wrote:
| Which is perfectly fine. However, you will be able to
| process only a single message per connection at once.
|
| What you would do in go is:
|
| - either a new goroutine per message
|
| - or installing a worker pool with a predefined goroutine
| size accepting messages for processing
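|
| A rough sketch of the worker-pool variant, using only the
| standard library. handle and processWithPool are made-up names,
| and the upper-casing stands in for real per-message work; the
| websocket read loop that would feed the pool is left out.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// handle stands in for whatever per-message work the server does.
func handle(msg string) string {
	return strings.ToUpper(msg)
}

// processWithPool fans messages out to a fixed number of worker goroutines
// and returns the results in input order. workers must be at least 1,
// otherwise the send on the jobs channel blocks forever.
func processWithPool(msgs []string, workers int) []string {
	type job struct {
		idx int
		msg string
	}
	jobs := make(chan job)
	results := make([]string, len(msgs))

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs {
				// Each index is written by exactly one worker, so no lock is needed.
				results[j.idx] = handle(j.msg)
			}
		}()
	}

	for i, m := range msgs {
		jobs <- job{idx: i, msg: m}
	}
	close(jobs) // lets the workers' range loops finish
	wg.Wait()
	return results
}

func main() {
	fmt.Println(processWithPool([]string{"a", "b", "c"}, 2))
}
```

| The goroutine-per-message option is the same idea without the
| jobs channel: start a goroutine inside the read loop for each
| message, at the cost of an unbounded number of goroutines.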
| jand wrote:
| Another option is to have a read-pump and a write-pump
| goroutine associated with each gorilla ws client. I found
| this useful for gateways wss <--> *.
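|
| A sketch of the two-pump shape, with an io.Reader and io.Writer
| standing in for the websocket connection so it runs without
| gorilla. readPump, writePump, echo and echoString are
| illustrative names, and newline-delimited strings stand in for
| websocket messages. The point of the layout is that exactly one
| goroutine reads and exactly one writes, which matches gorilla's
| documented limit of one concurrent reader and one concurrent
| writer per connection.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
	"sync"
)

// readPump is the only goroutine that reads from r. It forwards each
// newline-delimited message and closes the channel when r is exhausted.
func readPump(r io.Reader, in chan<- string) {
	defer close(in)
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		in <- sc.Text()
	}
}

// writePump is the only goroutine that writes to w.
func writePump(w io.Writer, out <-chan string, done *sync.WaitGroup) {
	defer done.Done()
	for msg := range out {
		fmt.Fprintln(w, msg)
	}
}

// echo wires the two pumps together; the loop in the middle is where a
// gateway would route messages to and from some other system.
func echo(r io.Reader, w io.Writer) {
	in := make(chan string)
	out := make(chan string)

	var wg sync.WaitGroup
	wg.Add(1)
	go readPump(r, in)
	go writePump(w, out, &wg)

	for msg := range in {
		out <- "echo: " + msg
	}
	close(out)
	wg.Wait() // make sure every queued write has been flushed
}

// echoString runs echo over an in-memory "connection".
func echoString(input string) string {
	var sb strings.Builder
	echo(strings.NewReader(input), &sb)
	return sb.String()
}

func main() {
	fmt.Print(echoString("ping\npong\n"))
}
```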
| initplus wrote:
| http.ListenAndServe is implemented under the hood with a new
| goroutine per incoming connection. You don't have to
| explicitly use goroutines here, it's the default behaviour.
| necrobrit wrote:
| Yes _however_ the nodejs benchmark at least is handling
| each message asynchronously, whereas the go implementation
| is only handling connections asynchronously.
|
| The client fires off all the requests before waiting for a
| response:
| https://github.com/matttomasetti/NodeJS_Websocket-Benchmark-...
| so the comparison isn't quite apples to apples.
|
| Edit to add: looks like the same goes for the c++ and rust
| implementations. So I think what we might be seeing in this
| benchmark (particularly the node vs c++ since it is the
| same library) is that asynchronously handling each message
| is beneficial, and the Go standard library's JSON parser is
| slow.
|
| Edit 2: Actually I think the c++ version is async for each
| message! Don't know how to explain that then.
| josephg wrote:
| Well, tcp streams are purely sequential. It's the ideal
| use case for a single process, since messages can't be
| received out of order. There's no computational advantage
| to "handling each message asynchronously" unless the
| message handling code itself does IO or something. And
| that's not the responsibility of the websocket library.
| necrobrit wrote:
| Good point!
| ikornaselur wrote:
| Yeah I thought this looked familiar.. I went through this
| article about a year and a half ago when exploring WebSockets
| in Python for work. With some tuning and different
| libraries + libuv, we were easily able to get similar
| performance to NodeJS.
|
| I had a blog post somewhere to show the testing and results,
| but can't seem to find it at the moment though.
| tgv wrote:
| > I'm curious how much of this is really measuring JSON
| [de]serialization performance.
|
| Well, they did use the standard library for that, so quite a
| bit, I suppose. That thing is slow. I've got no idea how fast
| those functions are in other languages, but you're right that
| it would ruin the idea behind the benchmark.
| bryancoxwell wrote:
| Are you referring to Go's stdlib?
| klabb3 wrote:
| > Their explanation for why Go performs badly didn't make any
| sense to me.
|
| To me, the whole paper is full of misunderstanding, at least
| the analysis. There's just speculation based on caricatures of
| the language, like "node is async", "c++ is low level" etc. The
| fact that their C++ impl using uWebSocket was _significantly_
| slower than Node, which used uWebSocket bindings, should
| have led them to question the test setup (they probably used
| threads, which defeats the purpose of uWebSocket).
|
| Anyway.. The "connection time" is just HTTP handshake. It could
| be included as a side note. What's important in WS deployments
| are:
|
| - Unique message throughput (the only thing measured afaik).
|
| - Broadcast/"multicast" throughput, i.e. say you have 1k
| subscribers you wanna send the _same_ message.
|
| - Idle memory usage (for say chat apps that have low traffic -
| how many peers can a node maintain)
|
| To me, the champion is uWebSocket. That's the _entire_ reason
| why "Node" wins - those language bindings were written by the
| same genius who wrote that lib. Note that uWebSocket doesn't
| have TLS support, so whatever reverse proxy you put in front is
| gonna dominate usage because all of them have higher overheads,
| even nginx.
|
| Interesting to note is that uWebSocket perf (especially memory
| footprint) can't be achieved even in Go, because of the
| goroutine overhead (there's _no_ way in Go to read/write from
| multiple sockets from a single goroutine, so you have to spend
| 2 goroutines for realtime r/w). It could probably be achieved
| with Tokio though.
| Svenskunganka wrote:
| The whole paper is not only full of misunderstandings, it is
| full of errors and contradictions with the implementations.
|
| - Rust is run in debug mode, by omitting the --release flag.
| This is a very basic mistake.
|
| - Some implementations are logging to stdout on each message,
| which will lead to a lot of noise not only due to the
| overhead of doing so, but also due to lock contention for
| multi-threaded benchmarks.
|
| - It states that the Go implementation is blocking and
| single-threaded, while it in fact is non-blocking and
| multi-threaded (concurrent).
|
| - It implies the Rust implementation is not multi-threaded,
| while it in fact is because the implementation spawns a
| thread per connection. On that note, why not use an async
| websocket library for Rust instead? They're used much more.
|
| - It gives VM-based languages zero time to warm up, giving
| them very little chance to do one of their jobs: runtime
| optimizations.
|
| - It is not benchmarking websocket implementations
| specifically, it is benchmarking websocket implementations,
| JSON serialization and stdout logging all at once. This adds
| so much noise to the result that the result should be
| considered entirely invalid.
|
| > To me, the champion is uWebSocket. That's the entire reason
| why "Node" wins [...]
|
| A big part of why Node wins is because its implementation is
| not logging to stdout on each message like the other
| implementations do. Add a console.log in there and its
| performance tanks.
| austin-cheney wrote:
| There is no HTTP handshake in RFC6455. A client sends a text
| with a pseudo unique key. The server sends a text with a key
| transform back to the client. The client then opens a socket
| to the server.
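|
| For reference, the key transform is defined in RFC 6455 as
| base64(SHA-1(key + a fixed GUID)); the key and the result travel
| in the Sec-WebSocket-Key and Sec-WebSocket-Accept header fields.
| A small Go sketch of the server side, where acceptKey is just a
| name picked for this example and the sample key is the one from
| the RFC:

```go
package main

import (
	"crypto/sha1"
	"encoding/base64"
	"fmt"
)

// websocketGUID is the fixed string defined by RFC 6455.
const websocketGUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

// acceptKey derives the Sec-WebSocket-Accept value from the client's
// Sec-WebSocket-Key: append the GUID, SHA-1 hash it, base64-encode the hash.
func acceptKey(clientKey string) string {
	sum := sha1.Sum([]byte(clientKey + websocketGUID))
	return base64.StdEncoding.EncodeToString(sum[:])
}

func main() {
	// Sample key from RFC 6455; prints s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
	fmt.Println(acceptKey("dGhlIHNhbXBsZSBub25jZQ=="))
}
```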
|
| The distinction is important because assuming HTTP implies
| WebSockets is a channel riding over an HTTP server. Neither
| the client nor the server cares if you provide any support for
| HTTP so long as the connection is achieved. This is easily
| provable.
|
| It also seems you misunderstand the relationship between
| WebSockets and TLS. TLS is TCP layer 4 while WebSockets is
| TCP layers 5 and 6. As such WebSockets work the same way
| regardless of TLS but TLS does provide an extra step of
| message fragmentation.
|
| There is a difference in interpreting how a thing works and
| building a thing that does work.
| emmanueloga_ wrote:
| If the author is reading this, I think a single repository would
| be more appropriate than multiple repos [1]. It would be nice to
| set things up so we can simply git pull, docker run, and execute
| the benchmarks for each language sequentially.
|
| Something that stood out to me is the author's conclusion that
| "Node.js wins." However, both the Node.js and C++ versions use
| the same library, uWebSockets! I suspect the actual takeaway is
| this:
|
| "uWebSockets wins, and the uWebSockets authors know their library
| well enough that even their JavaScript wrapper outperforms my own
| implementation in plain C++ using the same library!" :-p
|
| Makes me wonder if there's something different that could be done
| in Go to achieve better performance. Alternatively, this may
| highlight which language/library makes it easier to do the right
| thing out of the box (for example, it seems easier to use
| uWebsockets in nodejs than in C++). TechEmpower controversies
| also come to mind, where "winning" implementations often don't
| reflect how developers typically write code in a given language,
| framework, or library.
|
| --
|
| 1:
| https://github.com/matttomasetti?tab=repositories&q=websocke...
| fnordpiglet wrote:
| (2021) Was surprised it used a deprecated Rust crate until I
| noticed how out of date it is
| simpaticoder wrote:
| Interesting that https://github.com/uNetworking/uWebSockets.js
| (which is C++ with node bindings) outperforms the raw C++
| uWebSockets implementation.
|
| It's also interesting that https://github.com/websockets/ws does
| not appear in this study, given that in the node ecosystem it is
| ~3x more likely to be used (not a perfect measurement but ws has
| 28k github stars vs uWebSockets 8k stars)
| zo1 wrote:
| Was this published as-is to some sort of prominent CS journal? I
| honestly can't tell from the link. If that's the case, I'm very
| disappointed and would have a few choice words about the state of
| "academia".
| ndusart wrote:
| Yes, that would be concerning indeed...
|
| The author couldn't tell why he didn't manage to get the C or
| Python program to run, but figured the language is probably to
| blame for some obscure reason.
|
| He also mentioned that he should have implemented
| multithreading in C++ to be comparable with Node, but meh,
| that's probably also not his concern, let's compare them as is
| ^^`
|
| Also he doesn't mention the actual language of the library
| used, but that would have voided the interest of the article,
| so I quite may understand that omission :P
|
| But at the end, nothing can be learned from this and it is hard
| to believe it is what "research" can produce
| josephg wrote:
| Yeah it's a rubbish paper. It's just a comparison of some
| websocket implementations at some particular point in time.
| It tells you how fast some of the fastest WS implementations
| are in absolute terms, but there are no broad conclusions you
| can make other than the fact that there's more room for
| optimisation in a few libraries. Whoopty doo. News at 11.
| indulona wrote:
| The DX for websockets in Go (gorilla) is horrible. But I do not
| believe these numbers one bit.
| wuschel wrote:
| Is this a peer reviewed paper? It does not seem to be. At a first
| glance, the researchgate URI and the way the title was formulated
| made me think it would be the case.
| frizlab wrote:
| Not including Swift in such research seems to be a big
| oversight to me.
| cess11 wrote:
| I'd like to know why Elixir and Erlang were excluded.
| cess11 wrote:
| Seems the author went silent after this, maybe he decided to
| run a cafe or something instead.
| austin-cheney wrote:
| I have a home grown websocket library I wrote in TypeScript for
| node.js. When I measured it a couple of years ago here were my
| findings:
|
| * I was able to send a little under 11x faster than I could
| process the messages on the receiving end. I suspected this was
| due to accounting for processing of frame headers with
| consideration of the various forms of message fragmentation. I
| also ran both send and receive operations on the same machine
| which could have biased the numbers
|
| * I was able to send messages on my hardware at 280,000 messages
| per second. Bun claimed, at that time, a send rate of about
| 780,000 messages per second. My hardware is old with DDR3 memory.
| I suspect faster memory would increase those numbers more than
| anything else, but I never validated that
|
| * In real world practical use switching from HTTP for data
| messaging to WebSockets made my big application about 8x faster
| overall in test automation.
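|
| To give a sense of the receive-side work mentioned in the first
| point, here is a sketch of parsing a frame header as laid out in
| RFC 6455: two fixed bytes, an optional 2- or 8-byte extended
| length, and a 4-byte masking key on client-to-server frames. It
| is written in Go rather than TypeScript and is not the library
| described above; frameHeader, parseHeader and errShort are
| illustrative names.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// frameHeader holds the fields a receiver needs before touching the payload.
type frameHeader struct {
	fin        bool   // final fragment of a message
	opcode     byte   // 0x0 continuation, 0x1 text, 0x2 binary, ...
	masked     bool   // true on client-to-server frames
	payloadLen uint64 // payload length in bytes
	headerLen  int    // bytes taken by the header, including any masking key
}

// errShort signals that the buffer does not yet hold a complete header.
var errShort = errors.New("need more bytes")

// parseHeader decodes the header at the start of b.
func parseHeader(b []byte) (frameHeader, error) {
	if len(b) < 2 {
		return frameHeader{}, errShort
	}
	h := frameHeader{
		fin:    b[0]&0x80 != 0,
		opcode: b[0] & 0x0f,
		masked: b[1]&0x80 != 0,
	}
	n := 2
	switch l := b[1] & 0x7f; l {
	case 126: // 16-bit extended length, big-endian
		if len(b) < n+2 {
			return frameHeader{}, errShort
		}
		h.payloadLen = uint64(binary.BigEndian.Uint16(b[n:]))
		n += 2
	case 127: // 64-bit extended length, big-endian
		if len(b) < n+8 {
			return frameHeader{}, errShort
		}
		h.payloadLen = binary.BigEndian.Uint64(b[n:])
		n += 8
	default: // lengths 0..125 fit in the 7-bit field
		h.payloadLen = uint64(l)
	}
	if h.masked {
		if len(b) < n+4 {
			return frameHeader{}, errShort
		}
		n += 4 // skip the masking key
	}
	h.headerLen = n
	return h, nil
}

func main() {
	// Unmasked single-frame text message "Hello" (example from RFC 6455).
	h, err := parseHeader([]byte{0x81, 0x05, 'H', 'e', 'l', 'l', 'o'})
	fmt.Printf("%+v %v\n", h, err)
}
```

| A receiver has to do this for every frame, then unmask the
| payload and stitch continuation frames together, while a sender
| only writes a header it already knows.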
|
| Things I suspect, my other assumptions:
|
| * A WebSocket library can achieve superior performance if written
| in a strongly typed language that is statically compiled and
| without garbage collection. Bun achieved far superior numbers and
| is written in Zig.
|
| * I suspect that faster memory would lower the performance gap
| between sending and receiving when perf testing on a single
| machine
| pier25 wrote:
| I'm surprised at how well php is doing here. I'm guessing they
| are using fibers?
| alganet wrote:
| It uses reactphp event-loop library:
|
| https://github.com/reactphp/event-loop
|
| That library can use either select, libuv, libev or libevent if
| I'm not mistaken. Fibers are not used at this point, although
| other libraries have explored the idea (revoltphp).
|
| If we're assuming the paper author installed a typical PHP,
| then it's using select for async I/O. It's the slowest
| implementation of the event loop. Using something like swoole
| would extract even more performance out of PHP for async io
| scenarios.
| fredtalty5 wrote:
| The 2021 study titled "An Analysis of the Performance of
| WebSockets in Various Programming Languages" benchmarks multiple
| WebSocket implementations to determine which offers the best
| performance. Key findings include:
|
| Node.js emerged as the fastest option, primarily due to its
| asynchronous capabilities, allowing for higher throughput during
| concurrent requests.
|
| Java and C# closely followed Node.js, demonstrating strong
| performance in handling requests.
|
| C++ and Rust performed moderately well, while PHP lagged behind
| them.
|
| Python and C struggled significantly, with Python's websocket
| library proving particularly inefficient, leading to high
| failures during stress tests.
|
| The analysis emphasises the importance of using asynchronous
| libraries and suggests avoiding Python for web socket
| implementations due to its performance limitations. The study
| serves as a valuable resource for developers looking to select
| the optimal programming language for WebSocket applications.
| timkofu wrote:
| It would be interesting to see this repeated with Starlette and
| Granian on Python 3.13 (with GIL and JIT).
| fastaguy88 wrote:
| A meta comment: This paper gives an example of a "teaser
| abstract". It says what was done, but does not say anything about
| the actual results. This style is relatively common, but I find
| it very annoying. There was certainly enough room in the abstract
| to provide a concise summary of the actual results, which would
| both inform the reader and perhaps encourage more people to read
| the entire paper.
___________________________________________________________________
(page generated 2024-11-23 23:01 UTC)