[HN Gopher] The hidden complexity of scaling WebSockets
       ___________________________________________________________________
        
       The hidden complexity of scaling WebSockets
        
       Author : atul-jalan
       Score  : 182 points
       Date   : 2025-01-24 19:48 UTC (1 days ago)
        
 (HTM) web link (composehq.com)
 (TXT) w3m dump (composehq.com)
        
       | peteforde wrote:
       | This is all true, but it also serves to remind us that Rails
       | gives developers so much out of the box, even if you're not aware
       | of it.
       | 
       | ActionCable is Rails' WebSockets wrapper library, and it
        | addresses basically every pain point in the post. However, it
        | does so in a way that means all Rails developers are using the
        | same battle-tested solution. There's no need for every project to
        | hack together its own proprietary approach.
       | 
        | Thundering herds and heartbeat monitoring are both covered.
       | 
       | If you need a messaging schema, I strongly recommend that you
       | check out CableReady. It's a powerful library for triggering
       | outcomes on the client. It ships with a large set of operations,
       | but adding custom operations is trivial.
       | 
       | https://cableready.stimulusreflex.com/hello-world/
       | 
       | While both ActionCable and CableReady are Rails libraries, other
       | frameworks would score huge wins if they adopted their client
       | libraries.
        
         | atul-jalan wrote:
          | Node has similar libraries like Socket.IO too, but they over-
          | abstract things a bit in my opinion.
        
           | hombre_fatal wrote:
           | I've done my share of building websocket servers from
            | scratch, but when you don't use libraries like ActionCable or
           | socket.io, you have to build your own MessageID
           | reconciliation so that you can have request/response cycles.
           | Which is generally what you want (or eventually want) in a
           | websocket-heavy application.
           | send(payload).then(reply => ...)
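            | 
            | For illustration, a minimal sketch of that reconciliation (the
            | `RpcSocket` name and message shape here are made up, not taken
            | from any particular library): every outgoing message carries an
            | `id`, and the reply that echoes the same `id` resolves the
            | pending promise.
            | 
            | type Pending = { resolve: (v: unknown) => void; reject: (e: Error) => void };
            | 
            | class RpcSocket {
            |   private nextId = 1;
            |   private pending = new Map<number, Pending>();
            | 
            |   constructor(private ws: WebSocket) {
            |     // Route each incoming reply to the promise created when the request was sent.
            |     ws.addEventListener("message", (ev) => {
            |       const msg = JSON.parse(ev.data as string);
            |       const waiter = this.pending.get(msg.id);
            |       if (!waiter) return;
            |       this.pending.delete(msg.id);
            |       msg.error ? waiter.reject(new Error(msg.error)) : waiter.resolve(msg.payload);
            |     });
            |   }
            | 
            |   send(payload: unknown, timeoutMs = 10_000): Promise<unknown> {
            |     const id = this.nextId++;
            |     this.ws.send(JSON.stringify({ id, payload }));
            |     return new Promise((resolve, reject) => {
            |       this.pending.set(id, { resolve, reject });
            |       // Reject if no reply ever arrives, so callers don't hang forever.
            |       setTimeout(() => {
            |         if (this.pending.delete(id)) reject(new Error(`request ${id} timed out`));
            |       }, timeoutMs);
            |     });
            |   }
            | }
            | 
            | // Usage: const rpc = new RpcSocket(new WebSocket("wss://example.invalid"));
            | // rpc.send({ op: "ping" }).then((reply) => console.log(reply));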
        
             | atul-jalan wrote:
             | Yep, for our application, we have an `executionId` that is
             | sent in essentially every single WebSocket message.
             | 
              | Both client and server use it to maintain a record of
              | events.
        
               | XorNot wrote:
               | Isn't this JSON-RPC's approach?
        
             | mirekrusin wrote:
             | Or just use jsonrpc.
        
               | moralestapia wrote:
               | ???
               | 
               | That solves none of the issues outlined in the post or
               | the comments.
        
               | crabmusket wrote:
               | It solves the very limited problem of bike-shedding
               | envelope shapes for request/reply protocols, which I
               | think was all they meant to say.
               | 
               | At its core, JSON-RPC boils down to "use `id` and
               | `method` and work the rest out", which is acceptably
               | minimal but does leave you with a lot of other issues to
               | deal with.
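                | 
                | For reference, the whole envelope fits in a few lines
                | (examples adapted from the JSON-RPC 2.0 spec):
                | 
                | // Request: needs both `id` and `method`; the response echoes the same `id`.
                | const request = { jsonrpc: "2.0", id: 1, method: "subtract", params: [42, 23] };
                | const response = { jsonrpc: "2.0", id: 1, result: 19 };
                | 
                | // Errors come back under the same `id` with a code and message.
                | const errorResponse = { jsonrpc: "2.0", id: 1, error: { code: -32601, message: "Method not found" } };
                | 
                | // A notification omits `id` entirely: no response is expected.
                | const notification = { jsonrpc: "2.0", method: "progress", params: { done: 0.4 } };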
        
               | mirekrusin wrote:
                | It's a bit of a misnomer because it defines rpcs _and_
               | notifications.
               | 
               | What people seem to be often missing for some reason is
               | that those two map naturally to existing semantics of the
               | programming language they're already using.
               | 
               | What it means in practice is that you are exposing and
               | consuming functions (ie. on classes) - just like you do
               | in ordinary libraries.
               | 
               | In js/ts context it usually means async functions on
               | classes annotated with decorators (to register method as
               | rpc and perform runtime assertions) that are event
               | emitters - all concepts already familiar to developers.
               | 
                | To summarize, you have a system that is easy to inspect,
                | reason about and use - almost like any other package in
                | your dependency tree.
               | 
                | Introducing backward compatible/incompatible changes also
                | becomes straightforward for everybody, ie. following
                | semver on the api surface just like in any ordinary
                | package you depend on.
               | 
                | Those straightforward facts are often missed and largely
                | underappreciated.
               | 
                | ps. in our systems we're introducing two deviations -
                | error codes can also be strings, not just numbers
                | (trivial); and we support async generators (emitting
                | individual objects for array results) - which helps with
                | head-of-line blocking issues for large result sets
                | (still compatible with jsonrpc at the protocol level,
                | although it would be nice if they supported it upstream
                | as a dedicated semantic in jsonrpc 2.1 or something).
                | They could also specify registering and unregistering
                | notification listeners at the spec level so everybody is
                | using the same scheme.
        
               | crabmusket wrote:
                | What do you see as the difference between an RPC and a
               | notification?
               | 
               | The terminology is not ideal, I grant, but a JSON-RPC
               | "notification" (a request with no id) is just a request
               | where the client cannot, and does not, expect any
               | response, not even a confirmation that the request was
               | received and understood by the server. It's like UDP
               | versus TCP.
               | 
               | > emitting individual objects for array results
               | 
               | This is interesting! How does this change the protocol? I
               | assume it's more than just returning multiple responses
               | for the same request?
        
               | mirekrusin wrote:
                | Yes, that's all there is to the difference: a remote
                | procedure call expects a response, a remote notification
                | doesn't.
                | 
                | Our implementation emits notifications for the individual
                | entries, and the rpc returns a done payload (the payload
                | itself is largely irrelevant; just the fact of completion
                | matters).
                | 
                | As I said, it would be nice if they'd support generator
                | functions at the protocol level.
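                | 
                | Something like this on the wire, to make it concrete (the
                | method and field names here are invented for the example,
                | not taken from the spec or any particular implementation):
                | 
                | // Client asks for a large result set; the id ties everything together.
                | const request = { jsonrpc: "2.0", id: 7, method: "users.list", params: { org: "acme" } };
                | 
                | // Server emits one notification per entry instead of buffering the whole array...
                | const entry = { jsonrpc: "2.0", method: "users.list#entry", params: { id: 7, user: { name: "Ada" } } };
                | 
                | // ...then completes the call with an ordinary response whose payload just signals "done".
                | const done = { jsonrpc: "2.0", id: 7, result: { done: true } };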
        
             | p_l wrote:
             | If you add Content-Negotiation it will have ALL the OSI
             | layers! /s
             | 
             | Honestly, I'm a little surprised and more than a bit
             | depressed how we effectively reinvent the OSI stack so
             | often...
        
             | dilyevsky wrote:
             | At this point why even use a websocket vs a normal
             | request/reply technology like grpc or json-rpc?
        
               | peteforde wrote:
               | For scenarios requiring a constant exchange of
               | information, such as streaming data or real-time updates.
               | After the initial handshake, data is exchanged directly
               | over the connection with minimal overhead. Lower latency
               | is especially beneficial for high-frequency message
               | exchanges. Gaming, live auctions, or real-time dashboards
               | are well suited. I also think that real time
               | collaboration is under-explored.
               | 
               | JSON-RPC is request-response only; the server cannot send
               | unsolicited messages. gRPC supports bidirectional
               | streaming, but I understand that setting it up is more
               | complex than WebSockets.
               | 
               | I will concede that horizontal scaling of RPC is easier
               | because there's no connection overhead.
               | 
               | Ultimately, it really depends on what you're trying to
               | build. I also don't underestimate the cultural aspect;
               | fair or not, JSON-RPC feels very "enterprise
               | microservices" to me. If you think in schemas, RPC might
               | be a good fit.
        
               | melchizedek6809 wrote:
               | Why can't the server send unsolicited messages in JSON-
               | RPC? I've implemented bidirectional JSON-RPC in multiple
               | projects and things work just fine, even going as far as
               | sharing most of the code.
        
               | crabmusket wrote:
               | Yep, the web server and client can both act as JSON-RPC
               | servers and clients. I've used this pattern before too
               | with Web Workers, where the main thread acts as both
               | client (sending requests to the worker) and server
               | (fielding requests from the worker).
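                | 
                | A stripped-down, hand-rolled version of that symmetry (the
                | `RpcEndpoint` and `Port` shapes are invented here, not any
                | library's API): both sides keep a pending map and a handler
                | map over the same postMessage channel.
                | 
                | type Port = { postMessage(msg: unknown): void; onmessage: ((ev: MessageEvent) => void) | null };
                | 
                | // One endpoint works for both sides: the main thread wraps the Worker,
                | // the worker wraps `self`. Each side can expose methods and call the other's.
                | class RpcEndpoint {
                |   private nextId = 1;
                |   private pending = new Map<number, (result: unknown) => void>();
                | 
                |   constructor(private port: Port, private handlers: Record<string, (params: unknown) => Promise<unknown>>) {
                |     port.onmessage = async (ev) => {
                |       const msg = ev.data;
                |       if (msg.method) {
                |         // Acting as the "server": run the handler and send a response back.
                |         const result = await this.handlers[msg.method]?.(msg.params);
                |         if (msg.id !== undefined) this.port.postMessage({ id: msg.id, result });
                |       } else if (msg.id !== undefined) {
                |         // Acting as the "client": resolve the pending call.
                |         this.pending.get(msg.id)?.(msg.result);
                |         this.pending.delete(msg.id);
                |       }
                |     };
                |   }
                | 
                |   call(method: string, params?: unknown): Promise<unknown> {
                |     const id = this.nextId++;
                |     this.port.postMessage({ id, method, params });
                |     return new Promise((resolve) => this.pending.set(id, resolve));
                |   }
                | }
                | 
                | // Main thread: new RpcEndpoint(new Worker("worker.js"), { "ui.notify": async (p) => { /* ... */ } })
                | // Worker:      new RpcEndpoint(self as any, { "crunch": async (p) => { /* ... */ return p; } })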
        
         | hinkley wrote:
         | Elixir's lightweight processes are also a good fit. Though I've
         | seen some benchmarks that claim that goroutines can hit even
         | lower overhead per connection.
        
           | ramchip wrote:
            | That makes sense; Erlang/Elixir processes are a much higher-
            | level construct than goroutines, and they trade off
            | performance for fault tolerance and observability.
           | 
           | As an example, with a goroutine you have to be careful to
           | handle all errors, because a panic would take down the whole
           | service. In Elixir a websocket handler can crash anywhere
            | without impacting the application. This comes at a cost: to
            | make this safe, Elixir has to isolate processes so they
            | don't share memory, which means each process has its own
            | individual heap and data gets copied around more often than
            | in Go.
        
             | flakes wrote:
             | > As an example, with a goroutine you have to be careful to
             | handle all errors, because a panic would take down the
             | whole service.
             | 
             | Unless you're the default `net/http` library and simply
             | recover from the panic: https://github.com/golang/go/blob/m
             | aster/src/net/http/server...
        
               | neillyons wrote:
               | You still need to be careful, as this won't catch panics
                | from goroutines launched from your http handler.
        
               | ramchip wrote:
               | Yeah, I'm simplifying a bit. It may not cause an
               | immediate exit, but it can leave the service broken in
               | unpredictable ways. See this discussion for instance:
               | https://iximiuz.com/en/posts/go-http-handlers-panic-and-
               | dead...
        
       | exabrial wrote:
       | I recall another complication with websockets: IIRC it's with
        | proxy load balancers, like binding a connection to a single
        | backend server, even if the backend connection is using
       | HTTP/2. I probably have the details wrong. I'm sure someone will
       | correct my statement.
        
         | hpx7 wrote:
         | Horizontal scaling is certainly a challenge. With traditional
         | load balancers, you don't control which instance your clients
         | get routed to, so you end up needing to use message brokers or
         | stateful routing to ensure message broadcasts work correctly
         | with multiple websocket server instances.
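          | 
          | The broker version of that, sketched with `ws` and `ioredis`
          | (the channel name and port are illustrative): each instance
          | publishes broadcasts to Redis and relays anything it receives
          | to its own locally connected sockets.
          | 
          | import { WebSocketServer, WebSocket } from "ws";
          | import Redis from "ioredis";
          | 
          | const wss = new WebSocketServer({ port: 8080 });
          | const sub = new Redis(); // subscriber connection
          | const pub = new Redis(); // separate connection for publishing
          | 
          | sub.subscribe("room:lobby");
          | 
          | // Whatever any instance publishes, every instance relays to its own local sockets.
          | sub.on("message", (_channel, payload) => {
          |   for (const client of wss.clients) {
          |     if (client.readyState === WebSocket.OPEN) client.send(payload);
          |   }
          | });
          | 
          | // A broadcast from a locally connected client goes through Redis,
          | // so clients attached to other instances see it too.
          | wss.on("connection", (socket) => {
          |   socket.on("message", (data) => pub.publish("room:lobby", data.toString()));
          | });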
        
         | atul-jalan wrote:
         | I think there is a way to do it, but it likely involves custom
         | headers on the initial connection that the load balancer can
         | read to route to the correct origin server.
         | 
         | I imagine the way it might go is that the client would first
         | send an HTTP request to an endpoint that returns routing
         | instructions, and then use that in the custom headers it sends
         | when initiating the WebSocket connection.
         | 
         | Haven't tried this myself though.
        
         | arccy wrote:
         | I think it's more that WebSockets are held open for a long
         | time, so if you're not careful, you can get "hot" backends with
         | a lot of connections that you can't shift to a different
         | instance. It can also be harder to rotate backends since you
         | know you are disrupting a large number of active clients.
        
           | dboreham wrote:
           | The trick to doing this efficiently is to arrange for the
           | live session state to be available (through replication or
           | some data bus) at the alternative back end before cut over.
        
           | superjan wrote:
           | Assuming you control the client code, you can periodically
           | disconnect and reconnect. This could also simplify
           | deployment.
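            | 
            | A rough client-side version of that (the interval, jitter and
            | URL are arbitrary placeholders): reconnect on a timer so that
            | long-lived connections get a chance to land on newer or less
            | loaded backends.
            | 
            | const MAX_SESSION_MS = 30 * 60 * 1000; // arbitrary cap on connection lifetime
            | 
            | function connect(url: string): void {
            |   const ws = new WebSocket(url);
            | 
            |   // Add jitter so a fleet of clients doesn't reconnect in lockstep
            |   // and create the very thundering herd this is meant to avoid.
            |   const lifetime = MAX_SESSION_MS * (0.75 + Math.random() * 0.5);
            |   const timer = setTimeout(() => ws.close(1000, "planned reconnect"), lifetime);
            | 
            |   ws.addEventListener("close", () => {
            |     clearTimeout(timer);
            |     // Back off briefly, then establish a fresh connection (possibly to a new backend).
            |     setTimeout(() => connect(url), 1000 + Math.random() * 2000);
            |   });
            | }
            | 
            | connect("wss://example.invalid/realtime");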
        
       | 10000truths wrote:
       | The key to managing this complexity is to avoid mixing transport-
       | level state with application-level state. The same approach for
       | scaling HTTP requests also works for scaling WebSocket
       | connections:
       | 
       | * Read, write and track all application-level state in a
       | persistent data store.
       | 
       | * Identify sessions with a session token so that application-
       | level sessions can span multiple WebSocket connections.
       | 
       | It's a lot easier to do this if your application-level protocol
       | consists of a single discrete request and response (a la RPC).
       | But you can also handle unidirectional/bidirectional streaming,
       | as long as the stream states are tracked in your data store and
       | on the client side.
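        | 
        | A sketch of the second point on the server side (using `ws` here;
        | the store calls are stand-ins for whatever persistent store you
        | use): the session token, not the socket, is the unit of identity,
        | so a reconnect picks up where the old connection left off.
        | 
        | import { WebSocketServer } from "ws";
        | 
        | // Stand-in for a persistent store (Redis, Postgres, ...): session state
        | // must survive the death of any individual connection or server instance.
        | const store = {
        |   async loadSession(token: string) { return { lastSeenSeq: 0 }; },
        |   async saveSession(token: string, state: unknown) { /* persist */ },
        | };
        | 
        | const wss = new WebSocketServer({ port: 8080 });
        | 
        | wss.on("connection", async (socket, req) => {
        |   // The client presents the same token on every (re)connection,
        |   // e.g. as a query parameter on the upgrade request.
        |   const token = new URL(req.url ?? "/", "http://placeholder").searchParams.get("session");
        |   if (!token) return socket.close(4001, "missing session token");
        | 
        |   const session = await store.loadSession(token);
        | 
        |   // Replay anything the client missed since its last acknowledged sequence
        |   // number, then continue streaming; state never lives only in this process.
        |   socket.send(JSON.stringify({ type: "resume", from: session.lastSeenSeq }));
        | 
        |   socket.on("message", async (data) => {
        |     const msg = JSON.parse(data.toString());
        |     if (msg.type === "ack") await store.saveSession(token, { lastSeenSeq: msg.seq });
        |   });
        | });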
        
         | magicalhippo wrote:
          | There's another thread going[1] right now which advocates
          | very similar things, in order to reduce complexity when
          | dealing with distributed systems.
          | 
          | Then again, the frontend and backend are a distributed
          | system, so it's not that weird that one arrives at similar
          | conclusions.
         | 
         | [1]: https://news.ycombinator.com/item?id=42813049 _Every
         | System is a Log: Avoiding coordination in distributed
         | applications_
        
         | hinkley wrote:
          | Functional core, imperative shell makes testing and fast
          | iteration a lot easier. It's best if your business logic knows
         | very little about transport mechanisms.
         | 
         | I think part of the problem is that early systems wanted to
         | eagerly process requests while they are still coming in. But in
         | a system getting 100s of requests per second you get better
         | concurrency if you wait for entire payloads before you waste
         | cache lines on attempting to make forward progress on
         | incomplete data. Which means you can divorce the concept of a
         | payload entirely from how you acquired it.
        
           | ignoramous wrote:
           | > _system getting 100s of requests per second you get better
           | concurrency if you wait for entire payloads before you waste
           | cache lines_
           | 
           | At what point should one scale up & switch to chips with
           | embedded DRAMs ("L4 cache")?
        
             | hinkley wrote:
             | I haven't been tracking price competitiveness on those.
             | What cloud providers offer them?
             | 
             | But you don't get credit for having three tasks halfway
             | finished instead of one task done and two in flight. Any
             | failover will have to start over with no forward progress
             | having been made.
             | 
             | ETA: while the chip generation used for EC2 m7i instances
             | can have L4 cache, I can't find a straight answer about
             | whether they do or not.
             | 
             | What I can say is that for most of the services I
             | benchmarked at my last gig, M7i came out to be as expensive
             | per request as the m6's on our workload (AMD's was more
             | expensive). So if it has L4 it ain't helping. Especially at
             | those price points.
        
       | SerCe wrote:
       | I wrote about the way we handle WebSocket connections at Canva a
       | while ago [1]. Even though some small things have changed here
       | and there since the post was published, the overall approach has
       | held up pretty well handling many millions of concurrent
       | connections.
       | 
        | That said, even with great framework-level support, it's much,
        | much harder to build streaming functionality compared to plain
        | request/response if you've got some notion of a "session".
       | 
       | [1]: https://www.canva.dev/blog/engineering/enabling-real-time-
       | co...
        
         | crabmusket wrote:
          | > it's much, much harder to build streaming functionality
          | compared to plain request/response if you've got some notion
          | of a "session"
         | 
          | This touches something that I think is starting to become
          | understood: the concept of a "session backend" to address this
          | kind of use case.
          | 
          | See the complexity of disaggregating a live session backend on
          | AWS versus CloudFlare:
         | https://digest.browsertech.com/archive/browsertech-digest-cl...
         | 
         | I wrote about session backends as distinct from durable
         | execution: https://crabmusket.net/2024/durable-execution-
         | versus-session...
        
       | dilyevsky wrote:
       | > WebSocket connections can be unexpectedly blocked, especially
       | on restrictive public networks.
       | 
        | What? How would a public network even know you're running a
        | websocket if you're using TLS? I don't think it's really
        | possible in the general case.
       | 
       | > Since SSE is HTTP-based, it's much less likely to be blocked,
       | providing a reliable alternative in restricted environments.
       | 
       | And websockets are not http-based?
       | 
        | What the article describes as challenges seem like very
        | pedestrian things that any rpc-based backend needs to solve.
       | 
        | The real reason websockets are hard to scale is because they pin
        | state to a particular backend replica, so if a whole bunch of
        | them disconnect _at scale_ the system might run out of resources
        | trying to re-load all that state.
        
         | atul-jalan wrote:
         | The initial handshake will usually include an `Upgrade:
         | websocket` header, which can be inspected by networks.
        
           | dilyevsky wrote:
              | No, it literally cannot be, because by the time the Upgrade
              | header appears the connection is already encrypted.
        
             | jauco wrote:
             | Restricted environments in larger corporations can do a
             | full mitm proxy
        
               | PhilipRoman wrote:
               | Eh, if you're dealing with corporate network proxies all
               | bets are already off. They keep blocking connections for
               | the most random reasons until everyone is opening ssh
               | tunnels just to get work done. Websockets are a standard
               | feature of the web, if you cut off your ears don't
                | complain about loss of hearing. Unless you're explicitly
               | targeting such corporations as clients, in which case -
               | my condolences.
        
               | klabb3 wrote:
               | It's not a very good man-in-the-middle if it can't handle
               | a ubiquitous protocol from 2011 based on http/1.1. More
               | like an incompetent bureaucrat in the middle.
        
         | pk-protect-ai wrote:
          | I agree here. I have experience scaling a WebSockets server
          | to 20M connections on a single machine (with this one:
          | https://github.com/ITpC/LAppS.git). However, there are several
          | issues with scaling WebSockets on the backend as well: mutex
          | locking, the non-parallel XOR of the input stream, utf8
          | validation. I do not know the state of the above repository's
          | code; it seems it hasn't been updated in at least 5 years.
          | There were bugs in HTTP parsing in the client part for some
          | cases. Still, vertical scalability was excellent. Sad this
          | thing never reached production state.
        
           | klabb3 wrote:
           | > non-parallel XOR of input stream
           | 
           | I remember this one in particular making me upset, simply
           | because of another extra buffer pass for security reasons
           | that I believe are only to prevent proxies doing shit they
           | never should have done in the first place?
        
       | notatoad wrote:
       | for me, the most important lesson i've learned when using
       | websockets is to _not_ use them whenever possible.
       | 
       | i don't hate them, they're great for what they are, but they're
       | for realtime push of small messages only. trying to use them for
       | the rest of your API as well just throws out all the great things
       | about http - like caching and load balancing, and just normal
        | request/response architecture. while you can use websockets for
        | that, it's only going to cause you headaches that are already
        | solved by simply using a normal http api for the vast majority
        | of your api.
        
       | austin-cheney wrote:
        | The only complexity I have found with regard to scaling
        | WebSockets is knowing the minimum delay between flush event
        | completion and actual message completion at the destination. It
        | takes longer to process a message, even over IPC routing, than
        | it does to kill a socket. That has upstream consequences when
        | you consider redirection and message pipes between multiple
        | sockets. If you kill a socket too early after a message is
        | flushed from it, there is a good chance the destination sees
        | the socket collapse before it has processed the final message
        | off the socket, and that processing delay is not something a
        | remote location is easily aware of.
       | 
        | I have found that, for safety, you need to allow an arbitrary
        | delay of 100ms before killing sockets to ensure message
        | completion, which is likely why the protocol imposes a round
        | trip of the control frame with opcode 8 (Close) before closing
        | the connection the right way.
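        | 
        | In browser/`ws` terms, that round trip is what you get by calling
        | close() and waiting for the close event rather than destroying
        | the socket outright, with a timeout as a backstop (the delay
        | value below is arbitrary):
        | 
        | function closeGracefully(ws: WebSocket, timeoutMs = 1000): Promise<void> {
        |   return new Promise((resolve) => {
        |     // close() sends a Close frame (opcode 0x8); the peer echoes it back
        |     // once it has drained and processed everything still in flight.
        |     ws.addEventListener("close", () => resolve(), { once: true });
        |     ws.close(1000, "done");
        | 
        |     // If the peer never completes the handshake, give up after a grace
        |     // period rather than holding the connection open forever.
        |     setTimeout(resolve, timeoutMs);
        |   });
        | }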
        
       | tbarbugli wrote:
       | > At Compose, every WebSocket message starts with a fixed 2-byte
       | type prefix for categorizing messages.
       | 
        | some of the complexity is self-inflicted by ignoring the KISS
        | principle
        
         | atul-jalan wrote:
         | How would you make it simpler?
        
       | andy_ppp wrote:
       | Elixir will get you pretty far along this scaling journey without
       | too many problems:
       | 
       | https://hexdocs.pm/phoenix/channels.html
        
         | cultofmetatron wrote:
         | > Elixir will get you pretty far along this scaling journey
         | without too many problems:
         | 
         | been running a phoenix app in prod for 5 years. 1000+ paying
         | customers. heavy use of websockets. never had an issue with the
          | channels system. it does what it says on the tin and works
         | great right out of the box
        
       | Rldm840 wrote:
       | Many years ago, we used to start a streaming session with an http
       | request, then upgrading to websockets after obtaining a response
       | (this was our original "StreamSense" mechanism). In recent years,
       | we changed StreamSense to go websocket first and fallback to http
       | streaming or http long polling in case of issues. At
       | Lightstreamer, we started streaming data 25 years ago over http,
        | then moving to websockets. We've seen so many different behaviors
        | in the wild internet and got so much feedback from the field over
        | these decades that we believe our current version of
       | Lightstreamer includes heuristics and mechanisms for virtually
       | every possible aspect of websockets that could go wrong. From
       | massive disconnections and reconnections, to enterprise proxies
       | with deep inspections, to mobile users continuously switching
        | networks. I recall when a big customer required us to support one
        | million live websocket connections per (mid-sized) server while
        | keeping latency low. It was challenging but forced us to come up
       | with a brand new internal architecture. So many stories to tell
       | covering 25 years of evolution...
        
         | beng-nl wrote:
         | Sounds like you have an interesting book to write :-)
         | 
         | Eg "The 1M challenge"
        
           | Rldm840 wrote:
           | Nice idea for my retirement (not in the very short term)!
        
       | yasserf wrote:
       | I have been working on an idea/Node.js library called
       | vramework.dev recently, and a big part of it focuses on
       | addressing the main complexities mentioned below.
       | 
       | For a bit of background, in order to tackle scalability, the
       | initial approach was to explore serverless architecture. While
       | there are both advantages and disadvantages to serverless, a
       | notable issue with WebSockets on AWS* is that every time a
       | message is received, it invokes a function. Similarly, sending a
       | message to a WebSocket requires invoking an HTTP call to their
       | gateway with the websocket / channel id.
       | 
       | The upside of this approach is that you get out-of-the-box
       | scalability by dividing your code into functions and building
       | things in a distributed fashion. The downside is latency, due to
       | all the extra network hops.
       | 
       | This is where vramework comes in. It allows you to define a few
       | functions (e.g., onConnect, onDisconnect, onCertainMessage) and
       | provides the flexibility to run them locally using libraries like
       | uws, ws, or socket.io, or deploy them in the cloud via AWS or
       | Cloudflare (currently supported).
       | 
       | When running locally, the event bus operates locally as well,
       | eliminating latency issues. If you apply the same framework to
       | serverless, latency increases, but you gain scalability for free.
       | 
       | Additionally, vramework provides the following features:
       | 
       | - Standard Tooling
       | 
        | Each message is validated against its typescript signature at
        | runtime. Any errors are caught and sent to the client. (Note: the
        | error-handling mechanism has not yet been given much thought as
        | an API.) Rate limiting is also incorporated as part of the
        | permissioning system (each message can have permissions checked,
        | one of which could be rate limiting).
       | 
       | - Per-Message Authentication
       | 
       | It guards against abuse by ensuring that each message is valid
       | for the user before processing it. For example, you can configure
       | the framework to allow unauthenticated messages for certain
       | actions like authentication or ping/pong, while requiring
       | authentication for others.
       | 
       | - User Sessions
       | 
       | Another key feature is the ability to associate each message with
       | a user session. This is essential not only for authentication but
        | also for the actual functionality of the application. This is
        | done by (optionally) calling a cache that returns the user
        | session associated with the websocket. This session can be
        | updated during the websocket's lifetime if needed (if your
        | protocol deals with auth as part of its messages and not on
        | connection).
       | 
       | Some doc links:
       | 
       | https://vramework.dev/docs/channels/channel-intro
       | 
       | A post that explains vramework.dev a bit more in depth (linked
       | directly to a code example for websockets):
       | 
       | https://presentation.vramework.dev/#/33/0/5
       | 
        | And one last thing: it also produces a fully typed websocket
        | client when using routes (where a property in your message
        | indicates which function to use, the approach AWS uses for
        | serverless).
       | 
       | Would love to get thoughts and feedback on this!
       | 
       | edit: *and potentially Cloudflare, though I'm not entirely sure
       | of its internal workings, just the Hibernation server and
       | optimising for cost saving
        
         | yasserf wrote:
          | const onConnect: ChannelConnection<'hello!'> = async (services, channel) => {
          |   // On connection (like onOpen)
          |   channel.send('hello') // This is checked against the input type
          | }
          | 
          | const onDisconnect: ChannelDisconnection = async (services, channel) => {
          |   // On close
          |   // This can't send anything since channel closed
          | }
          | 
          | const onMessage: ChannelMessage<'hello!' | { name: string }, 'hey'> =
          |   async (services, channel) => {
          |     channel.send('hey')
          |   }
          | 
          | export const subscribeToLikes: ChannelMessage<
          |   { talkId: string; action: 'subscribeToLikes' },
          |   { action: string; likes: number }
          | > = async (services, channel, { action, talkId }) => {
          |   const channelName = services.talks.getChannelName(talkId)
          |   // This is a service that implements a pubsub/eventhub interface
          |   await services.eventHub.subscribe(channelName, channel.channelId)
          |   // we return the action since the frontend can use it to route to
          |   // specific listeners as well (this could be absorbed by vrameworks
          |   // runtime in future)
          |   return { action, likes: await services.talks.getLikes(talkId) }
          | }
          | 
          | addChannel({
          |   name: 'talks',
          |   route: '/',
          |   auth: true,
          |   onConnect,
          |   onDisconnect,
          |   // Default message handler
          |   onMessage,
          |   // This will route the message to the correct function if a property
          |   // action exists with the value subscribeToLikes (or otherwise)
          |   onMessageRoute: {
          |     action: {
          |       subscribeToLikes: {
          |         func: subscribeToLikes,
          |         permissions: {
          |           isTalkMember: [isTalkMember, isNotPresenter],
          |           isAdmin
          |         },
          |       },
          |     },
          |   },
          | })
         | 
         | A code example.
         | 
         | Worth noting you can share functions across websockets as well,
         | which allows you to compose logic across different ones if
         | needed
        
       | jwr wrote:
       | My SaaS has been using WebSockets for the last 9 years. I plan to
       | stop using them and move to very simple HTTP-based polling.
       | 
       | I found that scalability isn't a problem (it rarely is these
       | days). The real problem is crappy network equipment all over the
       | world that will sometimes break websockets in strange and
       | mysterious ways. I guess not all network equipment vendors test
       | with long-lived HTTP websocket connections with plenty of data
       | going over them.
       | 
       | At a certain scale, this results in support requests, and
       | frustratingly, I can't do anything about the problems my
       | customers encounter.
       | 
        | The other problems are smaller but still annoying; for example,
        | it isn't easy to compress content transmitted through websockets.
        
         | tomrod wrote:
         | Found this same issue trying to scale streamlit. It's just not
         | a good idea.
        
         | bob1029 wrote:
         | The last project I worked on went in the same direction.
         | 
         | Everything works great in local/qa/test, and then once we move
         | to production we inevitably have customers with super weird
         | network security arrangements. Users in branch offices on WiFi
         | hardware installed in 2007. That kind of thing.
         | 
         | When you are building software for other businesses to use, you
         | need to keep it simple or the customer will make your life
         | absolutely miserable.
        
         | yesbabyyes wrote:
         | I always recommend looking at Server-Sent Events [0] and
          | EventSource [1]. It's a standardization of old-style long-
          | polling; it maps very well onto the HTTP paradigm and is built
          | into the web standards.
         | 
         | It's so much easier to reason about than websockets, and a
         | naive server side implementation is very simple.
         | 
          | A caveat is to only use them with HTTP/2 and/or client-side
          | logic that keeps a single connection open to the server,
          | because of browser limits on simultaneous requests to the same
          | origin.
         | 
         | [0] https://developer.mozilla.org/en-US/docs/Web/API/Server-
         | sent... [1] https://developer.mozilla.org/en-
         | US/docs/Web/API/EventSource
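          | 
          | Client side, the whole API surface is roughly this (the endpoint
          | and event names are invented for the example); reconnection and
          | Last-Event-ID resumption come for free:
          | 
          | const source = new EventSource("/api/updates");
          | 
          | // Unnamed events arrive via onmessage; the browser reconnects automatically
          | // and resends the last received event id in the Last-Event-ID header.
          | source.onmessage = (ev) => console.log("update:", ev.data);
          | 
          | // Named events ("event: notification" on the wire) get their own listeners.
          | source.addEventListener("notification", (ev) => {
          |   console.log("notification:", (ev as MessageEvent).data);
          | });
          | 
          | source.onerror = () => {
          |   // Fired on connection loss; the browser keeps retrying on its own.
          | };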
        
         | akshayKMR wrote:
         | What are the typical payload sizes in your WebSocket messages?
         | Could you share the median and p99 values?
         | 
         | I've also discovered similar networking issues in my own
         | application while traveling. For example, in Vietnam right now,
         | I was facing recurring issues like long connection
         | establishment times and loss of responsiveness mid-operation. I
         | thought I was losing my mind - I even configured Caddy to not
         | use HTTP3/QUIC (some networks don't like UDP).
         | 
         | I moved some chunkier messages in my app to HTTP requests, and
         | it has become much more stable (though still iffy at times).
        
         | catlifeonmars wrote:
         | This is surprising to me as I would expect network equipment to
          | just see a TCP connection, given that both HTTP and WebSockets
          | are application-layer protocols and that long-lived TCP connections
         | are quite ubiquitous (databases, streaming services, SSH, etc).
        
       | dartos wrote:
       | Question for those in the know:
       | 
       | Why would I use websockets over SSE?
        
         | sisk wrote:
         | Websockets are bidirectional while SSE is unidirectional
         | (server to client). That said, there's nothing stopping you
          | from facilitating client-to-server communication separately
          | from SSE; you just don't have to build that channel with
          | websockets.
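          | 
          | For example (the paths here are made up): server-to-client
          | traffic over one SSE stream, client-to-server traffic as plain
          | HTTP requests.
          | 
          | // Downstream: one long-lived SSE connection for pushes from the server.
          | const events = new EventSource("/api/stream");
          | events.onmessage = (ev) => render(JSON.parse(ev.data));
          | 
          | // Upstream: ordinary HTTP requests, which keep caching, load balancing
          | // and the usual request/response tooling intact.
          | async function sendAction(action: unknown): Promise<void> {
          |   await fetch("/api/actions", {
          |     method: "POST",
          |     headers: { "Content-Type": "application/json" },
          |     body: JSON.stringify(action),
          |   });
          | }
          | 
          | function render(state: unknown): void {
          |   console.log("new state from server:", state); // placeholder for real UI code
          | }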
        
       | jFriedensreich wrote:
        | I am really unsure why devs around the world keep defaulting to
        | websockets for things that are made for server-sent events. In
        | 90% of the use cases I see, websockets are just not the right
        | fit. Everything is simpler and easier with SSE. Some exceptions
        | are high-throughput >BI<directional data streams. But even if,
        | eg. your synced multiplayer cursors in something like Figma use
        | websockets, don't use them for everything else, eg. your
        | notification updates.
        
       | Sytten wrote:
        | The comment about Render/Railway gracefully transferring
        | connections seems weird? I am pretty sure it just kills the
        | service after the new one is alive, which will kill the
        | connections. Not some fancy zero-downtime reconnect.
        
       ___________________________________________________________________
       (page generated 2025-01-25 23:01 UTC)