[HN Gopher] Back to basics: Why we chose long-polling over webso...
___________________________________________________________________
Back to basics: Why we chose long-polling over websockets
Author : lunarcave
Score : 197 points
Date : 2025-01-05 07:14 UTC (15 hours ago)
(HTM) web link (www.inferable.ai)
(TXT) w3m dump (www.inferable.ai)
| CharlieDigital wrote:
| Would there be any technical benefit to this over using server
| sent events (SSE)?
|
| Both are similar in that they hold the HTTP connection open and
| have the benefit of being simply HTTP (the big plus here). SSE
| (at least to me) feels like it's more suitable for some use cases
| where updates/results could be streamed in.
|
| A fitting use case might be where you're monitoring all job IDs
| on behalf of a given client. Then you could move the job
| monitoring loop to the server side and continuously yield results
| to the client.
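|
| Something like this on the server (a sketch; getJobStatuses()
| stands in for whatever job store you actually have):
|
|     // Node.js: stream job status updates to a client over SSE
|     import { createServer } from "node:http";
|
|     createServer(async (req, res) => {
|       res.writeHead(200, {
|         "Content-Type": "text/event-stream",
|         "Cache-Control": "no-cache",
|       });
|       let open = true;
|       req.on("close", () => { open = false; });
|       while (open) {
|         const jobs = await getJobStatuses(/* client id */);
|         res.write(`data: ${JSON.stringify(jobs)}\n\n`);
|         await new Promise((r) => setTimeout(r, 1000));
|       }
|     }).listen(3000);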
| lunarcave wrote:
| Good point! We did consider SSE, but ultimately decided against
| it due to the way we have to re-implement response payloads
| (one for application/json and one for text/event-stream).
|
| I've not personally witnessed this, but people on the internets
| have said that _some_ proxies/LBs have problems with SSE due to
| the way they buffer responses.
| gorjusborg wrote:
| > we have to re-implement response payloads (one for
| application/json and one for text/event-stream)
|
| I am curious about what you mean here. The 'text/event-
| stream' allows for arbitrary event formats, it just provides
| structure for EventSource to be able to parse.
|
| You should only need one 'text/event-stream' and should be
| able to send the same JSON via a normal or an SSE response.
| josephg wrote:
| What the GP commenter might have meant is that websockets
| support binary message bodies. SSE does not.
| vindex10 wrote:
| I got interested and found this nice thread on SO:
| https://stackoverflow.com/a/5326159
|
| One of the drawbacks, as I learned: SSE connections count
| toward the browser's limit of ~6 open connections per domain.
| This can quickly become a limiting factor when you open the
| same web page in multiple tabs.
| bn-l wrote:
| ...if you're using http/1.1. It's not an issue with 2+
| Klonoar wrote:
| Not an issue if you're using HTTP/2 due to how multiplexing
| happens.
| CharlieDigital wrote:
| As the other two comments mentioned, this is a restriction
| with HTTP/1.1, and it would apply to long polling connections
| as well.
| _heimdall wrote:
| Syncing state across multiple tabs and windows is always a
| bit tricky. For SSE, I'd probably reach for the
| BroadcastChannel API. Open the SSE connection in the first
| tab and have it broadcast events to any other open tab or
| window.
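|
| A rough sketch (glossing over electing which tab owns the
| connection; "job-updates" and render() are made-up names):
|
|     // In the tab that owns the SSE connection
|     const channel = new BroadcastChannel("job-updates");
|     const source = new EventSource("/events");
|     source.onmessage = (e) => channel.postMessage(e.data);
|
|     // In every other tab/window
|     const channel = new BroadcastChannel("job-updates");
|     channel.onmessage = (e) => render(JSON.parse(e.data));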
| bioneuralnet wrote:
| I've gotten around this by using the Page Visibility API -
| https://developer.mozilla.org/en-
| US/docs/Web/API/Page_Visibi.... Close the SSE connection when
| the page is hidden, and re-open when it becomes visible
| again.
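|
| Roughly (a sketch; handleUpdate() is whatever your app does
| with an event):
|
|     let source = null;
|     function open() {
|       source = new EventSource("/events");
|       source.onmessage =
|         (e) => handleUpdate(JSON.parse(e.data));
|     }
|     document.addEventListener("visibilitychange", () => {
|       if (document.hidden) {
|         source?.close();
|         source = null;
|       } else if (!source) {
|         open();
|       }
|     });
|     if (!document.hidden) open();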
| csumtin wrote:
| I tried using SSE and found it didn't work for my use case, it
| was broken on mobile. When users switched from the browser to
| an app and back, the SSE connection was broken and they
| wouldn't receive state updates. Was easier to do long polling
| CharlieDigital wrote:
| Not sure about the default `EventSource` object in
| JavaScript, but the Microsoft polyfill that I use
| (https://github.com/Azure/fetch-event-source) supports `POST`
| and there's an option `openWhenHidden` which controls how it
| reacts to users tabbing away.
| josephg wrote:
| The standard way to fix that is to send ping messages every
| ~15 seconds or something over the SSE stream. If the client
| doesn't get a ping in any 20 second window, assume the sse
| stream is broken somehow and restart it. It's complex but it
| works.
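|
| On the client, that watchdog can be as small as this (sketch;
| assumes the server sends a "ping" event with a data field
| every ~15s, handle() is made up):
|
|     let source, watchdog;
|     function resetWatchdog() {
|       clearTimeout(watchdog);
|       watchdog = setTimeout(() => {
|         source.close();
|         connect(); // no ping for 20s: assume broken, restart
|       }, 20_000);
|     }
|     function connect() {
|       source = new EventSource("/events");
|       source.addEventListener("ping", resetWatchdog);
|       source.onmessage = (e) => { resetWatchdog(); handle(e); };
|       resetWatchdog();
|     }
|     connect();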
|
| The big downside of sse in mobile safari - at least a few
| years ago - is you got a constant loading spinner on the
| page. Thats bad UX.
| bartvk wrote:
| Refreshing to be reminded of a relatively simple alternative to
| websockets. For a short time, I worked at a now-defunct startup
| which had made the decision for websockets. It was an app that
| would often be used on holiday so testing was done on hotel and
| restaurant wifi. Websockets made that difficult.
| ipnon wrote:
| I feel like WebSockets are already as simple as it gets. It's
| "just" an HTTP request with an indeterminate body. Just make an
| HTTP request and don't close the connection. That's a
| WebSocket.
| bluepizza wrote:
| It's surprisingly complex.
|
| Connections are dropped all the time, and then your code, on
| both client and server, needs to account for retries (will the
| reconnection use a cached DNS entry? how will load balancing
| affect long term connections?), potentially missed events
| (now you need a delta between pings), DDoS protections (is
| this the same client connecting from 7 IPs in a row or is
| this a botnet), and so on.
|
| Regular polling greatly reduces complexity on some of these
| points.
| slau wrote:
| Long polling has nearly all the same disadvantages.
| Disconnections are harder to track, DNS works exactly the
| same for both techniques, as does load balancing, and DDoS
| is specifically about different IPs trying to DoS your
| system, not the same IP creating multiple connections, so
| irrelevant to this discussion.
|
| Yes, WS is complex. Long polling is not much better.
|
| I can't help but think that if front end connections are
| destroying your database, then your code is not structured
| correctly. You can accept both WS and long polls without
| touching your DB, having a single dispatcher then send the
| jobs to the waiting connections.
| bluepizza wrote:
| My understanding is that long polling has these issues
| handled by assuming the connection will be regularly
| dropped.
|
| Clients using mobile phones tend to have their IPs
| rapidly changed in sequence.
|
| I didn't mention databases, so I can't comment on that
| point.
| josephg wrote:
| Well, it's the same in both cases. You need to handle
| disconnection and reconnection. You need a way to
| transmit missed messages, if that's important to you.
|
| But websockets also guarantee in-order delivery, which is
| never guaranteed by long polling. And websockets play way
| better with intermediate proxies - since nothing in the
| middle will buffer the whole response before delivering
| it. So you get better latency and better wire efficiency.
| (No http header per message).
| bluepizza wrote:
| That very in-order guarantee is the issue. The server can't
| know exactly where the connection died, which means that the
| client must report the last update it received, and the
| server must then crawl back through a log to find the pending
| messages and redispatch them.
|
| At this point, long polling seems to carry more benefits,
| IMHO. WebSockets seem to be excellent for stable
| conditions, but not quite what we need for mobile.
| sn0wtrooper wrote:
| If a connection is closed, isn't it the browser's
| responsibility to resolve DNS when you open it again?
| valenterry wrote:
| Using websockets with graphql, I feel like a lot of the
| challenges are then already solved. From the post:
|
| - Observability: WebSockets are more stateful, so you need to
| implement additional logging and monitoring for persistent
| connections: solved with graphql if the existing monitoring is
| already sufficient.
|
| - Authentication: You need to implement a new authentication
| mechanism for incoming WebSocket connections: solved with
| graphql.
|
| - Infrastructure: You need to configure your infrastructure to
| support WebSockets, including load balancers and firewalls: True,
| firewalls need to be updated.
|
| - Operations: You need to manage WebSocket connections and
| reconnections, including handling connection timeouts and errors:
| normally already solved by the graphql library. For errors, it's
| basically the same though.
|
| - Client Implementation: You need to implement a client-side
| WebSocket library, including handling reconnections and state
| management: Just have to _use_ a graphql library that comes with
| websocket support (I think most of them do) and configure it
| accordingly.
| anonzzzies wrote:
| I'd hope (I've never needed this) there are client
| implementations that do all of this for you and pick the best
| transport based on what the client supports? Not sure why the
| transport is interesting when/if you have freedom to choose.
| josephg wrote:
| Yeah there's plenty of high quality websocket client
| libraries in all languages now. Support and features are
| excellent. And they've been supported in all browsers for a
| decade or something at this point.
|
| I vomit in my mouth a bit whenever people reach for socket.io
| or junk like that. You don't want or need the complexity and
| bugs these libraries bring. They're obsolete.
| _heimdall wrote:
| Using graphql comes with its own list of challenges and issues
| though. It's a good solution for some situations, but it isn't
| so universal that you can just switch to it without a problem.
| rednafi wrote:
| Where did graphql come from? It doesn't solve any of the
| problems mentioned here.
| feverzsj wrote:
| Why not just use chunked encoding and get rid of extra requests?
| justinl33 wrote:
| yeah, authentication complexity with WebSockets is severely
| underappreciated. We ran into major RBAC headaches when clients
| needed to switch between different privilege contexts mid-
| session. Long polling with standard HTTP auth patterns eliminates
| this entire class of problems.
| watermelon0 wrote:
| Couldn't you just disconnect and reconnect websocket if
| privileges change, since the same needs to be done with the
| long polling?
| josephg wrote:
| Yeah, and you can send cookies in the websocket connection
| headers. This used to be a problem in some browsers iirc -
| they wouldn't send cookies properly over websocket connection
| requests.
|
| As a workaround in one project I wrote JavaScript code which
| manually sent cookies in the first websocket message from the
| client as soon as a connection opened. But I think this
| problem is now solved in all browsers.
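|
| A sketch of that kind of workaround (illustrative, not the
| original code; only works for cookies that aren't HttpOnly):
|
|     const ws = new WebSocket("wss://example.com/socket");
|     ws.addEventListener("open", () => {
|       // first frame carries what the server would normally
|       // read from the Cookie header during the upgrade
|       ws.send(JSON.stringify({
|         type: "auth",
|         cookie: document.cookie,
|       }));
|     });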
| baumschubser wrote:
| I like long polling, it's easy to understand from start to finish
| and from client perspective it just works like a very slow
| connection. You have to keep track of retries and client-side
| cancelled connections so that you have one, but only one (and
| the right one), request at hand to answer.
|
| One thing that seems clumsy in the code example is the loop that
| queries the data again and again. Would be nicer if the data
| update could also resolve the promise of the response directly.
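|
| Something along these lines (a sketch, Node.js; jobEvents,
| completeJob and waitForJob are made-up names):
|
|     import { EventEmitter, once } from "node:events";
|
|     const jobEvents = new EventEmitter();
|
|     // called wherever the job actually finishes
|     export function completeJob(jobId, result) {
|       jobEvents.emit(`done:${jobId}`, result);
|     }
|
|     // long-poll handler: resolves as soon as the job emits,
|     // or gives up after 60s and lets the client re-poll
|     export async function waitForJob(jobId, timeoutMs = 60_000) {
|       const ac = new AbortController();
|       const timer = setTimeout(() => ac.abort(), timeoutMs);
|       try {
|         const [result] = await once(jobEvents, `done:${jobId}`,
|           { signal: ac.signal });
|         return { status: "done", result };
|       } catch {
|         return { status: "pending" };
|       } finally {
|         clearTimeout(timer);
|       }
|     }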
| moribvndvs wrote:
| You could have your job status update push an update into an
| in-memory or distributed cache and check that in your long poll
| rather than a DB lookup, but that may require adding a bunch of
| complexity to wire the completion of the task to updating said
| cache. If your database is tuned well and you don't have any
| other restrictions (e.g. serverless where you pay by the IO),
| it may be good enough and come out in the wash.
| josephg wrote:
| Hard disagree. Long polling can have complex message ordering
| problems. You have completely different mechanisms for message
| passing from client-to-server and server-to-client. And middle
| boxes can and will stall long polled connections, stopping
| incremental message delivery. (Or you can use one http query
| per message - but that's insanely inefficient over the wire).
|
| Websockets are simply a better technology. With long polling,
| the devil is in the details and it's insanely hard to get those
| details right in every case.
| _nalply wrote:
| One of them, back in 2001, was that Netscape didn't render
| correctly while the connection was still open. Hah. I am sure
| this issue was fixed a long, long time ago, but perhaps there
| are other issues.
|
| Nowadays I prefer SSE to long polling and websockets.
|
| The idea is: the client doesn't know that the server has new
| data before it makes a request. With a very simple SSE message
| the client is told that new data is there; then it can request
| the new data separately if it wants. This said, SSE has a few
| quirks, one of them being that on HTTP/1 the connection counts
| toward the maximum limit of 6 concurrent connections per
| browser and domain, so if you have several tabs, you need a
| SharedWorker to share the connection between the tabs. But
| this quirk probably also applies to long polling and
| websockets. Another quirk: SSE can't transmit binary data and
| has some limitations in the textual data it represents. But
| for this use case this doesn't matter.
|
| I would use websockets only if you have a real bidirectional
| data flow or need to transmit complex data.
| Cheezmeister wrote:
| Streaming SIMD Extensions?
|
| Server-sent events.
| crop_rotation wrote:
| Streaming SIMD Extensions seems very unlikely to have any
| relevance in the above statement, server-sent events is
| the perfect fit.
| zazaulola wrote:
| WebSocket solves a very different problem. It may be only
| partially related to organizing two-way communication, but
| it has nothing to do with data complexity. Moreover, WS are
| not good enough at transmitting binary data.
|
| If you are using SSE and SW and you need to transfer some
| binary data from client to server or from server to client,
| the easiest solution is to use the Fetch API. `fetch()`
| handles binary data perfectly well without transformations
| or additional protocols.
|
| If the data in SW is large enough to require displaying the
| progress of the data transfer to the server, you will
| probably be more suited to `XMLHttpRequest`.
| snackbroken wrote:
| > if you have several tabs, you need a SharedWorker to
| share the connection between the tabs.
|
| You don't _have_ to use a SharedWorker, you can also do
| domain sharding. Since the concurrent connection limit is
| per domain, you can add a bunch of DNS records like
| SSE1.example.org -> 2001:db8::f00; SSE2.example.org ->
| 2001:db8::f00; SSE3.example.org -> 2001:db8::f00; and so
| on. Then it's just a matter of picking a domain at random
| on each page load. A couple hundred tabs ought to be enough
| for anyone ;)
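|
| E.g. (sketch; the endpoint is cross-origin, so it needs the
| right CORS headers):
|
|     const SHARDS = 8; // sse1..sse8.example.org, same host
|     const n = 1 + Math.floor(Math.random() * SHARDS);
|     const source =
|       new EventSource(`https://sse${n}.example.org/events`);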
| wruza wrote:
| _you can use one http query per message - but that's insanely
| inefficient over the wire_
|
| Use one http response per message queue snapshot. Send no
| more than N messages at once. Send empty status if the queue
| is empty for more than 30-60 seconds. Send cancel status to
| an awaiting connection if a new connection opens successfully
| (per channel singleton). If needed, send and accept "last"
| id/timestamp. These are my usual rules for long-polling.
|
| Prevents: connection overhead, congestion latency, connection
| stalling, unwanted multiplexing, sync loss, respectively.
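|
| A sketch of a handler following those rules (app and queue are
| made-up names):
|
|     app.get("/poll", async (req, res) => {
|       const deadline = Date.now() + 45_000; // then empty status
|       const last = req.query.last ?? null;  // "last" id/timestamp
|       while (Date.now() < deadline) {
|         // at most N messages per response
|         const batch = await queue.fetchAfter(last, 50);
|         if (batch.length > 0) {
|           return res.json({ status: "ok", messages: batch });
|         }
|         await new Promise((r) => setTimeout(r, 500));
|       }
|       res.json({ status: "empty" });
|     });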
|
| _You have completely different mechanisms for message
| passing from client-to-server and server-to-client_
|
| Is this a problem? Why should this even be symmetric?
| josephg wrote:
| You can certainly do all that. You also need to
| handle retransmission. And often you also need a way for
| the client to send back confirmations that each side
| received certain messages. So, as well as sequence numbers
| like you mentioned, you probably want acknowledgement
| numbers in messages too. (Maybe - it depends on the
| application).
|
| Implementing a stable, in-order, exactly-once message
| delivery system on top of long polling starts to look a lot
| like implementing TCP on top of UDP. It's a solvable
| problem. I've done it - 14 years ago I wrote the first
| opensource implementation of (the server side of) google's
| Browserchannel protocol, from back before websockets
| existed:
|
| https://github.com/josephg/node-browserchannel
|
| This supports long polling on browsers, all the way back to
| IE5.5. It works even when XHR isn't available! I wrote it
| in literate coffeescript, from back when that was a thing.
|
| But getting all of those little details right is really
| very difficult. It's a lot of code, and there are a lot of
| very subtle bugs lurking in this kind of code if you aren't
| careful. So you also need good, complex testing. You can
| see in that repo - I ended up with over 1000 lines of
| server code+comments (lib/server.coffee), and 1500 lines of
| testing code (test/server.coffee).
|
| And once you've got all that working, my implementation
| really wanted server affinity. Which made load balancing &
| failover across multiple application servers a huge
| headache.
|
| It sounds like your application allows you to simplify some
| details of this network protocol code. You do you. I just
| use websockets & server-sent events. Let TCP/IP handle all
| the details of in-order message delivery. It's really quite
| good.
| wruza wrote:
| This is a common library issue, it doesn't know and has
| to be defensive and featureful at the same time.
|
| Otoh, end-user projects usually know things and can make
| simplifying decisions. These two are incomparable. I
| respect the effort, but I also think that this level of
| complexity is a wrong answer to the call in general. You
| have to price-break requirements because they tend to
| oversell themselves and rarely feature-intersect as much
| as this library implies. In other words, when a client asks
| for guarantees, statuses or something, we just tell them to
| fetch from a suitable number of seconds ago and see for
| themselves. Everyone works like this; if you need something
| extra, track it yourself based on your own metrics and our
| rate limits.
| peheje wrote:
| What about HTTP/2 Multiplexing, how does it hold up against long-
| polling and websockets?
|
| I have only tried it briefly when we use gRPC:
| https://grpc.io/docs/what-is-grpc/core-concepts/#server-stre...
|
| Here it's easy to specify that an endpoint is a "stream", and then
| the code-generation tool gives you all the tools you need to just
| keep serving the client with multiple responses. It looks
| deceptively simple. We have already set up auth, logging and
| metrics for gRPC, so I hope it just works off of that, maybe with
| minor adjustments.
| But I'm guessing you don't need the gRPC layer to use HTTP/2
| Multiplexing?
| toast0 wrote:
| At least in a browser context, HTTP/2 doesn't address server to
| client unsolicited messages. So you'd still need a polling
| request open from the client.
|
| HTTP/2 does specify a server push mechanism (PUSH_PROMISE), but
| afaik, browsers don't accept them and even if they did, (again
| afaik) there's no mechanism for a page to listen for them.
|
| But if you control the client and the server, you could use it.
| bigbones wrote:
| I don't know how meaningful it is any more, but with long polling
| with a short timeout and a gracefully ended request (i.e. chunked
| encoding with an eof chunk sent rather than disconnection), the
| browser would always end up with one spare idle connection to the
| server, making subsequent HTTP requests for other parts of the UI
| far more likely to be snappy, even if the app has been left
| otherwise idle for half the day.
|
| I guess at least this trick is still meaningful where HTTP/2 or
| QUIC aren't in use
| ipnon wrote:
| Articles like this make me happy to use Phoenix and LiveView
| every day. My app uses WebSockets and I don't think about them at
| all.
| leansensei wrote:
| Same here. It truly is a godsend.
| zacksiri wrote:
| I was thinking this exact thing as I was reading the article.
| tzumby wrote:
| I came here to say exactly this! Elixir and OTP (and by
| extension LiveView) are such a good match for the problem
| described in the post.
| j45 wrote:
| I was kind of wondering how this hadn't already been solved,
| as opposed to a solution simply not being readily on one's
| path.
| cultofmetatron wrote:
| hah seriously. my app uses web sockets extensively but since we
| are also using Phoenix, it's never been a source of conflict in
| development. it really was just drop it in and scale to
| thousands of users.
| arrty88 wrote:
| Why couldn't nodejs with uWS library or golang + gorilla
| handle 10s of thousands of connections?
| apitman wrote:
| I think GP's point is that they feel Phoenix is simpler to
| use than alternatives, not necessarily that it scales
| better.
| dugmartin wrote:
| The icing on the cake is that you can also enable Phoenix
| channels to fallback to longpolling in your endpoint config.
| The generator sets it to false by default.
| pipes wrote:
| Is this similar to Microsoft's blazer?
| pipes wrote:
| Odd, I wonder why I got down voted for it, it was a genuine
| question
| NicoJuicy wrote:
| Blazor? Raw guess
| jtchang wrote:
| This is using elixir right?
| diggan wrote:
| Articles like this make me happy to use Microsoft FrontPage and
| cPanel, I don't think about HTTP or WebSockets at all.
| wutwutwat wrote:
| every websocket setup is painless when running on a single
| server or handling very few connections...
|
| I was on the platform/devops/man w/ many hats team for an
| elixir shop running Phoenix in k8s. WS get complicated even in
| elixir when you have 2+ app instances behind a round robin load
| balancer. You now need to share broadcasts between app servers.
| Here's a situation you have to solve for w/ any app at scale,
| regardless of language:
|
| app server #1 needs to send a publish/broadcast message out to
| a user, but the user who needs that message isn't connected to
| app server #1 that generated the message, that user is
| currently connected to app server #2.
|
| How do you get a message from one app server to the other one
| which has the user's ws connection?
|
| A bad option is sticky connections. User #1 always connects to
| server #1. Server #1 only does work for users connected to it
| directly. Why is this bad? Hot spots. Overloaded servers.
| Underutilized servers. Scaling complications. Forecasting
| problems. Goes against the whole concept of horizontal scaling
| and load balancing. It doesn't handle side-effect messages, i.e.
| user #1000 takes some action which needs to broadcast a message
| to user #1, who is connected to who knows where.
|
| The better option: You need to broadcast to a shared broker.
| Something all app servers share a connection to so they can
| themselves subscribe to messages they should handle, and then
| pass it to the user's ws connection. This is a message broker.
| postgres can be that broker, just look at oban for real world
| proof. Throw in pg's listen/notify and you're off to the races.
| But that's heavy from a resources-per-db-connection perspective
| so let's avoid the acid db for this then. Ok. Redis is a good
| option, or since this is elixir land, use the built in
| distributed erlang stuff. But, we're not running raw elixir
| releases on linux, we're running inside of containers, on top
| of k8s. The whole distributed erlang concept goes to shit once
| the erlang procs are isolated from each other and not in their
| perfect Goldilocks getting started readme world. So ok, in
| containers in k8s, so each app server needs to know about all
| the other app servers running, so how do you do that? Hmm,
| service discovery! Ok, well, k8s has service discovery already,
| so how do I tell the erlang vm about the other nodes that I got
| from k8s etcd? Ah, a hex package, cool. libcluster to the
| rescue: https://github.com/bitwalker/libcluster
|
| So we'll now tie the boot process of our entire app to fetching
| the other app server pod ips from k8s service discovery, then
| get a ring of distributed erlang nodes talking to each other,
| sharing message passing between them, this way no matter which
| server the lb routes the user to, a broadcast from any one of
| them will be seen by all of them, and the one who holds the ws
| connection will then forward it down the ws to the user.
|
| So now there's a non trivial amount of complexity and risk that
| was added here. More to reason about when debugging. More to
| consider when working on features. More to understand when
| scaling, deploying, etc. More things to potentially take the
| service down or cause it not to boot. More things to have race
| conditions, etc.
|
| Nothing is ever so easy you don't have to think about it.
| sgarland wrote:
| The full schema isn't listed, but the indices don't make sense to
| me.
|
| (id, cluster_id) sounds like it could / should be the PK
|
| If the jobs are cleared once they've succeeded, and presumably
| retried if they've failed or stalled, then the table should be
| quite small; so small, that:
|
| a. The query planner is unlikely to use the partial index on
| (status)
|
| b. The bloat from the rapidity of DELETEs likely overshadows the
| live tuple size.
| DougN7 wrote:
| I implemented a long polling solution in desktop software over 20
| years ago and it's still working great. It can even be used as a
| tunnel to stream RDP sessions, through which YouTube can play
| without a hiccup. Big fan of long polling, though I admit I
| didn't get a chance to try web sockets back then.
| jclarkcom wrote:
| I did the same, were you at VMware by any chance? At the time
| it was the only way to get compatibility with older browsers.
| mojuba wrote:
| Can someone explain why TTL = 60s is a good choice? Why not more,
| or less?
| notatoad wrote:
| i can't speak for why the author chose it, but if you're
| operating behind AWS cloudfront then http requests have a
| maximum timeout of 60s - if you don't close the request within
| 60s, cloudfront will close it for you.
|
| i suspect other firewalls, cdns, or reverse proxy products will
| all do something similar. for me, this is one of the biggest
| benefits of websockets over long-polling: it's a standard way
| to communicate to proxies and firewalls "this connection is
| supposed to stay open, don't close it on me"
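|
| in practice that means picking your long-poll timeout
| comfortably under the most impatient hop, e.g. (sketch):
|
|     // server: answer "no updates" well before the 60s edge
|     // timeout so the proxy never kills the request
|     const SERVER_POLL_TIMEOUT_MS = 50_000;
|
|     // client: a little headroom beyond the server, then retry
|     const controller = new AbortController();
|     const t = setTimeout(() => controller.abort(), 55_000);
|     const res = await fetch("/poll", { signal: controller.signal });
|     clearTimeout(t);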
| k__ wrote:
| Half-OT:
|
| What's the most resource efficient way to push data to clients
| over HTTP?
|
| I can send data to a server via HTTP request, I just need a way
| to notify a client about a change and would like to avoid polling
| for it.
|
| I heard talk about SSE, WebSockets, and now long-polling.
|
| Is there something else?
|
| What requires the least resources on the server?
| mojuba wrote:
| I don't think any of the methods give any significant advantage
| since in the end you need to maintain a connection per each
| client. The difference between the methods boils down to
| complexity of implementation and reliability.
|
| If you want to reduce server load then you'd have to sacrifice
| responsiveness, e.g. you perform short polls at certain
| intervals, say 10s.
| k__ wrote:
| Okay, thanks.
|
| What's the least complex to implement then?
| mojuba wrote:
| For the browser and if you need only server-to-client
| sends, I assume SSE would be the best option.
|
| For other clients, such as mobile apps, I think long poll
| would be the simplest.
| amelius wrote:
| > Corporate firewalls blocking WebSocket connections was one of
| our other worries. Some of our users are behind firewalls, and we
| don't need the IT headache of getting them to open up WebSockets.
|
| Don't websockets look like ordinary https connections?
| doublerabbit wrote:
| It does. However DPI firewalls look at and block the upgrade
| handshake:
|
|     Connection: Upgrade
|     Upgrade: websocket
| toast0 wrote:
| Some corporate firewalls MITM all https connections. Websocket
| does not look normal once you've terminated TLS.
| amelius wrote:
| Can websites detect this?
| toast0 wrote:
| AFAIK, only by symptoms. If https fetches work and
| websockets don't, that's a sign. HSTS and assorted
| reporting can help a bit in aggregate, but not if the
| corporate MITM CA has been inserted into the browser's
| trusted CA list. I don't think there's an API to get
| certificate details from the browser side to compare.
|
| A proxy may have a different TLS handshake than a real
| browser would, depending on how good the MITM is, but the
| better they are, the more likely it is that websockets
| work.
| yuliyp wrote:
| I think this article is tying a lot of unrelated decisions to
| "Websocket" vs "Long-polling" when they're actually independent.
| A long-polling server could handle a websocket client with just a
| bit of extra work to handle keep-alive.
|
| For the other direction, to support long-polling clients if your
| existing architecture is websockets which get data pushed to them
| by other parts of the system, just have two layers of servers:
| one which maintains the "state" of the connection, and then the
| HTTP server which receives the long polling request can connect
| to the server that has the connection state and wait for data
| that way.
| harrall wrote:
| It sounded like the author(s) just had existing request-
| oriented code and didn't want to rewrite it to be connection-
| oriented.
|
| Personally I would have enjoyed solving that problem instead of
| hacking around it, but that's me.
| lunarcave wrote:
| Author here.
|
| Having done this, I don't think I'd reduce it to "just a little
| bit of work" to make it hum in production.
|
| Everything in between your UI components and the database layer
| needs to be reworked to work in the connection-oriented
| (Websockets) model of the world vs the request-oriented one.
| vitus wrote:
| I would appreciate it if the article spent more time actually
| discussing the benefits of websockets (and/or more modern
| approaches to pushing data from server -> browser) and why the
| team decided those benefits were not worth the purported
| downsides. I could see the same simplicity argument being applied
| to using unencrypted HTTP/1.1 instead of HTTP/2, or TCP Reno
| instead of CUBIC.
|
| The section at the end talking about "A Case for Websockets"
| really only rehashes the arguments made in "Hidden Benefits of
| Long-Polling" stating that you need to reimplement these various
| mechanisms (or just use a library for it).
|
| My experience in this space is from 2011, when websockets were
| just coming onto the scene. Tooling / libraries were much more
| nascent, websockets had much lower penetration (we still had to
| support IE6 in those days!), and the API was far less stable
| prior to IETF standardization. But we still wanted to use them
| when possible, since they provided much better user experience
| (lower latency, etc) and lower server load.
| tguvot wrote:
| Another reason: there is a patent troll suing companies over
| usage of websockets.
| vouwfietsman wrote:
| The points mentioned against websockets are mostly fud, I've used
| websockets in production for a very heavy global data streaming
| application, and I would respond the following to the "upsides"
| of not using websockets:
|
| > Observability Remains Unchanged
|
| Actually it doesn't; many standard metrics of interest will
| break because a long-polling request is not a standard request
| either.
|
| > Authentication Simplicity
|
| Sure, auth is different than with http, but not more difficult.
| You can easily pass a token.
|
| > Infrastructure Compatibility
|
| I'm sure you can find firewalls out there where websockets are
| blocked, however for my use case I have never seen this reported.
| I think this is outdated, for sure you don't need "special proxy
| configurations or complex infrastructure setups".
|
| > Operational Simplicity
|
| Restarts will drop any persistent connection, and state can live
| in either WS or LP, or in neither; it doesn't matter which you
| use.
|
| > Client implementation
|
| It mentions "no special WebSocket libraries needed" and also "It
| works with any HTTP client". Guess what, websockets will work
| with any websocket client! Who knew!
|
| Finally, in the conclusion:
|
| > For us, staying close to the metal with a simple HTTP long
| polling implementation was the right choice
|
| Calling simple HTTP long polling "close to the metal" in
| comparison to websockets is weird. I wouldn't be surprised if
| websockets scale much better and give much more control depending
| on the type of data, but that's beside the point. If you want to
| use long polling because you prefer it, go ahead. It's a great way
| to stick to request/response style semantics that web devs are
| familiar with. It's not necessary to regurgitate a bunch of random
| hearsay arguments that may influence people in the wrong way.
|
| Try to actually leave the reader with some notion of when to use
| long polling vs when to use websockets, not a post-hoc
| justification of your decision based on generalized arguments
| that do not apply.
| amatuer_sodapop wrote:
| > > Observability Remains Unchanged
|
| > Actually it doesn't, many standard interesting metrics will
| break because long-polling is not a standard request either.
|
| As a person who works in a large company handling millions of
| websockets, I fundamentally disagree with discounting the
| observability challenges. WebSockets completely transform your
| observability stack - they require different logging patterns,
| new debugging approaches, different connection tracking, and
| change how you monitor system health at scale. Observability is
| far more than metrics, and handwaving away these architectural
| differences doesn't make the implementation easier.
| Cort3z wrote:
| I think they are mixing some problems here. They could probably
| have used their original setup with Postgres NOTIFY+triggers
| instead of polling, and only have one "pickup poller" to catch
| any missed events/jobs. In my opinion the transport medium should
| not be linked to how the data is managed internally, but I know
| from experience that this separation is often hard to achieve in
| practice.
| emilio1337 wrote:
| The article does mix together a lot of concepts. I would prefer
| one process polling new jobs/state and one process handling HTTP
| connections/websockets. That way the database isn't flooded, and
| the client side scales on its own. The database process pushes
| everything downstream via some queue, while the other
| process/server consumes it and sends updates to the respective
| clients.
| imglorp wrote:
| Since the article mentioned Postgres by name, isn't this a case
| for using its asynchronous notification features? Servers can
| LISTEN to a channel and PG can TRIGGER and NOTIFY them when the
| data changes.
|
| No polling needed, regardless of the frontend channel.
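|
| A sketch with node-postgres (trigger, channel and table names
| are illustrative):
|
|     // SQL, run once:
|     //   CREATE FUNCTION notify_job_change() RETURNS trigger AS $$
|     //   BEGIN
|     //     PERFORM pg_notify('job_updates', row_to_json(NEW)::text);
|     //     RETURN NEW;
|     //   END $$ LANGUAGE plpgsql;
|     //   CREATE TRIGGER job_change AFTER UPDATE ON jobs
|     //     FOR EACH ROW EXECUTE FUNCTION notify_job_change();
|
|     import pg from "pg";
|
|     const client = new pg.Client(); // connection info from env
|     await client.connect();
|     await client.query("LISTEN job_updates");
|     client.on("notification", (msg) => {
|       const job = JSON.parse(msg.payload);
|       // hand the update to whichever pending connection wants it
|     });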
| lunarcave wrote:
| Yes, but the problems of detecting that changeset and
| delivering it to the right connection remains to be solved in
| the app layer.
| cluckindan wrote:
| It would be easier to run Hypermode's Dgraph as the database
| and use GraphQL subscriptions from the frontend. But nobody
| ever got fired for choosing postgres.
| j45 wrote:
| I have relatively recently taken steps towards Postgres because
| of its ability to be at the center of so much until a project
| outgrows it.
|
| In terms of not getting fired - Postgres is a lot more
| innovative than most databases, despite the insinuation of the
| old IBM line.
|
| By innovative I mean uniquely putting in performance related
| items for the last 10-20 years.
| wereHamster wrote:
| Unrelated to the topic in the article...
|
|     await new Promise(resolve => setTimeout(resolve, 500));
|
| In a Node.js context, it's easier to:
|
|     import { setTimeout } from "node:timers/promises";
|     await setTimeout(500);
| treve wrote:
| Is that easier? The first snippet is shorter and works on any
| runtime.
| joshmanders wrote:
| In the context of Node.js, where op said, yes it is easier.
| But it's a new thing and most people don't realize timers in
| Node are awaitable yet, so the other way is less about "works
| everywhere" and more "this is just what I know"
| wereHamster wrote:
| I guess most Node.js developers also don't realize that
| there's "node:fs/promises" so you don't have to use
| callbacks or manually wrap functions from "node:fs" with
| util.promisify(). Doesn't mean we need to stick with old
| patterns forever.
|
| When I said 'in the context of Node.js' I meant if you are
| in a JS module where you already import other node:
| modules, ie. when it's clear that code runs in a Node.js
| runtime and not in a browser. Of course when you are
| writing code that's supposed to be portable, don't use it.
| Or don't use setTimeout at all because it's not guaranteed
| to be available in all runtimes - it's not part of the
| ECMA-262 language specification after all.
| hombre_fatal wrote:
| I haven't used that once since I found out that it exists.
|
| I just don't see the point. It doesn't work in the browser and
| it shadows global.setTimeout which is confusing. Meanwhile the
| idiom works everywhere.
| joshmanders wrote:
| You can alias it if you're worried about shadowing.
|
|     import { setTimeout as loiter } from "node:timers/promises";
|     await loiter(500);
| hombre_fatal wrote:
| Sure, and that competes with a universal idiom.
|
| To me it's kinda like adding a shallowClone(old) helper
| instead of writing const obj = { ...old }.
|
| But no point in arguing about it forever.
| bob1029 wrote:
| I think we could have the best of both worlds. E.g.:
|
| https://socket.io/docs/v4/how-it-works/#upgrade-mechanism
| j45 wrote:
| I wonder if people only look for solutions in their language
| when a tech like websockets is language independent.
| rednafi wrote:
| Neither Server-Sent Events nor WebSockets have replaced all use
| cases of long polling reliably. The connection limit of SSE comes
| up a lot, even if you're using HTTP/2. WebSockets, on the other
| hand, are unreliable as hell in most environments. Also, WS is
| hard to debug, and many of our prod issues with WS couldn't even
| be reproduced locally.
|
| Detecting changes in the backend and propagating them to the
| right client is still an unsolved problem. Until then, long
| polling is a surprisingly simple and robust solution that works.
| pas wrote:
| Robust WS solutions need a fallback anyway, and unless you are
| doing something like Discord long polling is a reasonable
| option.
| Animats wrote:
| Long polling has some problems of its own.
|
| Second Life has an HTTPS long polling channel between client and
| server. It's used for some data that's too bulky for the UDP
| connection, not too time sensitive, or needs encryption. This has
| caused much grief.
|
| On the client side, the poller uses libcurl. Libcurl has
| timeouts. If the server has nothing to send for a while, libcurl
| times out. The client then makes the request again. This results
| in a race condition if the server wants to send something between
| timeout and next request. Messages get lost.
|
| On top of that, the real server is front-ended by an Apache
| server. This just passes through relevant requests, blocking the
| endless flood of junk HTTP requests from scrapers, attacks, and
| search engines. Apache has a timeout, and may close a connection
| that's in a long poll and not doing anything.
|
| Additional trouble can come from middle boxes and proxy servers
| that don't like long polling.
|
| There are a lot of things out there that just don't like holding
| an HTTP connection open. Years ago, a connection idle for a
| minute was fine. Today, hold a connection open for ten seconds
| without sending any data and something is likely to disconnect
| it.
|
| The end result is an unreliable message channel. It has to have
| sequence numbers to detect duplicates, and can lose messages. For
| a long time, nobody had discovered that, and there were
| intermittent failures that were not understood.
|
| In the original article, the chart section labelled "loop"
| doesn't mention timeout handling. That's not good. If you do long
| polling, you probably need to send something every few seconds to
| keep the connection alive. Not clear what a safe number is.
| rednafi wrote:
| Yeah, some servers close connections when there's no data
| transfer. When the backend holds the connection while polling
| the database until a timeout occurs or the database returns
| data, it needs to send something back to the client to keep the
| connection alive. I wonder what could be sent in this case and
| whether it would require special client-side logic.
| wutwutwat wrote:
| Every problem you just listed is 100% in your control and able
| to be configured, so the issue isn't long polling, it's your
| setup/configs. If your client (libcurl) times out a request,
| set the timeout higher. If apache is your web server and it
| disconnects idle clients, increase the timeout, tell it not to
| buffer the request and to pass it straight back to the app
| server. If there's a cloud lb somewhere (sounds like it because
| alb defaults to a 10s idle timeout), increase the timeouts...
|
| Every timeout in every hop of the chain is within your control
| to configure. Setup a subdomain and send long polling requests
| through that so the timeouts can be set higher and not impact
| regular http requests or open yourself up to slow client ddos.
|
| Why would you try to do long polling and not configure your
| request chain to be able to handle them without killing idle
| connections? The problems you have only exist because you're
| allowing them to exist. Set your idle timeouts higher. Send
| keepalives more often. Tell your web servers to not do request
| buffering, etc.
|
| All of that is extremely easy to test and verify.
| Does the request live longer than your polling interval? Yes?
| Great you're done! No? Tune some more timeouts and log the
| request chain everywhere you can until you know where the
| problems lie. Knock them out one by one going back to the
| origin until you get what you want.
|
| Long polling is easy to get right from an operations
| perspective.
| moritonal wrote:
| Whilst it's possible you may be correct, I do have to point
| out you are, I believe, lecturing John Nagle, known for
| Nagle's algorithm, used in most TCP stacks in the world.
| motorest wrote:
| > Whilst it's possible you may be correct, I do have to
| point out you are, I believe, lecturing John Nagle, known
| for Nagle's algorithm, used in most TCP stacks in the
| world.
|
| Thank you for pointing that out. This thread alone is bound
| to become a meme.
| wutwutwat wrote:
| Oh fuck a famous person!? I told a famous person to
| adjust their server timeouts?!!
|
| This changes everything!
|
| I better go kill myself from this embarrassment so my
| family doesn't have to live with my shame! Hopefully with
| enough generations of time passing my family will rise
| above the stain I've created for them!
| motorest wrote:
| > I better go kill myself from this embarrassment so my
| family doesn't have to live with my shame!
|
| There's no need to go to extremes, no matter how
| embarrassing and notably laughable your comment was. I'd
| say enjoy your fame.
| wutwutwat wrote:
| All the more reason that the poster should know that the
| things they are complaining about are self inflicted
| then...?
|
| The comment makes long polling out to be the issue, but the
| issue is the timeouts of the user's software, which are
| preventing the long polling from being effective.
|
| Names mean very little to me. Am I supposed to feel
| embarrassed or stupid now that you've pointed out that I
| gave advice to a tech OG that knows his shit? I don't care
| who the person is, and in this case, the way I read their
| message, they didn't know their shit.
|
| That's the problem with attaching a name to things, because
| if that name is known for being some God like person, they
| then become incapable of being a human being capable of
| saying something less informed than anyone else in the
| room, even if they might say something that others could
| give them some advice or insight on. Nope. This person's
| name is known, so there's no way this other no-name person
| could possibly be able to offer advice to them. That's not
| how it works. The known name is the smartest one in the
| room and that's just a fact of the universe, duh!
| exe34 wrote:
| Oh my gosh :-D
| rezmason wrote:
| I bet there's an online college credit transfer program
| that'll accept this as a doctoral defense. Depending on how
| Nagle's finagled.
| tbillington wrote:
| > Every timeout in every hop of the chain is within your
| control to configure.
|
| lol
| wutwutwat wrote:
| I wasn't talking about network switch hops and if you're
| trying to do long polling and don't have control over the web
| servers going back to your systems then wtf are you trying
| to do long polling for anyway.
|
| I don't try to run red lights because I don't have control
| over the lights on the road.
| interroboink wrote:
| I'm new to websockets, please forgive my ignorance -- how is
| sending some "heartbeat" data over long polling different from
| the ping/pong mechanism in websockets?
|
| I mean, in both cases, it's a TCP connection over (eg) port 443
| that's being kept open, right? Intermediaries can't snoop the
| data if it's SSL, so all they know is "has some data been sent
| recently?" Why would they kill long-polling sessions after
| 10sec and not web socket ones?
| mike-cardwell wrote:
| That race condition has nothing to do with long polling, it's
| just poor design. The sender should stick the message in a
| queue and the client reads from that queue. Perhaps with the
| client specifying the last id it saw.
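|
| i.e. something like (a sketch; app, store and handle() are
| hypothetical names):
|
|     // server: every message gets a monotonically increasing id
|     app.get("/poll", async (req, res) => {
|       const lastId = Number(req.query.lastId ?? 0);
|       // anything published between the client's polls is still
|       // in the queue, so nothing can fall into the gap
|       const missed = await store.messagesAfter(lastId);
|       if (missed.length > 0) return res.json(missed);
|       res.json(await store.waitForNext(lastId, 30_000));
|     });
|
|     // client (retries/backoff omitted)
|     let lastId = 0;
|     for (;;) {
|       const r = await fetch(`/poll?lastId=${lastId}`);
|       for (const m of await r.json()) {
|         lastId = m.id;
|         handle(m);
|       }
|     }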
| sneak wrote:
| Given long polling, I have never ever understood why websockets
| became a thing. I've never implemented them and never will, it's
| a protocol extension where none is necessary.
| ekkeke wrote:
| Websockets can operate outside the request/response model used
| in this long polling example, and allow you to stream data
| continuously. They're also a lot more efficient in terms of
| framing and connections if there are a lot of individual pieces
| of data to push as you don't need to spin a up a connection +
| request for each bit.
| LeicaLatte wrote:
| Long polling is my choice for simple, reliable and plug and play
| like interfaces. HTTP requests tend to be standard and simplify
| authentication as well. Systems with frequent but not constant
| updates are ideal. Text yes. Voice maybe not.
|
| Personal Case Study: I built mobile apps which used Flowise
| assistants for RAG and found websockets completely out of line
| with the rest of my system and interactions. Suddenly I was
| fitting a round peg in a square hole. I switched to OpenAI
| assistants and their polling system felt completely "natural" to
| integrate.
___________________________________________________________________
(page generated 2025-01-05 23:00 UTC)