[HN Gopher] The Surprising gRPC Client Bottleneck in Low-Latency...
___________________________________________________________________
The Surprising gRPC Client Bottleneck in Low-Latency Networks
Author : eivanov89
Score : 66 points
Date : 2025-07-23 13:23 UTC (9 hours ago)
(HTM) web link (blog.ydb.tech)
(TXT) w3m dump (blog.ydb.tech)
| yuliyp wrote:
| If you have a single TCP connection, all the data flows through
| that connection, ultimately serializing at least some of the
| processing. Given that the workers are just responding with OK,
| no matter how many CPU cores you throw at them, you're still
| bound by the throughput of the IO thread (or rather, by the
| slower of the client and server IO threads). If you want more
| than one IO thread to share the load, you need more than one TCP
| connection.
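|
| A minimal grpc-go sketch of that idea (the pool type and
| round-robin picker are illustrative, not from the article): open
| several independent client connections and spread requests across
| them so no single TCP connection carries all the traffic.
|
|     package pool
|
|     import (
|         "sync/atomic"
|
|         "google.golang.org/grpc"
|         "google.golang.org/grpc/credentials/insecure"
|     )
|
|     // connPool holds several independent ClientConns so requests
|     // are not all funneled through one TCP connection / IO thread.
|     type connPool struct {
|         conns []*grpc.ClientConn
|         next  uint64
|     }
|
|     func newConnPool(target string, n int) (*connPool, error) {
|         p := &connPool{}
|         for i := 0; i < n; i++ {
|             // Each ClientConn gets its own HTTP/2 connection and
|             // therefore its own TCP stream.
|             cc, err := grpc.NewClient(target,
|                 grpc.WithTransportCredentials(insecure.NewCredentials()))
|             if err != nil {
|                 return nil, err
|             }
|             p.conns = append(p.conns, cc)
|         }
|         return p, nil
|     }
|
|     // pick round-robins over the pool.
|     func (p *connPool) pick() *grpc.ClientConn {
|         i := atomic.AddUint64(&p.next, 1)
|         return p.conns[i%uint64(len(p.conns))]
|     }
|
| (In C++ or Java gRPC, channels created with identical arguments
| may share a subchannel, so forcing separate connections usually
| means giving each channel a distinct argument or a local
| subchannel pool.)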
| xtoilette wrote:
| classic case of head of line blocking!
| yuliyp wrote:
| I don't think this is head-of-line blocking. That is, it's not
| like a single slow request causes starvation of other requests.
| The IO thread for the connection is grabbing and dispatching
| data to workers as fast as it can. All the requests are
| uniform, so it's not like one request would be bigger/harder to
| handle for that thread.
| otterley wrote:
| > First, we checked the number of TCP connections using lsof
| -i TCP:2137 and found that only a single TCP connection was
| used regardless of in-flight count.
|
| It's head-of-line blocking. When requests are serialized, the
| queue will grow as long as the time to service a request is
| longer than the interval between arriving requests. Queue growth
| is especially wasteful when sufficient capacity exists to service
| requests in parallel but goes unused.
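|
| To put illustrative numbers on that (made up, not from the
| article): if requests arrive every 100 µs but the serialized path
| can only hand one off every 150 µs, the arrival rate is 10,000/s
| against a service rate of ~6,667/s, so the queue grows by roughly
| 3,300 requests per second even while worker cores sit idle.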
| lacop wrote:
| Somewhat related, I'm running into a gRPC latency issue in
| https://github.com/grpc/grpc-go/issues/8436
|
| If the request payload exceeds a certain size, the response
| latency goes from network RTT to double or triple that.
|
| Definitely something wrong with either TCP or HTTP/2 windowing, as
| it doesn't send the full request without first getting an ACK from
| the server. But neither the gRPC windowing config options nor the
| Linux tcp_wmem/rmem settings help. Sending a one-byte request
| every few hundred milliseconds fixes it by keeping the gRPC
| channel / TCP connection active. Nagle / slow start is disabled.
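|
| For reference, the grpc-go knobs in question look roughly like
| this (values are illustrative, not recommendations, and the issue
| above reports that the window options did not help):
|
|     package dialopts
|
|     import (
|         "time"
|
|         "google.golang.org/grpc"
|         "google.golang.org/grpc/keepalive"
|     )
|
|     // dialOptions returns illustrative window and keepalive
|     // settings for a grpc-go client.
|     func dialOptions() []grpc.DialOption {
|         return []grpc.DialOption{
|             // Per-stream and per-connection HTTP/2 flow-control
|             // windows.
|             grpc.WithInitialWindowSize(1 << 22),     // 4 MiB
|             grpc.WithInitialConnWindowSize(1 << 23), // 8 MiB
|             // Built-in HTTP/2 pings keep the connection active
|             // during idle periods; grpc-go clamps Time to a 10s
|             // minimum, so this cannot replace the sub-second
|             // dummy requests described above.
|             grpc.WithKeepaliveParams(keepalive.ClientParameters{
|                 Time:                10 * time.Second,
|                 Timeout:             time.Second,
|                 PermitWithoutStream: true,
|             }),
|         }
|     }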
| littlecranky67 wrote:
| Sounds like classic TCP congestion window scaling delay; your
| payload probably exceeds 10x initcwnd.
| lacop wrote:
| Doesn't initcwnd only apply as the initial value? I don't
| care that the first request on the gRPC channel is slow, but
| subsequent requests on the same channel reuse the TCP
| connection and should have a larger window size. This works as
| long as the channel is actively being used, but after a short
| period of inactivity (a few hundred ms, unsure exactly) something
| appears to revert back.
| littlecranky67 wrote:
| Yes, in the case of hot TCP connections congestion control
| should not be the issue.
| lacop wrote:
| Yeah, that was my understanding too, hence I filed the bug
| (actually a duplicate of an older bug that was closed because
| the poster didn't provide a reproduction).
|
| Still not sure if this is a Linux network configuration
| issue or a gRPC issue, but something is definitely broken if
| I can't send a ~1MB request and get a response within
| roughly network RTT + server processing time.
| eivanov89 wrote:
| That's indeed interesting, thank you for sharing.
| ltbarcly3 wrote:
| gRPC is a very badly implemented system. I have gotten 25%-30%+
| improvements in throughput just by monkeypatching the Google
| Cloud client libraries to force JSON API endpoint usage.
|
| At least try something else besides gRPC when building systems so
| you have a baseline performance understanding. gRPC OFTEN
| introduces performance breakdowns that go unnoticed.
| stock_toaster wrote:
| Have you done any comparisons with connect-rpc?
___________________________________________________________________
(page generated 2025-07-23 23:01 UTC)