[HN Gopher] Using gRPC for (local) inter-process communication -...
___________________________________________________________________
Using gRPC for (local) inter-process communication - F. Werner's
Research Page
Author : zardinality
Score : 74 points
Date : 2024-11-20 13:42 UTC (9 hours ago)
(HTM) web link (www.mpi-hd.mpg.de)
(TXT) w3m dump (www.mpi-hd.mpg.de)
| jeffbee wrote:
| Interesting that it is taken on faith that unix sockets are
| faster than inet sockets.
| dangoodmanUT wrote:
| Are there resources suggesting otherwise?
| aoeusnth1 wrote:
| Tell me more, I know nothing about IPC
| eqvinox wrote:
| That's because it's logical that implementing network-capable
| segmentation and flow control is more costly than just moving
| data with internal, native structures. And looking up random
| benchmarks yields anything from equal performance to 10x faster
| for Unix domain.
| bluGill wrote:
| It wouldn't surprise me if inet sockets were more optimized
| though and so unix sockets ended up slower anyway just
| because nobody has bothered to make them good (which is
| probably why some of your benchmarks show equal performance).
| Benchmarks are important.
| eqvinox wrote:
| I agree, but practically speaking they're used en masse all
| across the field and people did bother to make them good
| [enough]. I suspect the benchmarks where they come up equal
| are cases where things are limited by other factors (e.g.
| syscall overhead), though I don't want to make unfounded
| accusations :)
| sgtnoodle wrote:
| I've spent several years optimizing a specialized IPC
| mechanism for a work project. I've spent time reviewing the
| Linux Kernel's unix socket source code to understand
| obscure edge cases. There isn't really much to optimize -
| it's just copying bytes between buffers. Most of the
| complexity of the code has to do with permissions and
| implementing the ability to send file descriptors. All my
| benchmarks have unambiguously shown unix sockets to be
| more performant than loopback TCP for my particular use
| case.
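|
| For a rough feel (not the parent's benchmark; the socket path,
| message size and iteration count are made up), a minimal Python
| ping/pong sketch comparing an AF_UNIX socket with loopback TCP:
|
|     # toy round-trip benchmark: AF_UNIX vs loopback TCP
|     import contextlib, os, socket, threading, time
|
|     def echo_server(listener, size):
|         conn, _ = listener.accept()
|         with conn:
|             while data := conn.recv(size):
|                 conn.sendall(data)
|
|     def bench(family, addr, n=20_000, size=256):
|         listener = socket.socket(family, socket.SOCK_STREAM)
|         listener.bind(addr)
|         listener.listen(1)
|         threading.Thread(target=echo_server,
|                          args=(listener, size), daemon=True).start()
|         c = socket.socket(family, socket.SOCK_STREAM)
|         c.connect(listener.getsockname())
|         payload = b"x" * size
|         start = time.perf_counter()
|         for _ in range(n):
|             c.sendall(payload)
|             buf = b""
|             while len(buf) < size:          # read back the full echo
|                 buf += c.recv(size - len(buf))
|         elapsed = time.perf_counter() - start
|         c.close()
|         return n / elapsed                  # round trips per second
|
|     with contextlib.suppress(FileNotFoundError):
|         os.unlink("/tmp/bench.sock")
|     print("unix:", bench(socket.AF_UNIX, "/tmp/bench.sock"))
|     print("tcp :", bench(socket.AF_INET, ("127.0.0.1", 0)))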
| pjmlp wrote:
| As often in computing, profiling is a foreign word.
| yetanotherdood wrote:
| Unix Domain Sockets are the standard mechanism for app->sidecar
| communication at Google (ex: Talking to the TI envelope for
| logging etc.)
| jeffbee wrote:
| Search around on Google Docs for my 2018 treatise/rant about
| how the TI Envelope was the least-efficient program anyone
| had ever deployed at Google.
| eqvinox wrote:
| Ok, now it sounds like you're blaming unix sockets for
| someone's shitty code...
|
| No idea what "TI Envelope" is, and a Google search doesn't
| come up with usable results (oh the irony...) - if it's a
| logging/metric thing, those are hard to get to perform well
| regardless of socket type. We ended up using batching with
| mmap'd buffers for crash analysis. (I.e. the mmap part only
| comes in if the process terminates abnormally, so we can
| recover batched unwritten bits.)
| jeffbee wrote:
| > Ok, now it sounds like you're blaming unix sockets for
| someone's shitty code...
|
| No, I am just saying that the unix socket is not Brawndo
| (or maybe it is?), it does not necessarily have what IPCs
| crave. Sprinkling it into your architecture may or may
| not be relevant to the efficiency and performance of the
| result.
| eqvinox wrote:
| Sorry, what's brawndo? (Searching only gives me movie
| results?)
|
| We started out discussing AF_UNIX vs. AF_INET6. If you
| can conceptually use something faster than sockets that's
| great, but if you're down to a socket, unix domain will
| generally beat inet domain...
| exe34 wrote:
| it's what plants crave! it's got electrolytes.
| sgtnoodle wrote:
| You can do some pretty crazy stuff with pipes, if you
| want to do better than unix sockets.
| yetanotherdood wrote:
| I'm a xoogler so I don't have access. Do you have a TL;DR
| that you can share here (for non-Googlers)?
| ithkuil wrote:
| servo's ipc-channel doesn't use Unix domain sockets to move
| data. It uses them to share a memfd file descriptor, effectively
| creating a memory buffer shared between the two processes.
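|
| Not ipc-channel's actual code (that's Rust), but the same trick
| as a minimal Python sketch for Linux: create a memfd, pass the
| file descriptor over an AF_UNIX socket, and map it on the other
| side:
|
|     # share a memfd between two processes via SCM_RIGHTS (Linux)
|     import mmap, os, socket
|
|     parent, child = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
|
|     if os.fork() == 0:                       # receiver process
|         _, fds, _, _ = socket.recv_fds(child, 1024, 1)
|         view = mmap.mmap(fds[0], 4096)       # map the shared buffer
|         print("receiver sees:", bytes(view[:5]))
|         os._exit(0)
|
|     fd = os.memfd_create("shared-buf")       # anonymous in-memory file
|     os.ftruncate(fd, 4096)
|     buf = mmap.mmap(fd, 4096)
|     buf[:5] = b"hello"                       # write through our own mapping
|     socket.send_fds(parent, [b"buf"], [fd])  # send the fd, not the bytes
|     os.wait()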
| HumblyTossed wrote:
| > Using a full-featured RPC framework for IPC seems like overkill
| when the processes run on the same machine. However, if your
| project anyway exposes RPCs for public APIs or would benefit from
| a schema-based serialisation layer it makes sense to use only one
| tool that combines these--also for IPC.
|
| It _might_ make sense. _Usually_, if you're using IPC, you need
| it to be as fast as possible and there are several solutions that
| are much faster.
| dmoy wrote:
| I tend to agree. Usually you want as fast as possible.
| Sometimes you don't though.
|
| E.g. Kythe (kythe.io) was designed so that its individual
| language indexers run with a main driver binary written in Go,
| and then a subprocess binary written in.... whatever. There's a
| requirement to talk between the two, but it's not really a lot
| of traffic (relative to e.g. the CPU cost of the subprocess
| doing compilation).
|
| So what happens in practice is that we used Stubby (like gRPC,
| except not public), because it was low overhead* to write the
| handler code for it on both ends, and got us some free other
| bits as well.
|
| * Except when it wasn't lol. It worked great for the first N
| languages written in langs with good stubby support. But then
| weird shit (for Google) crawled out of the weeds that didn't
| have stubby support, so there's some handwaving going on for
| the long tail.
| monocasa wrote:
| I'm not even sure that I'd say usually. Most of the time you're
| just saying "hey daemon, do this thing that you're already
| preconfigured for".
| ks2048 wrote:
| What are the other solutions that are much faster? (besides
| rolling your own mini format).
| pengaru wrote:
| There's a mountain of grpc-centric python code at $dayjob and
| it's been miserable to live with. Maybe it's less awful in c/c++,
| or at least confers some decent performance there. In python it's
| hot garbage.
| andy_ppp wrote:
| Strongly agree, it has loads of problems, my least favourite
| being that the schema is not checked in the way you might
| think: there's not even a checksum to say this message and
| this version of the schema match. So when there are old
| services/clients around and people haven't versioned their
| schemas safely (there was no mechanism for this apart from
| manually checking in PRs) you can get gibberish back for
| fields that should contain data. It's basically just a binary
| blob with whatever schema the client has overlaid, so
| debugging is an absolute pain. Unless you are Google scale,
| use a text based format like JSON and save yourself a lot of
| hassle.
| discreteevent wrote:
| - JSON doesn't have any schema checking either.
|
| - You can encode the protocol buffers as JSON if you want a
| text based format.
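|
| As a minimal sketch with the Python protobuf runtime (the User
| message and its fields here are made up):
|
|     from google.protobuf import json_format
|     import user_pb2                           # hypothetical generated module
|
|     msg = user_pb2.User(name="ada", age=36)
|
|     wire = msg.SerializeToString()            # compact binary encoding
|     text = json_format.MessageToJson(msg)     # '{"name": "ada", "age": 36}'
|
|     back = json_format.Parse(text, user_pb2.User())
|     assert back == msg                        # same message, text on the wire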
| jayd16 wrote:
| You can trivially make breaking changes in a JSON blob too.
| GRPC has well documented ways to make non-breaking changes.
| If you're working somewhere where breaking schema changes go
| in with little fanfare and much debugging then I'm not sure
| JSON will save you.
|
| The only way to know is to dig through CLs? Write a test.
|
| There's also automated tooling to compare protobuf schemas
| for breaking changes.
| andy_ppp wrote:
| JSON contains a description of the structure of the data
| that is readable by both machines and humans. JSON can
| certainly go wrong but it's much simpler to see when it has
| because of this. GRPC is usually a binary black box that
| adds loads of developer time to upskill, debug, and figure
| out error cases, and introduces whole new classes of
| potential bugs.
|
| If you are building something that needs the binary performance
| that GRPC provides, go for it, but pretending there is no
| extra cost over doing the obvious thing is not true.
| aseipp wrote:
| > JSON contains a description of the structure of the
| data that is readable by both machines and humans.
|
| No, it by definition does not, because JSON has no
| schema. Only your _application_ contains and knows the
| (expected) structure of the data, but you literally
| cannot know what structure any random blob of JSON
| objects will have without a separate schema. When you
| read a random /docs page telling you "the structure of
| the resulting JSON object from this request is ...",
| that's just a schema but written in English instead of
| code. This has big downstream ramifications.
|
| For example, many APIs make the mistake of parsing JSON
| and only returning some opaque "Object" type, which you
| then have to map onto your own domain objects, meaning
| you actually parse every JSON object _twice_: once into
| the opaque structure, and once into your actual
| application type. This has major efficiency ramifications
| when you are actually dealing with a lot of JSON. The
| only way to do better than this is to have a schema in
| some form -- any form at all, even English prose -- so
| you can go from the JSON text representation directly
| into your domain type at parse-time. This is part of the
| reason why so many JSON libraries in every language tend
| to have some high level way of declaring a JSON object in
| the host language, typically as some kind of 'struct' or
| enum, so that they can automatically derive an actually
| efficient parsing step and skip intermediate objects.
| There's just no way around it. JSON doesn't have any
| schema, and that's part of its appeal, but in practice
| one always exists somewhere.
|
| You can use protobuf in text-based form too, but from
| what you said, you're probably screwed anyway if your
| coworkers are just churning stuff and changing the values
| of fields and stuff randomly. They're going to change the
| meaning of JSON fields willy nilly too and there will be
| nothing to stop you from landing back in step 1.
|
| I will say that the quality of gRPC integrations tends to
| vary wildly based on language though, which adds debt,
| you're definitely right about that.
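|
| A minimal Python sketch of that "parse twice" point (field names
| invented for illustration):
|
|     import json
|     from dataclasses import dataclass
|
|     raw = '{"name": "ada", "age": 36}'
|
|     # 1) opaque object first, then map it onto the domain by hand;
|     #    the schema only exists implicitly, in this mapping code
|     obj = json.loads(raw)
|     name, age = obj["name"], int(obj["age"])
|
|     # 2) schema written down once, so parsing can target it directly
|     @dataclass
|     class User:
|         name: str
|         age: int
|
|     user = User(**json.loads(raw))   # plain json still builds the dict first;
|                                      # validating parsers (e.g. msgspec,
|                                      # pydantic) can go straight to the type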
| andy_ppp wrote:
| If I gave you a JSON object with name, age, position,
| gender etc. etc. would you not say it has structure? If I
| give you a GRPC binary you need the separate schema and
| tools to be able to comprehend it. All I'm saying is that
| separating the schema from even some minimal structure
| makes the debugging of services more difficult.
| I would also add the GRPC implementation I used in
| Javascript (long ago) was not actually checking the types
| of the field in a lot of cases so rather than being a
| schema that rejects if some field is not a text field it
| would just return binary junk. JSON Schema or almost
| anything else will give you a parsing error instead.
|
| Maybe the tools are fantastic now, but I still think being
| able to debug messages without them is an advantage in
| almost all systems; you probably don't need the level of
| performance GRPC provides.
|
| If you're using JSON Protobufs why would you add this
| extra complexity - it will mean messaging is just as slow
| as using JSON. What are the core advantages of GRPC under
| these conditions?
| aseipp wrote:
| > If I gave you a JSON object with name, age, position,
| gender etc. etc. would you not say it has structure?
|
| That's too easy. What if I give you a 200KiB JSON object
| with 40+ nested fields that's whitespace stripped and has
| base64 encoded values? Its "structure" is a red herring.
| It is not a matter of text or binary. The net result is I
| still have to use a tool to inspect it, even if that's
| only something like gron/jq in order to make it actually
| human readable. But at the end of the day the structure
| is a concern of _the application_, I have to evaluate
| its structure in the context of that application. I don't
| just look at JSON objects for fun. I do it mostly to
| debug stuff. I still need the schematic structure of the
| object to even know what I need to write.
|
| FWIW, I normally use something like grpcurl in order to
| do curl-like requests/responses to a gRPC endpoint and
| you can even have it give you the schema for a given
| service. This has worked quite well IME for almost all my
| needs, but I accept with this stuff you often have lots
| of "one-off" cases that you have to cobble stuff together
| or just get dirty with printf'ing somewhere inside your
| middleware, etc.
|
| > I would also add the GRPC implementation I used in
| Javascript (long ago) was not actually checking the types
| of the field in a lot of cases so rather than being a
| schema that rejects if some field is not a text field it
| would just return binary junk. JSON Schema or almost
| anything else will give you a parsing error instead.
|
| Yes, I totally am with you on this. Many of the
| implementations just totally suck and JSON is common
| enough nowadays that you kind of _have_ to at least have
| something that doesn't completely fall over, if you want
| to be taken remotely seriously. It's hard to write a
| _good_ JSON library, but it's definitely harder to write
| a good full gRPC stack. I 100% have your back on this. I
| would probably dislike gRPC even more but I'm lucky
| enough to use it with a "good" toolkit (Rust/Prost.)
|
| > If you're using JSON Protobufs why would you add this
| extra complexity - it will mean messaging is just as slow
| as using JSON. What are the core advantages of GRPC under
| these conditions?
|
| I mean, if your entire complaint is about text vs binary,
| not efficiency or correctness, JSON Protobuf seems like
| it fits your needs. You still get the other benefits of
| gRPC you'd have anywhere (an honest-to-god schema, better
| transport efficiency over mandated HTTP/2, some amount of
| schema-generic middleware, first-class streaming, etc
| etc.)
|
| FWIW, I don't particularly love gRPC. And while I admit I
| loathe JSON, I'm mainly pushing back on the notion that
| JSON has some "schema" or structure. No, it doesn't! Your
| _application_ has and knows structure. A JSON object is
| just a big bag of stuff. For all its failings, gRPC
| having a schema is a matter of it actually putting the
| correct foot first and admitting that your schema is
| real, it exists, and most importantly _can be written
| down precisely and checked by tools_!
| jeffbee wrote:
| There is an art to having forwards and backwards compatible
| RPC schemas. It is easy, but it is surprisingly difficult to
| get people to follow easy rules. The rules are as follows:
|     1) Never change the type of a field
|     2) Never change the semantic meaning of a field
|     3) If you need a different type or semantics, add a new field
|
| Pretty simple if you ask me.
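|
| A small Python sketch of why rule 3 works (the v1/v2 modules and
| the User message are hypothetical, generated from two versions of
| the same .proto where v2 only adds a field):
|
|     import v1_pb2, v2_pb2   # v1: string name = 1;  v2 adds: int32 age = 2;
|
|     old = v1_pb2.User(name="ada").SerializeToString()
|     new_reader = v2_pb2.User.FromString(old)      # old data, new schema
|     assert new_reader.name == "ada" and new_reader.age == 0  # default fills in
|
|     new = v2_pb2.User(name="ada", age=36).SerializeToString()
|     old_reader = v1_pb2.User.FromString(new)      # new data, old schema
|     assert old_reader.name == "ada"   # unknown field 2 is simply carried along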
| andy_ppp wrote:
| If I got to choose my colleagues this would be fine;
| unfortunately I had people who couldn't understand eventual
| consistency. One of the guys writing Go admitted he didn't
| understand what a pointer was etc. etc.
| lrem wrote:
| How does JSON protect you from that?
| andy_ppp wrote:
| Most people understand JSON, as they can see what is
| happening in a browser or any other system - what is the
| equivalent for GRPC if I want to do console.log(json)?
|
| GRPC for most people is a completely black box, with
| error conditions that are unclear, to me at least. For
| example, if I have an old schema and I'm not seeing a
| field, there are loads of things that can be wrong - old
| services, old client, even messages not being routed
| correctly due to networking settings in docker or k8s.
|
| Are you denying there is absolutely tons to learn here
| and that it is trickier to debug and maintain?
| ryanisnan wrote:
| Can you say more about what the pain points are?
| cherryteastain wrote:
| C++ generated code from protobuf/grpc is pretty awful in my
| experience.
| bluGill wrote:
| Do you need to look at that generated code though? I haven't
| used gRPC yet (some poor historical decisions mean I can't
| use it in my production code so I'm not in a hurry -
| architecture is rethinking those decisions in hopes that we
| can start using it so ask me in 5 years what I think). My
| experience with other generated code is that it is not
| readable, but you never read it so who cares - instead you
| just trust the interface, which is easy enough (or is
| terrible and not fixable).
| cherryteastain wrote:
| I meant the interfaces are horrible. As you said, as long
| as it has a good interface and good performance, I wouldn't
| mind.
|
| For example, here's the official tutorial for using the
| async callback interfaces in gRPC:
| https://grpc.io/docs/languages/cpp/callback/
|
| It encourages you to write code with practices that are
| quite universally considered bad in modern C++ due to a
| very high chance of introducing memory bugs, such as
| allocating objects with new and expecting them to clean
| themselves up via delete this;. Idiomatic modern C++ would
| use smart pointers, or go a completely different route with
| coroutines and no heap-allocated objects.
| bluGill wrote:
| Ouch. I'm tempted to look up the protocol and build my own
| grpc implementation from scratch. Generated code isn't
| that hard to create, but it is something I'd expect to be
| able to just use someone else's version instead of
| writing (and supporting) myself.
| seanw444 wrote:
| I'm using it for a small-to-medium sized project, and the
| generated files aren't too bad to work with at that scale. The
| actual generation of the files is very awful for Python
| specifically, though, and I've had to write a script to bandaid
| fix them after they're generated. An issue has been open for
| this for years on the protobuf compiler repo, and it's
| basically a "wontfix" as Google doesn't need it fixed for their
| internal use. Which is... fine I guess.
|
| The Go part I'm building has been much more solid in contrast.
| lima wrote:
| I guess you're talking about the relative vs. absolute import
| paths?
|
| This solves it: https://github.com/cpcloud/protoletariat
| seanw444 wrote:
| Yeah, my script works perfectly fine though, without
| pulling in another dependency. The point is that this
| shouldn't be necessary. It feels wrong.
| eqvinox wrote:
| It's equally painful in C, you have to wrap the C++ library :(
| DashAnimal wrote:
| What I loved about Fuchsia was its IPC interface, using FIDL
| which is like a more optimized version of protobufs.
|
| https://fuchsia.dev/fuchsia-src/get-started/learn/fidl/fidl
| DabbyDabberson wrote:
| loved - in the past tense?
| solarpunk wrote:
| fuchsia was pretty deeply impacted by google layoffs iirc
| pjmlp wrote:
| > Using a full-featured RPC framework for IPC seems like overkill
| when the processes run on the same machine.
|
| That is exactly what COM/WinRT, XPC, Android Binder, D-BUS are.
|
| Naturally they have several optimisations for local execution.
| jeffbee wrote:
| Binder is seriously underappreciated, IMHO. But I think it makes
| sense to use gRPC or something like it if there is any
| possibility that in the future an "IPC" will become an "RPC" to
| a foreign host. You don't want to be stuck trying to change an
| IPC into an RPC if it was foreseeable that it would eventually
| become remote due to scale.
| pjmlp wrote:
| Kind of, as anyone with CORBA, DCOM, RMI, .NET Remoting
| experience has plenty of war stories regarding distributed
| computing with the expectations of local calls.
| dietr1ch wrote:
| In my mind the abstraction should allow for RPCs, and being
| on the same machine should allow optimising things a bit;
| this way you simply build for the general case and lose
| little to no performance.
|
| Think of loopback: my programs don't know (or at least
| shouldn't know) that IPs like 127.0.0.5 are special, but the
| kernel knows that messages there are not going to go on
| any wire and handles them differently.
| bunderbunder wrote:
| This could possibly even be dead simple to accomplish if
| application-level semantics aren't being communicated by co-
| opting parts of the communication channel's spec.
|
| I think that this factor might be the ultimate source of my
| discomfort with standards like REST. Things like using HTTP
| verbs and status codes, and encoding parameters into the
| request's URL, mean that there's almost not even an option to
| choose a communication channel that's lighter-weight than
| HTTP.
| loeg wrote:
| Binder is totally unavailable outside of Android, right? IIRC
| it's pretty closely coupled to Android's security model and
| isn't a great fit outside of that ecosystem.
| p_l wrote:
| It's usable outside Android, it's more that there's "some
| assembly required" involved - you essentially need a
| nameserver process which also handles permissions, and the
| main one is pretty much the one in Android. Also, Binder
| itself doesn't really specify what data it exchanges
| around; the userland on Android uses a format based on BeOS
| (which is where Binder comes from) but also has at least
| one other variant used for drivers.
| mgsouth wrote:
| One does not simply walk into RPC country. Communication
| modes are architectural decisions, and they flavor everything.
| There's as much difference between IPC and RPC as there is
| between popping open a chat window to ask a question, and
| writing a letter on paper and mailing it. In both cases you
| can pretend they're equivalent, and it will work after a
| fashion, but your local communication will be vastly more
| inefficient and bogged down in minutia, and your remote comms
| will be plagued with odd and hard-to-diagnose bottlenecks and
| failures.
|
| Some _generalities_:
|
| Function call: The developer just calls it. Blocks until
| completion, errors are due to bad parameters or a resource
| availability problem. They are handled with exceptions or
| return-code checks. Tests are also simple function calls.
| Operationally everything is, to borrow a phrase from aviation
| regarding non-retractable landing gear, "down and welded".
|
| IPC: Architecturally, and as a developer, you start worrying
| about your function as a resource. Is the IPC recipient
| running? It's possible it's not; that's probably treated as
| fatal and your code just returns an error to its caller.
| You're more likely to have a m:n pairing between caller and
| callee instances, so requests will go into a queue. Your code
| may still block, but with a timeout, which will be a fatal
| error. Or you might treat it as a co-routine, with the extra
| headaches of deferred errors. You probably won't do retries.
| Testing has some more headaches, with IPC resource
| initialization and tear-down. You'll have to test queue
| failures. Operations is also a bit more involved, with an
| additional resource that needs to be baby-sat, and co-
| ordinated with multiple consumers.
|
| RPC: IPC headaches, but now you need to worry about lost
| messages, and messages processed but the acknowledgements
| were lost. Temporary failures need to be faced and re-tried.
| You will need to think in terms of "best effort", and
| continually make decisions about how that is managed. You'll
| be dealing with issues such as at-least-once delivery vs. at-
| most-once. Consistency issues will need to be addressed much
| more than with IPC, and they will be thornier problems.
| Resource availability awareness will seep into everything;
| application-level back-pressure measures _should_ be built-
| in. Treating RPC as simple blocking calls will be a continual
| temptation; if you or less-enlightened team members succumb
| then you'll have all kinds of flaky issues. Emergent, system-
| wide behavior will rear its ugly head, and it will involve
| counter-intuitive interactions (such as bigger buffers
| reducing throughput). Testing now involves three non-trivial
| parts--your code, the called code, and the communications
| mechanisms. Operations gets to play with all kinds of fun
| toys to deploy, monitor, and balance usage.
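|
| A minimal Python sketch of that last jump (the transport function
| is a stand-in; the limits and backoff numbers are invented):
|
|     import random, time
|
|     def compute(x):              # local call: no failure modes to speak of
|         return x * 2
|
|     def send_request(method, arg, timeout):   # stand-in for a real transport
|         if random.random() < 0.3:             # simulate lost message or ack
|             raise TimeoutError(method)
|         return arg * 2
|
|     def compute_remote(x, attempts=3, timeout_s=1.0):
|         # RPC flavour: timeouts, retries, jittered backoff, duplicates
|         for attempt in range(attempts):
|             try:
|                 return send_request("compute", x, timeout=timeout_s)
|             except TimeoutError:
|                 # was the request lost, or only the ack? the server may have
|                 # already done the work, so handlers must tolerate duplicates
|                 time.sleep((2 ** attempt) * 0.1 * random.random())
|         raise RuntimeError("compute: gave up after %d attempts" % attempts)
|
|     print(compute(21), compute_remote(21))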
| merb wrote:
| Btw, modern Windows also supports Unix domain sockets, so if
| you have an app with another service that will run on the
| same machine or on a different one, it is not so bad to use
| gRPC over UDS.
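|
| A minimal Python sketch of that (the Echo service and its
| generated echo_pb2 / echo_pb2_grpc modules are hypothetical; the
| socket path is made up):
|
|     import grpc
|     from concurrent import futures
|     import echo_pb2, echo_pb2_grpc
|
|     ADDR = "unix:///tmp/echo.sock"    # unix: target instead of host:port
|
|     class Echo(echo_pb2_grpc.EchoServicer):
|         def Say(self, request, context):
|             return echo_pb2.Reply(text=request.text)
|
|     server = grpc.server(futures.ThreadPoolExecutor(max_workers=4))
|     echo_pb2_grpc.add_EchoServicer_to_server(Echo(), server)
|     server.add_insecure_port(ADDR)
|     server.start()
|
|     with grpc.insecure_channel(ADDR) as channel:
|         stub = echo_pb2_grpc.EchoStub(channel)
|         print(stub.Say(echo_pb2.Request(text="hi")).text)
|
|     server.stop(None)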
| pjmlp wrote:
| Nice idea, although it is still slower than COM.
|
| COM can run over the network (DCOM), inside the same computer
| on its own process (out-proc), inside the client (in-proc),
| designed for in-proc but running as out-proc (COM host).
|
| So for max performance, with the caveat of possibly damaging
| the host, in-proc will do it, and be faster than any kind of
| sockets.
| tgma wrote:
| > COM can run over the network (DCOM)
|
| Ah the good ol' Blaster worm...
|
| https://en.wikipedia.org/wiki/Blaster_(computer_worm)
| jauntywundrkind wrote:
| I _super_ dug the talk _Building SpiceDB: A gRPC-First Database -
| Jimmy Zelinskie, authzed_ which is about a high-performance auth
| system, which talks to this. https://youtu.be/1PiknT36218
|
| It's a 4-tier architecture (clients - front end service - query
| service - database) auth system, and all communication is over
| grpc (except to the database). Jimmy talks about the advantages
| of having a very clear contract between systems.
|
| There's a ton of really great nitty gritty detail about being
| super fast with gRPC. https://github.com/planetscale/vtprotobuf
| for statically sized protobuf allocation rather than slow
| reflection-based dynamic sizing. Upcoming memory pooling work to
| avoid allocations at all. Tons of advantages for observability
| right out of the box. It's subtle but I also get the impression
| most gRPC stubs are miserably bad, that Authzed had to go long
| and far to get away from a lot of gRPC tarpits.
|
| This is one of my favorite talks from 2024, and it strongly
| sold me on how viable gRPC is for internal services. Even if I were
| doing local multi-process stuff, I would definitely consider gRPC
| after this talk. The structure & clarity & observability are huge
| wins, and the performance can be really good if you need it.
|
| https://youtu.be/1PiknT36218#t=12m - the internal cluster
| details start at 12min.
| jzelinskie wrote:
| Thank you! That really means a lot!
|
| >It's subtle but I also get the impression most gRPC stubs are
| miserably bad, that Authzed had to go long and far to get away
| from a lot of gRPC tarpits.
|
| They aren't terrible, but they also aren't a user experience
| you want to deliver directly to your customers.
| zackangelo wrote:
| Had to reach for a new IPC mechanism recently to implement a
| multi-GPU LLM inference server.
|
| My original implementation just pinned one GPU to its own thread
| then used message passing between them in the same process but
| Nvidia's NCCL library hates this for reasons I haven't fully
| figured out yet.
|
| I considered gRPC for IPC since I was already using it for the
| server's API but dismissed it because it was an order of
| magnitude slower and I didn't want to drag async into the child
| PIDs.
|
| Serializing the tensors between processes and using the Servo
| team's ipc-channel crate[0] has worked surprisingly well. If
| you're using Rust and need a drop-in (ish) replacement for the
| standard library's channels, give it a shot.
|
| [0] https://github.com/servo/ipc-channel
| anilakar wrote:
| In addition to cloud connectivity, we've been using MQTT for IPC
| in our Linux IIoT gateways and touchscreen terminals and honestly
| it's been one of the better architectural decisions we've made.
| Implementing new components for specific customer use cases could
| not be easier and the component can be easily placed on the
| hardware or on cloud servers wherever it fits best.
|
| I don't see how gRPC could be any worse than that.
|
| (The previous iteration before MQTT used HTTP polling and
| callbacks working on top of an SSH reverse tunnel abomination.
| Using MQTT for IPC was kind of an afterthought. The SSH Cthulhu
| is still in use for everyday remote management because you cannot
| do Ansible over MQTT, but we're slowly replacing it with
| Wireguard. I gotta admit that out of all VPN technologies we've
| experimented with, SSH transport has been the most reliable one
| in various hostile firewalled environments.)
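|
| For flavour, a minimal sketch of local IPC over MQTT with the
| classic paho-mqtt v1 client API (topic names and the localhost
| broker are just examples):
|
|     import time
|     import paho.mqtt.client as mqtt
|
|     def on_message(client, userdata, msg):
|         print("got", msg.topic, msg.payload.decode())
|
|     sub = mqtt.Client()
|     sub.on_message = on_message
|     sub.connect("localhost", 1883)        # local broker, e.g. mosquitto
|     sub.subscribe("gateway/sensor/#")
|     sub.loop_start()                      # network loop on a background thread
|
|     pub = mqtt.Client()
|     pub.connect("localhost", 1883)
|     pub.loop_start()
|     time.sleep(0.5)                       # crude: let both clients settle
|     pub.publish("gateway/sensor/temp", "21.5", qos=1)
|     time.sleep(0.5)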
| goalieca wrote:
| MQTT also lends itself to async very well. The event based
| approach is a real winner.
| justinsaccount wrote:
| > In our scenario of local IPC, some obvious tuning options
| exist: data is exchanged via a Unix domain socket (unix://
| address) instead of a TCP socket
|
| AFAIK at least on linux there is no difference between using a
| UDS and a tcp socket connected to localhost.
| sgtnoodle wrote:
| There's definitely differences, whether or not it matters for
| most usages. I've worked on several IPC mechanisms that
| specifically benefited from one vs. the other.
| palata wrote:
| I have been in a similar situation, and gRPC feels heavy. It
| comes with quite a few dependencies (nothing compared to npm or
| cargo systems routinely bringing hundreds of course, but enough
| to be annoying when you have to cross-compile them). Also at
| first it sounds like you will benefit from all the languages that
| protobuf supports, but in practice it's not that perfect: some
| python package may rely on the C++ implementation, and therefore
| you need to compile it for your specific platform. Some language
| implementations are just maintained by one person in their free
| time (a great person, but still), etc.
|
| On the other hand, I really like the design of Cap'n Proto, and
| the library is more lightweight (and hence easier) to compile.
| But there, it is not clear which language implementations you
| can rely on other than C++. Also it feels like there are maintainers
| paid by Google for gRPC, and for Cap'n Proto it's not so clear:
| it feels like it's essentially Cloudflare employees improving
| Cap'n Proto for Cloudflare. So if it works perfectly for your
| use-case, that's great, but I wouldn't expect much support.
|
| All that to say: my preferred choice for that would technically
| be Cap'n Proto, but I wouldn't dare making my company depend on
| it. Whereas nobody can fire me for depending on Google, I
| suppose.
| kentonv wrote:
| > it feels like it's essentially Cloudflare employees improving
| Cap'n Proto for Cloudflare.
|
| That's correct. At present, it is not anyone's objective to
| make Cap'n Proto appeal to a mass market. Instead, we maintain
| it for our specific use cases in Cloudflare. Hopefully it's
| useful to others too, but if you choose to use it, you should
| expect that if any changes are needed for your use case, you
| will have to make those changes yourself. I certainly
| understand why most people would shy away from that.
|
| With that said, gRPC is arguably weird in its own way. I think
| most people assume that gRPC is what Google is built on,
| therefore it must be good. But it actually isn't -- internally,
| Google uses Stubby. gRPC is inspired by Stubby, but very
| different in implementation. So, who exactly is gRPC's target
| audience? What makes Google feel it's worthwhile to have
| 40ish(?) people working on an open source project that they
| don't actually use much themselves? Honest questions -- I don't
| know the answer, but I'd like to.
|
| (FWIW, the story is a bit different with Protobuf. The Protobuf
| code _is_ the same code Google uses internally.)
|
| (I am the author of Cap'n Proto and also was the one who open
| sourced Protobuf originally at Google.)
| mgsouth wrote:
| My most vivid gRPC experience is from 10 years or so ago, so
| things have probably changed. We were heavily Go and micro-
| services. Switched from, IIRC, protobuf over HTTP, to gRPC
| "as it was meant to be used." Ran into a weird, flaky bug--
| after a while we'd start getting transaction timeouts. Most
| stuff would get through, but errors would build and
| eventually choke everything.
|
| I finally figured out it was a problem with specific pairs of
| servers. Server A could talk to C, and D, but would timeout
| talking to B. The gRPC call just... wouldn't.
|
| One good thing is you _do_ have the source to everything.
| After much digging through amazingly opaque code, it became
| clear there was a problem with a feature we didn't even
| need. If there are multiple sub-channels between servers A
| and B, gRPC will bundle them into one connection. It also
| provides protocol-level in-flight flow limits, both for
| individual sub-channels and the combined A-B bundle. It does
| it by using "credits". Every time a message is sent from A to
| B it decrements the available credit limit for the sub-
| channel, and decrements another limit for the bundle as a
| whole. When the message is _processed by the recipient
| process_, the credit is added back to the sub-channel and
| bundle limits. Out of credits? Then you'll have to wait.
|
| The problem was that failed transactions were not credited
| back. Failures included processing time-outs. With time-outs
| the sub-channel would be terminated, so that wasn't a
| problem. The issue was with the bundle. The protocol spec was
| (is?) silent as to who owned the credits for the bundle, and
| who was responsible for crediting them back in failure cases.
| The gRPC code for Go, at the time, didn't seem to have been
| written or maintained by Google's most-experienced team (an
| intern, maybe?), and this was simply dropped. The result was
| the bundle got clogged, and A and B couldn't talk. Comm-level
| backpressure wasn't doing us any good (we needed full app-
| level), so for several years we'd just patch new Go libraries
| and disable it.
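|
| A toy Python model of that failure mode (numbers invented, not
| the real HTTP/2 accounting), just to show how a missing refund on
| the shared bundle eventually wedges everything between A and B:
|
|     import random
|
|     BUNDLE_LIMIT = 100
|     credits = BUNDLE_LIMIT
|
|     def call(fail_rate=0.05):
|         global credits
|         if credits == 0:
|             return "blocked"        # A can no longer send anything to B
|         credits -= 1                # spend one bundle credit on the message
|         if random.random() < fail_rate:
|             return "failed"         # the bug: failures never refund the credit
|         credits += 1                # success path gives the credit back
|         return "ok"
|
|     for i in range(100_000):
|         if call() == "blocked":
|             print(f"bundle clogged after ~{i} calls")
|             break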
| lyu07282 wrote:
| Cool that they mention buf, it's such a massive improvement over
| Google's own half-abandoned, crappy protobuf implementation
|
| https://github.com/bufbuild/buf
___________________________________________________________________
(page generated 2024-11-20 23:01 UTC)