[HN Gopher] Cap'n Proto 1.0
___________________________________________________________________
Cap'n Proto 1.0
Author : kentonv
Score : 478 points
Date : 2023-07-28 15:28 UTC (7 hours ago)
(HTM) web link (capnproto.org)
(TXT) w3m dump (capnproto.org)
| binary132 wrote:
| I always liked the idea of capnp, but it bothers me that what is
| ultimately a message encoding protocol has an opinion on how I
| should architect my server.
|
| FWIW, gRPC certainly has this problem too, but it's very clearly
| distinct from protobuf, although pb has gRPC-related features.
|
| That entanglement makes me lean towards flatbuffers or even
| protobuf every time I weigh them against capnp, especially since
| it means that fb and pb have much simpler implementations, and I
| place great value on simplicity for both security and maintenance
| reasons.
|
| I think the lack of good third-party language implementations
| speaks directly to the reasonableness of that assessment. It
| also
| makes the bus factor and longevity story very poor. Simplicity
| rules.
| cmrdporcupine wrote:
| Part of the problem with cap'n'proto whenever I've approached
| it is that not only does it have an opinion on how to architect
| your server (fine, whatever) but in C++ it ends up shipping
| with its own very opinionated alternative to the STL ("KJ") and
| when I played with it some years ago it really ended up getting
| its fingers everywhere and was hard to work into an existing
| codebase.
|
| The Rust version also comes with its own normative lifestyle
| assumptions; many of which make sense in the context of its
| zero-copy world but still make a lot of things hard to express,
| and the documentation was hard to parse.
|
| I tend to reach for flatbuffers instead, for this reason alone.
|
| Still, I think someday I'll have a need and use for
| cap'n'proto; or at least finish one of the several hobby
| projects I've forked off over the years trying to use it.
| There's some high quality engineering there.
| kentonv wrote:
| Yes, it's true, the C++ implementation has become extremely
| opinionated.
|
| I didn't initially intend for KJ to become as all-
| encompassing as it has. I guess I kept running into things
| that didn't work well about the standard library, so I'd make
| an alternative that worked well, but then other parts of the
| standard library would not play nicely with my alternative,
| so it snowballed a bit.
|
| At the time the project started, C++11 -- which completely
| changed the language -- was brand new, and the standard
| library hadn't been updated to really work well with the new
| features.
|
| The KJ Promise library in particular, which made asynchronous
| programming much nicer using the newly-introduced lambdas,
| predated any equivalent landing in the standard library by
| quite a bit. This is probably the most opinionated part of
| KJ, hardest to integrate with other systems. (Though KJ's
| event loop does actually have the ability to sit on top of
| other event loops, with some effort.)
|
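| A minimal sketch of the style, using the public kj/async.h
| names (event-loop setup and error handling omitted):
|
|     #include <kj/async.h>
|     #include <kj/string.h>
|
|     kj::Promise<kj::String> fetchName();  // some async op, elsewhere
|
|     kj::Promise<size_t> nameLength() {
|       // .then() chains a lambda onto the promise, much like
|       // JS promises; the lambda's return feeds the next promise.
|       return fetchName().then([](kj::String name) {
|         return name.size();
|       });
|     }
|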
| And then I ended up with a complete ecosystem of libraries on
| top of Promises, like KJ HTTP.
|
| With the Workers Runtime being built entirely in that
| ecosystem, it ends up making sense for me to keep improving
| that ecosystem, rather than try to make things work better
| across ecosystems... so here we are.
| cmrdporcupine wrote:
| Oh I understand completely how that would happen. I believe
| the first time I played with your work was not long after
| the C++11 transition, and so I could see why it happened.
|
| This is why these days I just work in Rust :-) A less
| heterogeneous environment (so far).
| kentonv wrote:
| Yes, if I were starting from scratch today I'd use Rust.
| Unfortunately it was a little too early when work on
| Workers started.
| insanitybit wrote:
| How does the serialization layer impact your RPC choice?
| cmrdporcupine wrote:
| Cap'N'Proto comes with a (quite good) RPC facility. Based on
| asynchronous promises and grounded in capabilities.
|
| You don't _have_ to use it. You could use it just as a
| 'serialization' layer, but if you're writing services you
| could be missing half the advantage, really. And if you're
| writing in C++ you'll end up having to use their KJ library
| anyways.
|
| If you take the whole package, the zero copy, capability-
| security, and asynchrouny (a word I just coined!) all fit
| together nicely.
| ajkjk wrote:
| Asynchrony?
| insanitybit wrote:
| Yeah I'm aware of all of that. What I'm saying is that I
| don't see what about the Schema Definition Language pushes
| you towards the RPC other than that they obviously go well
| together, just like gRPC is almost always used with
| protobuf, or http with JSON.
|
| > but it bothers me that what is ultimately a message
| encoding protocol has an opinion on how I should architect
| my server.
|
| To me, this is like saying "Using JSON is unfortunate
| because it has an opinion that I should use HTTP" when I
| don't think anyone would argue that at all, and I don't see
| the argument for capnp much either.
| kentonv wrote:
| The main thing that Cap'n Proto RPC really requires about
| the serialization is that object references are a first-
| class type. That is, when you make an RPC, the parameters
| or results can contain references to new, remote RPC
| objects. Upon receiving such a reference, that object is
| now callable.
|
| Making this work nicely requires some integration between
| the serialization layer and the RPC layer, though it's
| certainly possible to imagine Protobuf being extended
| with some sort of hooks for this.
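|
| As a minimal sketch (hypothetical Auth interface whose
| login() returns a Session capability; generated C++ server
| code):
|
|     kj::Promise<void> login(LoginContext context) override {
|       // Hand back a brand-new capability. Once the client
|       // receives it, the client can call its methods directly.
|       context.getResults().setSession(kj::heap<SessionImpl>());
|       return kj::READY_NOW;
|     }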
| Timon3 wrote:
| Congrats on the release! It must be very exciting after 10 years
| :)
|
| If you don't mind the question: will there be more work on
| implementations for other languages in the future? I really like
| the idea of the format, but the main languages in our stack
| aren't supported in a way I'd use in a product.
| kentonv wrote:
| This is indeed the main weakness of Cap'n Proto. I only really
| maintain the C++ implementation. Other implementations come
| from various contributors which can lead to varying levels of
| completeness and quality.
|
| Unfortunately I can't really promise anything new here. My work
| on Cap'n Proto is driven by the needs of my main project, the
| Cloudflare Workers runtime, which is primarily C++. We do
| interact with Go and Rust services, and the respective
| implementations seem to get the job done there.
|
| Put another way, Cap'n Proto is an open source project, and I
| hope it is useful to people, but it is not a product I'm trying
| to sell, so I am not particularly focused on trying to get
| everyone to adopt it. As always, contributions are welcome.
|
| The one case where I might foresee a big change is if we
| (Cloudflare) decided to make Cap'n Proto be a public-facing
| feature of the Workers platform. Then we'd have a direct need
| to really polish it in many languages. That is certainly
| something we discuss from time to time but there are no plans
| at present.
| thegagne wrote:
| > if we (Cloudflare) decided to make Cap'n Proto be a public-
| facing feature of the Workers platform.
|
| How likely is this? What would be the benefits and use-cases
| of doing this? Would it be a standardized JS offering, or
| something specific to Workers that is deserialized before it
| hits the runtime?
| kentonv wrote:
| This really hasn't been fleshed out at all, it's more like:
| "Well, we're built on Cap'n Proto, it'd be really easy to
| expose it for applications to use. But is it useful?"
|
| Arguably Cap'n Proto RPC might be an interesting way for a
| Worker running on Cloudflare to talk to a back-end service,
| or to a service running in a container (if/when we support
| containers). Today you mostly have to use HTTP for this
| (which has its drawbacks) or raw TCP (which requires
| bringing your own protocol parser to run in "userspace").
|
| That said there's obviously a much stronger case to make
| for supporting gRPC or other protocols that are more widely
| used.
| Timon3 wrote:
| That's completely understandable, thank you for the answer!
| I'd love to try and help with at least one implementation for
| those languages, but there's a good chance that it would end
| up like the existing implementations due to lack of time.
|
| Anyway, thank you for making it open source and for working
| on it all this time!
| doctorpangloss wrote:
| Hmm, the main weakness of Cap'n Proto is that you have to
| already know so much stuff in order to understand why it
| makes all the great decisions it does. The weakness you're
| talking about matters to me, sure, I don't use Cap'n'Proto
| because it lacks the same tooling as gRPC, but it is better
| than gRPC from an ideas point of view.
|
| I am not going to write those language implementations, I
| have other stuff I need to do, and gRPC is good enough. But
| the people who _love_ writing language implementations _might
| not_ understand why Cap'n Proto is great, or at least not
| understand as well as they understand Golang and Rust, so
| they will rewrite X in Golang and Rust instead.
|
| Anyway, the great ideas haven't changed in whatever it is,
| almost 10-15 years you've been working on this, they've been
| right all along. So it is really about communication.
|
| A comment on HN that really stuck with me was like: "Man
| dude, this is great, but try to explain to my team that it's
| Not React. They won't care."
|
| I'm just a guy, I don't know how to distill how good Cap'n
| Proto is. But "The Unreasonable Effectiveness of Recurrent
| Neural Networks" is the prototype. What is the unreasonable
| effectiveness of Cap'n Proto? In games, which I'm familiar
| with, entity component systems, user generated content and
| their tooling have a lot in common with Cap'n Proto. "The
| Unreasonable Effectiveness of ECS" is deterministic
| multiplayer, but that is also really poorly communicated, and
| thus limits adoption. Maybe you are already facing the same
| obstacles with Cloudflare Workers. It's all very
| communications related and I hope you get more adoption.
| ocdtrekkie wrote:
| Yeah, this has been the struggle with Sandstorm and self-
| hosting too. Ten years on, I'm still confident it's the
| best way to self-host, but to convince someone of that I
| have to sit them down and figure out how to get them to
| understand capability-based security, and most people lose
| interest about... immediately. :P
|
| I suspect a lot of things will eventually look more like
| Cap'n Proto and Sandstorm, but it will take a lot of time
| for everyone else to get there.
| bsder wrote:
| There are people who have tried to write the RPC layer without
| it simply being a wrapper around the C++ implementation, but
| it's a _LOT_ of code to rewrite for not a lot of direct
| benefit.
|
| Feel free to take a crack at it. People would likely be rather
| cooperative about it. However, know that it's just simply a lot
| of work.
| dtech wrote:
| While I never used Cap'n Proto, I want to thank kentonv for the
| extremely informative FAQ answer [1] on why required fields are
| problematic in a protocol.
|
| I link it to people all the time, especially when they ask why
| protobuf 3 doesn't have required fields.
|
| [1] https://capnproto.org/faq.html#how-do-i-make-a-field-
| require...
| alphanullmeric wrote:
| Rustaceans in shambles
| nly wrote:
| Avro solves this problem completely, and more elegantly with
| its schema resolution mechanism. Exchanging schemas at the
| beginning of a connection handshake is hardly burdensome.
| throwboatyface wrote:
| Disagree. Avro makes messages slightly smaller by removing
| tags, but it makes individual messages completely
| incomprehensible without the writer schema. For serializing
| data on disk it's fine and a reasonable tradeoff to save
| space, but for communication on the wire tagged formats allow
| for more flexibility on the receiver end.
|
| The spec for evolving schemas is also full of ambiguity and
| relies on the canonical Java implementation. I've built an
| Avro decoder from scratch and some of the evolution behaviour
| is counter-intuitive.
| dtech wrote:
| If by "solving" you mean "refuse to do anything at all unless
| you have the exact schema version of the message you're
| trying to read" then yes. In a RPC context that might even be
| fine, but in a message queue...
|
| I will never use Avro again on a MQ. I also found the schema
| resolution mechanism anemic.
|
| Avro was (is?) popular on Kafka, but it is such a bad fit
| that Confluent created a whole additional piece of infra
| called Schema Registry [1] to make it work. For Protobuf and
| JSON schema, it's 90% useless and sometimes actively harmful.
|
| I think you can also embed the schema in an Avro message to
| solve this, but then you add a massive amount of overhead if
| you send individual messages.
|
| [1] https://docs.confluent.io/platform/current/schema-
| registry/i...
| insanitybit wrote:
| > but it is such a bad fit that Confluent created a whole
| additional piece of infra called Schema Registry [1] to
| make it work.
|
| That seems like a weird way to describe it. It is assumed
| that a schema registry would be present for something like
| Avro. It's just how it's designed - the _assumption_ with
| Avro is that you can share your schemas. If you can't
| abide by that, don't use it.
| dtech wrote:
| I do not think it's unfair at all. Schema registry needs
| to add a wrapper and UUID to an Avro payload for it to
| work, so at the very least Avro as-is is unsuitable for a
| MQ like Kafka since you cannot use it efficiently without
| some out-of-band communication channel.
| insanitybit wrote:
| Everyone knows you need an out of band channel for it, I
| don't know why you're putting this out there like it's a
| fault instead of how it's designed. Whether it's RPC
| where you can deploy your services or a schema registry,
| that is literally just how it works.
|
| Wrapping a message with its schema version so that you
| can look up that version is a really sensible way to go.
| A uuid is way more than what's needed since they could
| have just used a serial integer but whatever, that's on
| Kafka for building it that way, not Avro.
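|
| A minimal sketch of that envelope idea (hypothetical layout:
| a 4-byte big-endian schema id, then the payload bytes):
|
|     #include <cstdint>
|     #include <vector>
|
|     std::vector<uint8_t> wrap(uint32_t schemaId,
|                               const std::vector<uint8_t>& body) {
|       std::vector<uint8_t> out;
|       for (int shift = 24; shift >= 0; shift -= 8)
|         out.push_back(uint8_t(schemaId >> shift));  // id prefix
|       out.insert(out.end(), body.begin(), body.end());
|       return out;
|     }
|
| Consumers strip the prefix, look up the writer schema by id,
| and only then decode the payload.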
| morelisp wrote:
| > a serial integer
|
| And now you can't trivially port your data between
| environments.
| nly wrote:
| Having the schema for a data format I'm decoding has never
| been a problem in my line of work, and I've dealt with
| dozens of data formats. Evolution, versioning and
| deprecating fields on the other hand is always a pain in
| the butt.
| dtech wrote:
| If a version n+1 producer sends a message to the message
| queue with a new optional field, how do the n version
| consumers have the right schema without relying on some
| external store?
|
| In Protobuf or JSON this is not a problem at all, the new
| field is ignored. With Avro you cannot read the message.
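|
| A toy sketch of why tagged formats degrade gracefully
| (simplified wire format, not real protobuf: a varint field
| tag, then a varint value):
|
|     #include <cstdint>
|     #include <map>
|
|     uint64_t readVarint(const uint8_t*& p) {
|       uint64_t v = 0;
|       int shift = 0;
|       while (*p & 0x80) {
|         v |= uint64_t(*p++ & 0x7F) << shift;
|         shift += 7;
|       }
|       return v | (uint64_t(*p++) << shift);
|     }
|
|     std::map<uint64_t, uint64_t> decode(const uint8_t* p,
|                                         const uint8_t* end) {
|       std::map<uint64_t, uint64_t> fields;
|       while (p < end) {
|         uint64_t tag = readVarint(p);
|         uint64_t val = readVarint(p);
|         fields[tag] = val;  // a real decoder dispatches on known
|       }                     // tags; unknown ones are skippable,
|       return fields;        // never a decode error
|     }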
| nly wrote:
| I mean a schema registry solves this problem, and you
| just put the schema into the registry before the
| software is released.
|
| A simpler option is to just publish the schema into the
| queue periodically, say every 30 seconds, and then
| receivers can cache schemas for message types they are
| interested in.
| kentonv wrote:
| > Exchanging schemas at the beginning of a connection
| handshake is hardly burdensome.
|
| I dunno, that sounds extremely burdensome to me, especially
| if the actual payload is small.
|
| And how exactly does exchanging schemas solve the problem? If
| my version of the schema says this field is required but
| yours says it is optional, and so you don't send it, what am
| I supposed to do?
| dtech wrote:
| Avro makes that case slightly better because you can set a
| default value for a missing field in one of the two schemas
| and then it works.
|
| It's not worth the boatload of problems it brings in all
| other, normal use cases though. Having the default value
| in the app or specified by the protocol is good enough.
| oftenwrong wrote:
| Typical provides "asymmetric" fields to assist with evolution
| of types:
|
| https://github.com/stepchowfun/typical#asymmetric-fields-can...
|
| >To help you safely add and remove required fields, Typical
| offers an intermediate state between optional and required:
| asymmetric. An asymmetric field in a struct is considered
| required for the writer, but optional for the reader. Unlike
| optional fields, an asymmetric field can safely be promoted to
| required and vice versa.
| skybrian wrote:
| Yeah, it only works for migrations in fairly closed systems
| where you can upgrade or delete all the old data, though.
| kccqzy wrote:
| This is some very valuable perspective. Personally, I
| previously also struggled to understand why. For me, the thing
| that clicked was to understand protobuf and Cap'n proto as
| serialization formats that need to work across API boundaries
| and need to work with different versions of their schema in a
| backwards- and forwards-compatible way; do not treat them as
| in-memory data structures that represent the world from the
| perspective of a single process running a single version
| with no compatibility concerns. Thus, the widely repeated
| mantra of "making illegal states unrepresentable" does not
| apply.
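|
| A minimal sketch of that mindset (hypothetical Person schema
| where `email` was added in v2; Cap'n Proto-style generated
| C++ reader, with sendMail standing in for the consumer):
|
|     void greet(Person::Reader person) {
|       if (person.hasEmail()) {        // v1 writers simply
|         sendMail(person.getEmail());  // never set this field
|       }
|       // absence is a normal, expected state, not an error
|     }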
| chubot wrote:
| Rich Hickey (creator of the Clojure language) has a good talk
| "Maybe Not" that touches on these issues, with a nice way of
| explaining it
|
| https://www.youtube.com/watch?v=YR5WdGrpoug
|
| The capnproto link explains it concretely in terms of a
| message bus example, which is useful.
|
| But more abstractly you can think of the shape of data (aka
| schema, names and types) and field presence
| (optional/required) as separate things
|
| https://lobste.rs/s/zdvg9y/maybe_not_rich_hickey
|
| First, being valid or invalid with respect to a static type
| system is a GLOBAL property of a program -- writing a type
| checker will convince you of that. And big distributed
| systems don't have such global properties:
| https://news.ycombinator.com/item?id=36590799
|
| If they did, they'd be small :) Namely you could just reboot
| the whole thing at once. You can't reboot say the entire
| Internet at once, and this also holds for smaller systems,
| like the ones at say Google (and I'm sure Cloudflare, etc.).
|
| So the idea is that the shape/schema is a GLOBAL property --
| you never want two messages called foo.RequestX or two fields
| called "num_bar" with different types -- ever, anywhere.
|
| But optional/required is LOCAL property. It depends on what
| version of a schema is deployed in a particular binary.
| Inherently, you need to be able to handle a mix of
| inconsistent versions running simultaneously.
|
| ---
|
| To be pedantic, I would say "making illegal states
| unrepresentable" DOES apply, but you can't do it in a STATIC
| type system. [1] Your Maybe<T> type is not useful for data
| that crosses process boundaries.
|
| A distributed system isn't a state machine.
|
| 1. Lamport showed us one important reason why: the relative
| order of messages means that there is no globally coherent
| state. You need something like Paxos to turn a distributed
| system back into a state machine (and this is very expensive
| in general)
|
| 2. The second reason is probably a consequence of the first.
| You can think of deploying a binary to a node as a message to
| that node. So you don't have a completely consistent state --
| you always have an in-between state, a mix of versions. And
| presumably you want your system to keep working during this
| time period :)
|
| And that coarse-grained problem (code versioning and
| deployment) implies the fine-grained problem (whether a
| specific message in a field is present). This is because
| protobufs generate parsers with validation for you -- or they
| used to!
|
| ---
|
| tl;dr Think of the shape of data (aka schema) and field
| presence (optional/required) as different dimensions of data
| modeling. Maybe<T> mixes those up, which is fine in a single
| process, but doesn't work across processes.
|
| ---
|
| [1] A very specific example of making illegal states
| unrepresentable without static types - my Oils project uses a
| DSL for algebraic data types, borrowed from CPython. The
| funny thing is that in CPython, it generates C code, which
| doesn't have any static notion of Maybe<T>. It has tagged
| unions.
|
| And in Oils we generated dynamically typed Python at
| first. Somewhat surprisingly, algebraic data types are STILL
| useful there.
|
| Now the generated code is statically typed with MyPy (and
| with C++), and we do pleasant type-driven refactorings. But
| algebraic data types were still extremely useful before
| static typing. They made illegal states unrepresentable --
| but you would get the error at runtime.
| skybrian wrote:
| I wonder about how to make this play nicely with systems
| that have different perspectives. Yes, a message bus is
| written to deal with any possible message and it can do
| that because it doesn't care what's in the message.
| Incomplete messages are useful to have, too.
|
| This is sort of like the difference between a text editor
| and a compiler. An editor has to deal with code that
| doesn't even parse, which is easiest if it just treats it as a
| plain text file, but then you're missing a lot of language-
| specific features that we take for granted these days.
| Meanwhile, a compiler can require all errors to be fixed
| before it emits a binary, but it has to be good at
| reporting what the errors are, because they will certainly
| happen.
|
| It's unclear to me how the type of the field can be a
| global property in a large system. From a text editor's
| point of view, you can just edit the type. How can anyone
| guarantee that a type is always the same?
|
| Also, SQL tables actually do make closed-world assumptions;
| every record meets the schema, or it can't be stored there.
| If you change the schema, there is a migration step where
| all the old rows in the production database get upgraded.
| This doesn't seem unrealistic?
|
| I guess it's unrealistic that you only have _one_
| production database, and not also a staging database, and
| every developer having their own database? And they will be
| on different versions. As soon as you have lots of
| databases, things get complicated.
| 3cats-in-a-coat wrote:
| Can't we extend this argument to eliminating basically all
| static typing? And frankly that'd not even be wrong, and is why
| Alan Kay defined OOP as one that's dynamically typed and late
| bound, and we went against it anyway to keep relearning the
| same lessons over and over.
| kentonv wrote:
| The argument is really more like: Always defer validation
| until the point where the data is actually consumed, because
| only the consumer actually knows what is valid.
|
| Which is definitely a counterpoint to the oft-stated argument
| that you should validate all data upfront.
|
| Either way though, you can still have types, the question is
| just when and where (in a distributed system, especially)
| they should be checked.
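|
| A minimal sketch of what "validate at the consumer" means
| (hypothetical handler; `hardware` stands in for whatever
| actually consumes the value):
|
|     #include <stdexcept>
|
|     void setVolume(int level) {
|       // The check lives next to the use site, because only
|       // this code knows what a valid volume is.
|       if (level < 0 || level > 100) {
|         throw std::invalid_argument("volume out of range");
|       }
|       hardware.apply(level);
|     }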
| mike_hearn wrote:
| The argument is actually more like: don't use badly written
| middleman software that tries to parse messages it doesn't
| need to parse.
|
| I was at Google when the "let's get rid of optional"
| crusade started. It didn't make sense to me then and over a
| decade later it still doesn't. If a program expects a field
| to be there then it has to be there, removing the protobuf
| level checking just meant that programs could now read
| garbage (some default value) instead of immediately
| crashing. But the whole reason we have types, assertions,
| bounds checking and so on is because, almost always, we'd
| like our software to NOT just blindly plough on into
| undefined territory when it doesn't understand something
| properly, so in reality it just means everyone ends up
| coding those very same required-ness assertions by hand.
|
| Now, Google's business model is remarkably robust to
| generating and processing corrupt data, so you can argue
| that in the _specific_ case of this _specific_ company, it
| is actually better to silently serve garbage than to crash.
| This argument was made explicitly in other forms, like when
| they deleted all the assertions from the HTTP load
| balancers. But in every case where I examined an anti-
| required argument carefully the actual problem would turn
| out to be elsewhere, and removing assertions was just
| covering things up. The fact that so much of Google's code
| is written in C++ that not only starts up slowly but also
| just immediately aborts the entire process when something
| goes wrong also contributes to the brittleness that
| encourages this kind of thing. If Google had been built on
| a language with usable exceptions right from the start it'd
| have been easier to limit the blast radius of data
| structure versioning errors to only the requests where that
| data structure turned up, instead of causing them to nuke
| the entire server (and then the RPC stack will helpfully
| retry because it doesn't know why the server died, promptly
| killing all of them).
|
| But this tolerance to undefined behavior is not true for
| almost any other business (except maybe video games?). In
| those businesses it's better to be stopped than wrong. If
| you don't then you can lose money, lose data, lose
| customers or in the worst cases even lose your life. I
| don't think people appreciate the extent to which the
| unique oddities of Google's business model and
| infrastructure choices have leaked out into the libraries
| their staffers/ex-staffers release.
| saghm wrote:
| > The argument is actually more like: don't use badly
| written middleman software that tries to parse messages
| it doesn't need to parse.
|
| > I was at Google when the "let's get rid of optional"
| crusade started. It didn't make sense to me then and over
| a decade later it still doesn't. If a program expects a
| field to be there then it has to be there, removing the
| protobuf level checking just meant that programs could
| now read garbage (some default value) instead of
| immediately crashing. But the whole reason we have types,
| assertions, bounds checking and so on is because, almost
| always, we'd like our software to NOT just blindly plough
| on into undefined territory when it doesn't understand
| something properly, so in reality it just means everyone
| ends up coding those very same required-ness assertions
| by hand.
|
| Yeah, that's what stuck out to me from the linked
| explanation as well; the issue wasn't that the field was
| required, it was that the message bus was not doing what
| was originally claimed. It sounds like either having the
| message bus _just_ process the header and not the entire
| message, or having the header carry a version number that
| indicated which fields are required (with version numbers
| newer than the latest the bus was aware of being treated
| as having no required fields), would have worked. I don't
| claim that it's never correct to design a protocol
| optimizing for robustness when consumed by poorly written
| clients, but I similarly struggle to see how making that
| the only possible way to implement a protocol is the only
| valid option. Maybe the goal of cap'n proto is to be
| prescriptive about this sort of thing, so it wouldn't be
| a good choice for uses where there's more rigor in the
| implementation of services using the protocol, but if it's
| intended for more general usage, I don't understand this
| design decision at all.
| 3cats-in-a-coat wrote:
| What you say is valuable, and it's kinda odd that some
| people here discard practical experience in favor of
| their subjective flavor of theoretical correctness.
| kentonv wrote:
| > The argument is actually more like: don't use badly
| written middleman software that tries to parse messages
| it doesn't need to parse.
|
| The middleman software in question often needed to
| process some part of the message but not others. It
| wasn't realistic to define a boundary between what each
| middleman might need and what they wouldn't need, and
| somehow push the "not needed" part into nested encoded
| blobs.
|
| I'm not sure the rest of your comment is really
| addressing the issue here. The argument doesn't have
| anything to do with proceeding forward in the face of
| corrupt data or undefined behavior. The argument is that
| validation needs to happen at the consumer. There should
| still be validation.
| naasking wrote:
| > It wasn't realistic to define a boundary between what
| each middleman might need and what they wouldn't need,
| and somehow push the "not needed" part into nested
| encoded blobs.
|
| This is an interesting argument that I would like to see
| more elaboration on, because that's the obvious solution.
| Effectively you're building a pipeline of data processors
| and each stage in the pipeline reads its own information
| and then passes along a payload with the rest of the
| information to the next stage. This would preserve full
| static typing with required fields, but I can see how it
| might inhibit some forms of dynamic instrumentation, eg.
| turning verbose logging on/off might dynamically
| reconfigure the pipeline, which would affect all upstream
| producers if they're wrapping messages for downstream
| consumers.
|
| If this were a programming language I would immediately
| think of row typing to specify the parts that each stage
| depends on while being agnostic about the rest of the
| content, but I'm not sure how that might work for a
| serialization format. Effectively, you're pulling out a
| typed "view" over the underlying data that contains
| offsets to the underlying fields (this is the dictionary-
| passing transform as found in Haskell).
| mike_hearn wrote:
| It's easier to understand in context - some services
| (iirc web search but it might have been ads or something
| else very core) had kept adding fields to some core
| protobufs for years and years. It made sense, was the
| path of least resistance etc, but inevitably some of
| those fields became obsolete and they wanted to remove
| them but found it was hard, because every program that
| did anything with web search was deserializing these
| structures.
|
| Truly generic middleware like RPC balancers did what you
| are saying, but there were also a lot of service specific
| "middlemen" which did need to look at parts of these
| mega-structures.
|
| Now due to how protobufs work, you _can_ do what you
| suggest and "cast" a byte stream to multiple different
| types, so they could have defined subsets of the overall
| structures and maybe they did, I don't remember, but the
| issue then is code duplication. You end up defining the
| same structures multiple times, just as subsets. With a
| more advanced type system you can eliminate the
| duplication, but there was a strong reluctance to add
| features to protobufs.
| kentonv wrote:
| The particular piece of infrastructure I worked on sat in
| the middle of the search pipeline, between the front-end
| that served HTML web pages, and the back-end index. This
| middle piece would request search results from the back-
| end, tweak them in a bunch of ways, and forward them on.
|
| These "tweaks" could be just about anything. Like: "You
| searched for Jaguar, but I don't know if you meant the
| cat or the car. The index decided that pages about the
| car rank higher so the first three pages of results are
| about the car. I'm going to pull some results about the
| cat from page 4 and put them near the top so just in case
| that's what you really wanted, you'll find it."
|
| Google Search, at least when I worked on it, was composed
| of a huge number of such tweaks. People were constantly
| proposing them, testing if they led to an improvement,
| and shipping them if they do. For a variety of reasons,
| our middleman server was a great place to implement
| certain kinds of tweaks.
|
| But what kinds of information are needed for these
| "tweaks"? Could be anything! It's a general-purpose
| platform. Search results were annotated with all kinds of
| crazy information, and any piece of information might be
| useful in implementing some sort of middleman tweak at
| some point.
|
| So you couldn't very well say upfront "OK, we're going to
| put all the info that is only for the frontend into the
| special 'frontend blob' that doesn't get parsed by the
| middleman", because you have no idea what fields are only
| needed by the frontend. In fact, that set would change
| over time.
|
| > If this were a programming language I would immediately
| think of row typing to specify the parts that each stage
| depends on while being agnostic about the rest of the
| content
|
| Indeed, perhaps one could develop an elaborate system
| where in the schemas, we could annotate certain fields as
| being relevant to certain servers. Anywhere else, those
| fields would be unavailable (but passed through without
| modification or validation). If you needed the fields in
| a new place, you change the annotations.
|
| But that sounds... complicated to design and cumbersome
| to maintain the annotations. Simply banning required
| fields solved the problem for us, and everything else
| just worked.
| mike_hearn wrote:
| I think you're defining consumer as the literal line of
| code where the field is read, whereas a more natural
| definition would be something like "the moment the data
| structure is deserialized". After all it's usually better
| to abort early than half way through an operation.
|
| It was quite realistic to improve protobufs to help dig
| web search out of their "everything+dog consumes an
| enormous monolithic datastructure" problem, assuming
| that's what you're thinking of (my memory of the details
| of this time is getting fuzzy).
|
| A simple brute-force fix for their situation would have
| been to make validation of required fields toggle-able on
| a per-parse level, so they could disable validation for
| their own parts of the stack without taking it away for
| everyone else (none of the projects I worked on had
| problems with required fields that I can recall).
|
| A better fix would have been for protobufs to support
| composition. They could then have started breaking down
| the mega-struct into overlapping protos, with the
| original being defined as a recursive merge of them.
| That'd have let them start narrowing down semantically
| meaningful views over what the programs really needed.
|
| The worst fix was to remove validation features from the
| language, thus forcing everyone to manually re-add them
| without the help of the compiler.
|
| Really, the protobuf type system was too simple for
| Google even in 2006. I recall during training wondering
| why it didn't have a URL type given that this was a web-
| centric company. Shortly after I discovered a very simple
| and obvious bug in web search in which some local
| business results were 404s even though the URL existed.
| It had been there for months, maybe years, and I found it
| by reading the user support forums (nobody else did this,
| my manager considered me way out of my lane for doing
| so). The bug was that nothing anywhere in the pipeline
| checked that the website address entered by the business
| owner started with https://, so when the result was
| stuffed into an <a> tag it turned into <a
| href="www.business.com"> and so the user ended up at
| https://www.google.com/www.business.com. Oops. These bad
| strings made it all the way from the business owner,
| through the LBC frontend, the data pipeline, the
| intermediate microservices and the web search frontends
| to the user's browser. The URL _did_ pass crawl
| validation because when loaded into a URL type, the
| missing protocol was being added. SREs were trained to do
| post-mortems, so after it got fixed and the database was
| patched up, I naively asked whether there was a
| systematic fix for this, like maybe adding a URL type to
| protobufs so data would be validated right at the start.
| The answer was "it sounds like you're asking how to not
| write bugs" and nothing was done, sigh. It's entirely
| possible that similar bugs reoccurred dozens of times
| without being detected.
|
| Those are just a couple of cases where the simplicity (or
| primitivity) of the protobuf type system led to avoidable
| problems. Sure, there are complexity limits too, but the
| actual languages Googlers were using all had more
| sophisticated type systems than protobuf and bugs at the
| edges weren't uncommon.
| skybrian wrote:
| I think this comes from everyone wanting to use the same
| schema and parser. For example, a text editor and a
| compiler have obvious differences in how to deal with
| invalid programs.
|
| Maybe there need to be levels of validation, like "it's a
| text file" versus "it parses" versus "it type checks."
| mike_hearn wrote:
| Sure, that would also have been a fine solution. There
| are lots of ways to tackle it really and some of it is
| just very subjective. There's a lot of similarities here
| between the NoSQL vs SQL debates. Do you want a
| schemaless collection of JSON documents or do you want
| enforced schemas, people can debate this stuff for a long
| time.
|
| You can also see it as a version control and awareness
| problem rather than a schema or serialization problem.
| The issues don't occur if you always have full awareness
| of what code is running and what's consuming what data,
| but that's hard especially when you take into account
| batch jobs.
| kentonv wrote:
| > I think you're defining consumer as the literal line of
| code where the field is read
|
| I am.
|
| > After all it's usually better to abort early than half
| way through an operation.
|
| I realize this goes against common wisdom, but I actually
| disagree.
|
| It's simply unrealistic to imagine that we can fully
| determine whether an operation will succeed by examining
| the inputs upfront. Even if the inputs are fully valid,
| all sorts of things can go wrong at runtime. Maybe a
| database connection is randomly dropped. Maybe you run
| out of memory. Maybe the power goes out.
|
| So we already have to design our code to be tolerant to
| random failures in the middle. This is why we try to
| group our state changes into a single transaction, or
| design things to be idempotent.
|
| Given we already have to do all that, I think trying to
| validate input upfront creates more trouble than it
| solves. When your validation code is far away from the
| code that actually processes the data, it is easier to
| miss things and harder to keep in sync.
|
| To be clear, though, this does not mean I like dynamic
| typing. Static types are great. But the reason I like
| them is more because they make programming easier,
| letting you understand the structure of the data you're
| dealing with, letting the IDE implement auto-complete,
| jump-to-definition, and error checking, etc.
|
| Consider TypeScript, which implements static typing on
| JavaScript, but explicitly does not perform any runtime
| checks whatsoever validating types. It's absolutely
| possible that a value at runtime does not match the type
| that TypeScript assigned to it. The result is a runtime
| exception when you try to access the value in a way that
| it doesn't support (even though its type says it should
| have). And yet, people love TypeScript, it clearly
| provides value despite this.
|
| This stuff makes a programming language theorist's head
| explode but in practice it works. Look, anything can be
| invalid in ways you never thought of, and no type system
| can fully defend you from that. You gotta get comfortable
| with the idea that exceptions might be thrown from
| anywhere, and design systems to accommodate failure.
| mike_hearn wrote:
| I agree with a lot of this, but:
|
| 1. The advantage of having it in the type system is the
| compiler can't forget.
|
| 2. It's quite hard to unwind operations in C++. I think
| delaying validation to the last moment is easier when you
| have robust exceptions. At the top level the frameworks
| can reject RPCs or return a 400 or whatever it is you
| want to do, if it's found out 20 frames deep into some
| massive chunk of code then you're very likely to lose
| useful context as the error gets unwound (and worse error
| messages).
|
| On forgetting, the risky situation is something like this:
|
|     message FooRequest {
|       required string query = 1;
|       repeated string options = 2;  // added later
|     }
|
| The intention is: in v1 of the message there's some
| default information returned, but in v2 the client is
| given more control including the ability to return less
| information as well as more. In proto2 you can query if
| options is set, and if not, select the right default
| value. In proto3 you can't tell the difference between an
| old client and a client that wants no extra information
| returned. That's a bug waiting to happen: the difference
| between "not set" and "default value" is important. Other
| variants are things like adding "int32 timeout" where it
| defaults to zero, or even just having a client that
| forgets to set a required field by mistake.
|
| TypeScript does indeed not do validation of type casts up
| front, but that's more because it's specifically designed
| to be compatible with JavaScript and the runtime doesn't
| do strong typing. People like it compared to raw JS.
| 3cats-in-a-coat wrote:
| Honestly I wonder what is the big win in terms of
| performance by using static types here, because this
| sounds so terribly well fit for dynamic types (of which
| optionality by default is in fact a limited example).
| Such an odd choice to calcify a spec in places where it
| changes all the time. "Static" optimizations should be
| local, not distributed.
| lanstin wrote:
| The distributed part shifts the problem from "find types
| that represent your solution" to "find a system of types
| that enable evolution of your solution over time." I think
| this is why bad things like json or xml do so well: they
| work fine with a client dev saying, "I need this extra
| data" and the server dev adding it, and then the client dev
| consuming it.
|
| The more modern approaches, like protobuf or capn proto are
| designed with the experience of mutating protocols over
| time.
|
| It works pretty well too unless the new field changes the
| semantics of old field values, e.g. adding a field
| "payment_is_reversal_if_set" to a payment info type, which
| would change the meaning of the signs of the amounts. In
| that case, you have to reason more explicitly about when to
| roll out the protocol readers and when to roll out the
| protocol writers. Or version it, etc.
| insanitybit wrote:
| > Can't we extend this argument to eliminating basically all
| static typing?
|
| No, because static typing exists in all sorts of places. This
| argument is primarily about cases where you're _exchanging
| data_, which is a very specific use case.
| 3cats-in-a-coat wrote:
| Static types are a partial application/reduction when
| certain mutable or unknown variables become constants (i.e.
| "I for sure only need integers between 0-255 here").
|
| I'm not rejecting static types entirely, and yes I was
| discussing exchanging data here, as Alan Kay's OOP is
| inherently distributed. It's much closer to Erlang than it
| is to Java.
| insanitybit wrote:
| > I'm not rejecting static types entirely, and yes I was
| discussing exchanging data here
|
| OK I guess I'm having a hard time reconciling that with:
|
| > basically all static typing
| 3cats-in-a-coat wrote:
| Sorry, I see how I'm vague. The idea is you have no "pre-
| burned" static types, but dynamic types. And static types
| then become a disposable optimization compiled out of
| more dynamic code, in the same way JIT works in V8 and
| JVM for example (where type specialization is in fact
| part of the optimization strategy).
| insanitybit wrote:
| You're describing dynamic types
| 3cats-in-a-coat wrote:
| But with the benefit of static types, and without the
| drawbacks of static types.
| insanitybit wrote:
| No. "Types only known at runtime" are dynamic types. "And
| also you can optimize by examining the types at runtime"
| is just dynamic types. And it does not have the benefit
| of static types because it is dynamic types.
| 3cats-in-a-coat wrote:
| This is devolving into a "word definition war" so I'll
| leave aside what you call static types and dynamic types
| and get down to specifics. Type info is available in
| these flavors, relative to runtime:
|
| 1. Type info which is available before runtime, but not
| at runtime (compiled away).
|
| 2. Type info which is available at runtime, but not at
| compile time (input, statistics, etc.).
|
| 3. Type info which is available both at compile time and
| runtime (say like a Java class).
|
| When you have a JIT optimizer that can turn [3] and [2]
| into [1], there's no longer a reason to have [1], except
| if you're micro-optimizing embedded code for some device
| with 64kb RAM or whatever. We've carried through legacy
| practices, and we don't even question them, and try to
| push them way out of their league into large-scale
| distributed software.
|
| When I say we don't need [1], this doesn't mean I deny
| [3], which is still statically analyzable type
| information. It's static types, but without throwing away
| flexibility and data at runtime, that doesn't need to be
| thrown away.
| insanitybit wrote:
| Short of time travel one cannot turn (3) or (2) into
| (1). I'm not sure where the confusion here is or what
| you're advocating for because this isn't making sense to
| me.
|
| > there's no longer a reason to have [1]
|
| I guess if you're assuming the value of static types is
| just performance? But it's not, not by a long shot -
| hence 'mypy', a static typechecker that in no way impacts
| runtime.
|
| I think this conversation is a bit too confusing for me
| so I'm gonna respectfully walk away :)
| 3cats-in-a-coat wrote:
| The confusion is to assume "runtime" is statically
| defined. JIT generates code which omits type information
| that's determined not to be needed in the context of the
| compiled method/trace/class/module. That code still
| "runs" it's "runtime".
| insanitybit wrote:
| Yes, the types that JIT omits are dynamic types.
| [deleted]
| klabb3 wrote:
| To elaborate on your point:
|
| Static type systems in programming languages are _designed_
| to break at _compilation-time_. The reason this works is
| because all users are within the same "program unit", on
| the same version.
|
| In other words, static typing allows more validation to be
| automated, and removes the need for multiple simultaneous
| versions, but assumes that the developer has access and
| ability to change _all other users_ at the same "time" of
| their own change.
|
| I find this whole topic fascinating. It seems like
| programmers are limited to an implicit understanding of
| these differences but it's never formalized (or even
| properly conceptualized). Thus, our intuition often fails
| with complex systems (eg multiple simultaneous versions,
| etc). Case in point: even mighty Google distinguished
| engineers made this "billion-dollar mistake" with required
| fields, even though they had near-perfect up-front
| knowledge of their planned use-cases.
| lanstin wrote:
| No one has near-perfect up-front knowledge of a software
| system designed to change and expand. The solution space
| is too large and the efficient delivery methods are a
| search thru this space.
| klabb3 wrote:
| I may have phrased it poorly. What I should have said is
| that Google absolutely could have "anticipated" that many
| of their subsystems would deal with partial messages and
| multiple versions, because they most certainly already
| did. The designers would have maintained, developed and
| debugged exactly such systems for years.
| lanstin wrote:
| Makes sense: they knew arbitrary mutability was a
| requirement but did not think it thru for the required
| keyword.
| mike_hearn wrote:
| It's actually the opposite. The billion dollar mistake is
| to have pervasive implicit nullability, not to have the
| concept of optionality in your type system. Encoding
| optionality in the type system and making things required
| by default is usually given as the _fix_ for the billion
| dollar mistake.
| klabb3 wrote:
| Huh? Did you read the link, from the guy who was there
| during the major failure at Google that led to proto3
| being redesigned without that flaw?
|
| The whole lesson is that you _can't_ apply the lessons
| from static type systems in PLs when you have multiple
| versions and fragmented validation across different
| subsystems. Counter-intuitively! Everyone thought it was
| a good idea, and it turned out to be a disaster.
| [deleted]
| 3cats-in-a-coat wrote:
| It remains a big asterisk to me why some random
| middleware was validating an end-to-end message between two
| systems instead of treating it as just an opaque
| message.
|
| Why are we not having this debate about "everything must
| be optional" for Internet Packets (IP) for example?
| Because it's just a binary payload. If you want to ensure
| integrity you checksum the payload.
| klabb3 wrote:
| Things like distributed tracing, auth data, metrics,
| error logging messages and other "meta-subsystems" are
| certainly typical use cases. Reverse proxies and other
| http middleware do exactly this with http headers all the
| time.
| mike_hearn wrote:
| I did read the link and I was at Google at the time
| people started arguing for that. With respect, I think
| the argument was and still is incorrect, that the wrong
| lessons were drawn and that proto3 is worse than proto2.
| hgsgm wrote:
| OK, what do you do when a message comes in missing a
| field? Crash the server?
| klabb3 wrote:
| Alright, fair enough. Apologies for the dismissive tone.
| Could you elaborate (or point to) these wrong lessons or
| an alternative?
| chubot wrote:
| See my sibling comment, e.g. with respect to Rich Hickey's
| framing - https://news.ycombinator.com/item?id=36911033
| kccqzy wrote:
| Do you need to make different versions of a program exchange
| information even though they do not agree on the types? No?
| Then this argument cannot be extended this way.
| mrkeen wrote:
| It's up to you.
|
| It's easy to imagine any statically typed language having a
| general-purpose JSON type. You could imagine all functions
| accepting and returning such objects.
|
| Now it's your turn to implement the sum(a,b) function. Would
| you like to allow the caller to pass anything in as a and b?
| lanstin wrote:
| This is like when people use protobuf to send a list of
| key-value mappings, and call that a protocol. (I've seen
| that same design in many protocol description arenas, even
| SQL database schemas that are just (entityId INT, key CLOB,
| value BLOB).)
| CodesInChaos wrote:
| I find it surprising how few protocols (besides Cap'n Proto) have
| promise pipelining. The only other example I can think of is 9p,
| but that's not a general purpose protocol.
|
| https://capnproto.org/news/2013-12-13-promise-pipelining-cap...
| mananaysiempre wrote:
| There is also CapnP's moral ancestor CapTP[1]/VatTP aka
| Pluribus developed to accompany Mark Miller's E language (yes,
| it's a pun, there is also a gadget called an "unum" in there).
| For deeper genealogy--including a reference to Barbara Liskov
| for promise pipelining and a number of other relevant ideas in
| the CLU extension Argus--see his thesis[2].
|
| (If I'm not misremembering, Mark Miller later wrote the promise
| proposal for JavaScript, except the planned extension for RPC
| never materialized and instead we got async/await, which don't
| seem compatible with pipelining.)
|
| The more recent attempts to make a distributed capability
| system in the image of E, like Spritely Goblins[3] and the
| OCapN effort[4], also try for pipelining, so maybe if you hang
| out on cap-talk[5] you'll hear about a couple of other
| protocols that do it, if not ones with any real-world usage.
|
| (And I again reiterate that, neat as it is, promise pipelining
| seems to require programming with actual explicit promises, and
| at this point it's well-established how gnarly that can get.)
|
| One idea that I find interesting and little-known from the
| other side--event loops and cooperatively concurrent "active
| objects"--is "causality IDs"[6] from DCOM/COM+ as a means of
| controlling object reentrancy, see
| CoGetCurrentLogicalThreadId[7] in the Microsoft documentation
| and the discussion of CALLTYPE_TOPLEVEL_CALLPENDING in
| _Effective COM_ [8]--I think they later tried to sell this as a
| new feature in Win8/UWP's ASTAs[9]?
|
| [1] http://erights.org/elib/distrib/captp/index.html
|
| [2] http://erights.org/talks/thesis/index.html
|
| [3] https://spritely.institute/goblins/
|
| [4] https://github.com/ocapn/ocapn
|
| [5] https://groups.google.com/g/captalk/
|
| [6]
| https://learn.microsoft.com/openspecs/windows_protocols/ms-d...
|
| [7]
| https://learn.microsoft.com/windows/win32/api/combaseapi/nf-...
|
| [8] https://archive.org/details/effectivecom50wa00boxd/page/150
|
| [9]
| https://devblogs.microsoft.com/oldnewthing/20210224-00/?p=10...
| jayd16 wrote:
| As neat as it is I guess it's hard to optimize the backend for it
| compared to explicitly grouping the queries. I imagine a
| bespoke RPC call that results in a single SQL query is better
| than several pipelined but separate RPC calls, for example.
|
| But even still, you would think it would be more popular.
| kentonv wrote:
| If you're thinking strictly about stateless backends that
| just convert every request into a SQL query, then yeah,
| promise pipelining might not be very helpful.
|
| I think where it shines is when interacting with stateful
| services. I think part of the reason everyone tries to make
| everything stateless is because we don't have good protocols
| for managing state. Cap'n Proto RPC is actually quite good at
| it.
| cmrdporcupine wrote:
| The issue is that having per-session/transaction state on
| the server makes load balancing requests more difficult;
| especially when that state is long-lived.
| kentonv wrote:
| While it's true that load-balancing long-lived resources
| is harder than short-lived, a lot of the difficulty of
| load balancing with stateful servers is actually in the
| protocol, because you somehow have to make sure
| subsequent requests for the same state land on the
| correct server.
|
| Cap'n Proto actually does really well with this basic
| difficulty, because it treats object references as a
| first-class thing. When you create some state, you
| receive back a reference to the state, and you can make
| subsequent requests on that reference. The load balancer
| can _see_ that this has happened, even if it doesn't
| know the details of the application, because object
| references are marked as such at the RPC layer
| independent of schema. Whereas in a system that returns
| some sort of "object ID" as a string, and expects you to
| pass that ID back to the server on subsequent requests,
| the load balancer is not going to have any idea what's
| going on, unless you do extra work to teach the load
| balancer about your protocol.
| dontlaugh wrote:
| io_uring supports that too, although not a network protocol.
| mananaysiempre wrote:
| Last time I checked (a couple of years ago) they wanted to
| use eBPF to handle this sort of problem. Did they end up
| doing something simpler?
| dontlaugh wrote:
| Yes. io_uring lets you issue multiple syscalls together,
| with the result from some being parameters for others.
| giovannibonetti wrote:
| Redis transactions [1] also apply pipelining, but AFAICT there
| is no practical way to use them for implementing generic RPC.
|
| [1] https://redis.com/ebook/part-2-core-
| concepts/chapter-4-keepi...
| ackfoobar wrote:
| Does the pipelining in Redis allow you to have the second
| command depend on the result of the first command?
| byroot wrote:
| No, but for that use case you have EVAL which execute an
| arbitrary lua script on the server.
|
| https://redis.io/commands/eval/
| jauntywundrkind wrote:
| That assumes you know & can generate the complete
| pipeline ahead of time. The elegance of promise
| pipelining is that your pipeline can also be
| asynchronously grown.
| dan-robertson wrote:
| Without knowing how exactly capnproto promise pipelining works,
| when I thought about it, I was concerned about cases like
| reading a directory and stating everything in it, or getting
| back two response values and wanting to pass only one to the
| next call. The latter could be made to work, I guess, but the
| former depends on, e.g., the number of values in the result list.
| kentonv wrote:
| In the actual implementation, when making a pipelined call on
| a result X, you actually say something like "Call
| X.foo.bar.baz()" -- that is, you can specify a nested
| property of the results which names the object that you
| actually want to operate on.
|
| At present, the only operation allowed here is reading a
| nested property, and that seems to solve 99% of use cases.
| But one could imagine allowing other operations, like "take
| the Nth element of this array" or even "apply this call to
| _all_ elements in the array, returning an array of results".
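|
| In C++ that looks roughly like this (the interfaces are
| hypothetical, but send(), the pipelined accessors on the
| returned RemotePromise, and wait() are the real API shape):
|
|     // getUser returns (user :User); we invoke a method on that
|     // user before the first result has even come back.
|     auto userPromise = service.getUserRequest().send();
|     auto namePromise =
|         userPromise.getUser().getNameRequest().send();
|     auto name = namePromise.wait(waitScope).getName();
|     // One network round trip for both calls.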
| catern wrote:
| I didn't know 9p had promise pipelining!
|
| Or more specifically, it seems to have client-chosen file
| descriptors, so the client can open a file, then immediately
| send a read on that file, and if the open fails, the read will
| also fail (with EBADF). Awesome!
|
| This is great, but "promise pipelining" also needs support in
| the client. Are there 9p clients which support promise
| pipelining? For example, if the user issues several walks,
| are they all sent before waiting for the reply to the first
| walk?
|
| Also, it only has promise pipelining for file descriptors. That
| gives you a lot, definitely, but if for example you wanted to
| read every file in a directory, you'd want to be able to issue
| a read and then walk to the result of that read. Which 9p
| doesn't seem to support. (I actually support this in my own
| remote syscall protocol library thing, rsyscall :) )
| shdh wrote:
| We have a great plethora of binary serialization libraries now,
| but I've noticed none of them offer the following:
|
| * Specification of the number of bits I want to cap a field at
| during serialization, i.e. an `int` that only uses 3 bits.
|
| * Delta encoding for serialization and deserialization, this
| would further decrease the size of each message if there is an
| older message that I can use as the initial message to delta
| encode/decode from.
| no_circuit wrote:
| Take a look at FAST protocol [1]. It has been around for a
| while. It was created for market/trading data. There appear to be
| some open source implementations, but I don't think in general
| they'd be maintained well since trading is, well, secretive.
|
| [1] https://en.wikipedia.org/wiki/FAST_protocol
| jeffbee wrote:
| > `int` that only uses 3 bits.
|
| CBOR approximates this, since it has several different widths
| for integers.
|
| > an older message that I can use as the initial message to
| delta encode/decode from.
|
| General-purpose compression on the encoded stream would do
| something toward this goal, but some protocol buffers library
| implementations offer merge functions. The question is what
| semantics of "merge" you expect. For repeated fields do you
| want to append or clobber?
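|
| For what it's worth, the real C++ API (Message::MergeFrom)
| answers it this way: singular fields from the source overwrite,
| repeated fields append. A sketch with a hypothetical message
| type:
|
|     MyMessage base, update;  // has: repeated string tags = 1;
|     base.add_tags("a");
|     update.add_tags("b");
|     base.MergeFrom(update);  // base.tags() == ["a", "b"]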
| IshKebab wrote:
| Most formats use varints, so you can't have a 3-bit int but
| they will store a 64-bit int in one byte if it fits. Going to
| smaller than a byte isn't worth the extra complexity and
| slowness. If you're _that_ space sensitive you need to add
| proper compression.
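|
| For reference, a minimal sketch of the usual protobuf-style
| varint: 7 bits of payload per byte, high bit set on every byte
| but the last, so small values cost one byte and a full 64-bit
| value costs ten.
|
|     #include <cstdint>
|     #include <vector>
|
|     std::vector<uint8_t> encodeVarint(uint64_t value) {
|       std::vector<uint8_t> out;
|       while (value >= 0x80) {
|         out.push_back(uint8_t(value) | 0x80);  // more to come
|         value >>= 7;
|       }
|       out.push_back(uint8_t(value));  // last byte, high bit clear
|       return out;
|     }
|     // encodeVarint(5)   == {0x05}
|     // encodeVarint(300) == {0xAC, 0x02}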
|
| By delta compression you mean _across messages_? Yeah I've
| never seen that but it's hard to imagine a scenario where it
| would be useful and worth the insane complexity.
| wichert wrote:
| zserio [1] has the former at least. It isn't intended for the
| same use cases as protobuf/capnproto/flatbuffers though; in
| particular it has no backward or forward compatibility. But
| it's great for situations where you know exactly what software
| is used on both ends and you need small data and fast
| en-/decoding.
|
| [1] http://zserio.org/doc/ZserioLanguageOverview.html#bit-
| field-...
| emtel wrote:
| I used capnproto years ago as the network serialization format
| for a multiplayer RTS game. Although the API can be quite
| awkward, it was overall a joy to use and I wish I was able to use
| it in more projects.
| insanitybit wrote:
| Any plans to improve the Rust side of things? The API could
| definitely use some more work/ docs around it.
| dwrensha wrote:
| I intend to continue work on capnproto-rust, at my own pace and
| according to my own priorities.
|
| Are there any particular pain points that you want to call
| attention to?
| vicaya wrote:
| Workers should really adopt Apache Arrow, which has a much bigger
| ecosystem.
|
| https://github.com/apache/arrow
| unixhero wrote:
| Congratulations to kentonv and the team
| s17n wrote:
| It's a testament to the subtlety of software engineering that
| even after four tries (protobuf 1-3, capn proto 1) there are
| still breaking changes that need to be made to the solution of
| what on the surface appears to be a relatively constrained
| problem.
| kentonv wrote:
| Of course, nothing is ever "solved". :)
|
| I assume you are talking about the cancellation change. This is
| interesting, actually. When originally designing Cap'n Proto, I
| was convinced by a capabilities expert I talked to that
| cancellation should be considered dangerous, because software
| that isn't expecting it might be vulnerable to attacks if
| cancellation occurs at an unexpected place. Especially in a
| language like C++, which lacks garbage collection or borrow
| checking, you might expect use-after-free to be a big issue. I
| found the argument compelling.
|
| In practice, though, I've found the opposite: In a language
| with explicit lifetimes, and with KJ's particular approach to
| Promises (used to handle async tasks in Cap'n Proto's C++
| implementation), cancellation safety is a natural side-effect
| of writing code to have correct lifetimes. You have to make
| cancellation safe because you have to cancel tasks all the time
| when the objects they depend on are going to be destroyed.
| Moreover, in a fault-tolerant distributed system, you have to
| assume any code might not complete, e.g. due to a power outage
| or maybe just throwing an unexpected exception in the middle,
| and you have to program defensively for that anyway. This all
| becomes second-nature pretty quick.
|
| So all our code ends up cancellation-safe by default. We end up
| with way more problems from cancellation unexpectedly being
| prevented when we need it, than happening when we didn't expect
| it.
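|
| A tiny sketch of what that looks like in practice (dropping a
| kj::Promise really does cancel the task; the helpers here are
| hypothetical):
|
|     {
|       kj::Promise<void> task = connection.pump()
|           .then([&cache]() { cache.markDone(); });
|     }
|     // `task` went out of scope un-awaited: the pump is
|     // canceled, destructors run, and the continuation never
|     // touches `cache`.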
|
| EDIT: Re-reading, maybe you were referring to the breaking
| changes slated for 2.0. But those are primarily changes to the
| KJ toolkit library, not Cap'n Proto, and are all about API
| design... I'd say API design is not a constrained problem.
| [deleted]
| dannyobrien wrote:
| I'm excited by Cap'n Proto's participation in the OCAPN
| standardization effort. Can you speak to if that's going to be
| part of the Cap'n Proto 2.0 work?
|
| https://github.com/ocapn/ocapn
| kentonv wrote:
| Sadly, the person leading that participation, Ian "zenhack"
| Denhardt, recently and unexpectedly passed away.
|
| For my part, I'm a fan of OCapN, but I am not sure how much
| time I can personally commit to it, with everything on my
| plate.
|
| I wish I had better news here. This was a tragic loss for all
| of us.
| bryanlarsen wrote:
| The guy making Tempest? That is tragic and deserves an HN
| front page. Is there an obituary or good link you can submit
| and we can upvote? Or would Ian have preferred not to have
| such?
| [deleted]
| omginternets wrote:
| Indeed, Ian was the driving force behind Tempest. I am not
| aware of any obituary.
| LukeShu wrote:
| I would have sworn that HN is where I saw
| https://staticfree.info/ian/ but at the time I didn't
| realize why it was timely.
|
| The best link I know is Christine Lemmer-Webber's
| post: https://octodon.social/@cwebber/110712988569475393
| the_common_man wrote:
| That's tragic. Wasn't he also helping maintain Sandstorm?
| Would appreciate a blog post or a note about him there.
| kentonv wrote:
| Yes, he was the most active developer over the last few
| years, although it wasn't a huge amount of activity. And
| that activity had dropped off this year as Ian shifted his
| focus to Tempest, a mostly-from-scratch rewrite.
| https://github.com/zenhack/tempest
|
| For my part I stopped pushing monthly Sandstorm updates
| this year as there hasn't really been anything to push.
| Unfortunately Sandstorm's biggest dependencies can't even
| be updated anymore because of breaking changes that would
| take significant effort to work around.
|
| I agree a blog post is probably in order.
| nyanpasu64 wrote:
| Looking at https://sandstorm.io/news/2014-08-19-why-not-
| run-docker-apps from 9 years ago, it seems you think that
| Docker was/is poorly suited for individual user
| administration. Since the Internet has started
| enshittifying in the last few years, I've been turning to
| self-hosting apps, but this requires researching distro-
| specific instructions for each new app I install. I got
| burned when Arch updated their Postgres package, which
| broke compatibility with the old database format until I
| ran a manual upgrade script that required troubleshooting
| several errors along the way. (In
| hindsight, I should've picked a fixed-release distro like
| Debian or something.)
|
| Would you say there are now user-friendly Docker/etc.
| wrappers for self-hosting LAN services? Someone has
| recommended CasaOS (or the proprietary Umbrel), though I
| haven't tried either yet.
| ocdtrekkie wrote:
| I have thought a bit on this, but I have also pretty much
| just been recovering lately. We will probably need
| to assemble a blog post at some point soonish, but we
| need to talk to a few people first.
| hgsgm wrote:
| Who are you in this? Your profile just shows a username.
| ocdtrekkie wrote:
| I'm part of Sandstorm's community team and the maintainer
| of some apps and tools relating to the project.
| kentonv wrote:
| Can confirm, ocdtrekkie is one of Sandstorm's most active
| maintainers over the last few years.
| omginternets wrote:
| Kenton, I'm @lthibault on GitHub. I was working closely with
| Ian on the Go capnp implementation, and I would be happy to
| take over this initiative. Can you point me in the right
| direction?
|
| Also, are you on Matrix or Telegram or something of the sort?
| I was hoping I could ping you with the occasional question as
| I continue work on go-capnp.
| omginternets wrote:
| I have some very unfortunate news to share with the Cap'n Proto
| and Sandstorm communities.
|
| Ian Denhardt (zenhack on HN), a lead contributor to the Go
| implementation, suddenly and unexpectedly passed away a few weeks
| ago. Before making a request to the community, I want to express
| how deeply saddened I am by this loss. Ian and I collaborated
| extensively over the past three years, and we had become friends.
|
| As the _de facto_ project lead, it now falls to me to fill Ian's
| very big shoes. Please, if you're able to contribute to the
| project, I could really use the help. And if you're a contributor
| or maintainer of some other implementation (C++, Rust, etc.), I
| would *REALLY* appreciate it if we could connect. I'm going to
| need to surround myself with very smart people if I am to
| continue Ian's work.
|
| RIP Ian, and thank you. I learned so much working with you.
|
| ------
|
| P.S: I can be reached in the following places
|
| - https://github.com/lthibault
|
| - https://matrix.to/#/#go-capnp:matrix.org
|
| - Telegram: @lthibault
|
| - gmail: louist87
| jcalabro wrote:
| Oh gosh, I didn't know that. Thank you for sharing :( I really
| loved his blog. That's awful.
| omginternets wrote:
| Indeed. His blog was outstanding. It might be a good idea to
| mirror his blog before the domain expires, as it's a real
| treasure-trove.
| ocdtrekkie wrote:
| I already have a copy (it's a static site). I can rehost it
| if needed, but I'd want to get permission to do so from
| someone first.
| LukeShu wrote:
| Christine Lemmer-Webber (https://dustycloud.org/contact/)
| may be able to put you in touch with Ian's surviving
| partner.
| omginternets wrote:
| I'm able to connect you with Ian's partner, if you'd
| like.
| freedomben wrote:
| I've had a couple people suddenly taken from me, and it is soul
| crushing. Every time it happens it reminds me of how fragile
| life is, and how quickly things can change. I've started trying
| to enjoy the small things in life more, and while I don't
| neglect the future, I also try to enjoy the present.
|
| He has left an amazing legacy that has touched a lot of people.
| RIP Ian.
| omginternets wrote:
| Every damn day, something comes up that makes me go, "Oh, I
| should ask Ian about tha-"
|
| It really sucks. And I know exactly what you mean.
| pja wrote:
| It seems @zenhack maintained the Haskell bindings as well.
| omginternets wrote:
| I think the Haskell project was a complete implementation,
| not just bindings. But yes, Ian was truly prolific.
| hgsgm wrote:
| Tell me about your uses of capn proto.
| bkiran wrote:
| I'm using Cap'n Proto for serialization in a message broker
| application (LucidMQ) I'm building. It has allowed me to create
| client applications rather quickly. There are some quirks that
| can be difficult to wrap your head around, but once you
| understand them it is really solid.
|
| There are some differences between the language libraries,
| and documentation can be lacking around those language-
| specific solutions. I'm hoping to add blog articles and/or
| contribute back to the examples in these repositories to
| help future users who want to dabble.
|
| Check out my repo here for how I use it across Rust and Python,
| with Golang coming soon: https://github.com/lucidmq/lucidmq
| [deleted]
| IshKebab wrote:
| Great achievement. To be honest I wouldn't recommend Capnp. The
| C++ API is very awkward.
|
| The zero-copy parsing is less of a benefit than you'd expect -
| it's pretty unlikely you're going to want to keep your data as a
| Capnp data structure because of how awkward it is to use. 99% of
| the time you'll just copy it into your own data structures anyway.
|
| There's also more friction with the rest of the world which has
| more or less settled on Protobuf as the most popular binary
| implementation of this sort of idea.
|
| I only used it for serialisation. Maybe the RPC stuff is more
| compelling.
|
| I really wish Thrift had taken off instead of Protobuf/gRPC. It
| was so much better designed and more flexible than anything I've
| seen before or since. I think it died mainly due to terrible
| documentation. I guess it also didn't have a big name behind it.
| Rapzid wrote:
| I find MessagePack to be pretty great if you don't need a
| schema. JSON serialization is unreasonably fast in V8, though,
| and even MessagePack can't beat it there; it's often faster in
| other languages and saves on bytes.
| alfalfasprout wrote:
| Except messagepack is really slow...
| Rapzid wrote:
| Is it? It's quite fast in DotNet...
| insanitybit wrote:
| Historically I've heard and also experienced JSON +
| gzip/zstd to be faster and smaller than msgpack.
| coder543 wrote:
| I do not agree that JSON is faster.
|
| Encoding JSON or MessagePack will be about the same
| speed, although I would expect MessagePack to be
| marginally faster from what I've seen over the years.
| It's easy to encode data in most formats, compression
| excluded.
|
| Parsing is the real problem with JSON, and no, it isn't
| even close. MessagePack knows the length of every field,
| so it is extremely fast to parse, an advantage that grows
| rapidly when large strings are a common part of the data
| in question. I love the simple visual explanation of how
| MessagePack works here: https://msgpack.org/
|
| Anyone who has written parsing code can instantly
| recognize what makes a format like this efficient to
| parse compared to JSON.
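|
| A toy illustration of that framing (0xd9 is the real str8 tag
| from the MessagePack spec; this is a sketch, not a full
| parser):
|
|     #include <cstdint>
|     #include <string_view>
|
|     // A str8 value is [0xd9][len][bytes]: one bounds check and
|     // one pointer bump. No scanning for a closing quote, no
|     // escape handling.
|     std::string_view readStr8(const uint8_t* p) {
|       uint8_t len = p[1];
|       return {reinterpret_cast<const char*>(p + 2), len};
|     }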
|
| With some seriously wild SIMD JSON parsing libraries, you
| can get _closer_ to the parsing performance of a format
| like MessagePack, but I think it is physically impossible
| for JSON to be faster. You simply have to read every byte
| of JSON one way or another, which takes time. You also
| don't have any ability to pre-allocate for JSON unless
| you do two passes, which would be expensive to do too.
| You have no idea how many objects are in an array, you
| have no idea how long a string will be.
|
| MessagePack objects are certainly smaller than JSON but
| larger than compressed JSON. Even compressed MessagePack
| objects are larger than the equivalent compressed JSON,
| in my experience, likely because the field length
| indicators add a randomness to the data that makes
| compression less effective.
|
| For applications where you need to handle terabytes of
| data flowing through a pipeline every hour, MessagePack
| can be a huge win in terms of cost due to the increased
| CPU efficiency, and it's a much smaller lift to switch to
| MessagePack from JSON than to switch to something
| statically typed like Protobuf or CapnProto, just due to
| how closely MessagePack matches JSON. (But, if you _can_
| switch to Protobuf or CapnProto, those should yield
| similar and perhaps even modestly better benefits.)
|
| Compute costs are much higher than storage costs, so I
| would happily take a small size penalty if it reduced my
| CPU utilization by a large amount, which MessagePack
| easily does for applications that are very data-heavy.
| I'm sure there is at least one terribly slow
| implementation of MessagePack out there somewhere, but
| most of them seem quite fast compared to JSON.
|
| Some random benchmarks in Go:
| https://github.com/shamaton/msgpack#benchmark
|
| Also take note of the "ShamatonGen" results, which use
| codegen before compile time to do things even more
| efficiently for types known ahead of time, compared to
| the normal reflection-based implementation. The "Array"
| results are a weird version that isn't strictly
| comparable: the encoding and decoding steps assume that
| the fields are in a fixed order, so the encoded data is
| just arrays of values, and no field names. It can be
| faster and more compact, but it's not "normal"
| messagepack.
|
| I've personally seen crazy differences in performance vs
| JSON.
|
| If you're not handling a minimum of terabytes of JSON per
| day, then the compute costs from JSON are probably
| irrelevant and not worth thinking too hard about, but
| there can be other benefits to switching away from JSON.
| c-cube wrote:
| Why don't you compare JSON + gzip to msgpack + gzip?
| That'd be a more fair comparison.
| crabmusket wrote:
| It depends on your data. We ran comparisons on objects
| with lots of numbers and arrays (think GeoJSON) and
| messagepack came out way ahead. Of course, something like
| Arrow may have fared even better with its focus on
| columnar data, but we didn't want to venture that far
| afield just yet.
| nvarsj wrote:
| fbthrift is still alive and kicking.
| kentonv wrote:
| I do agree that the API required for zero-copy turns out to be a
| bit awkward, particularly on the writing side. The reading side
| doesn't look much different. Meanwhile zero-copy is really only
| a paradigm shift in certain scenarios, like when used with
| mmap(). For network communications it doesn't change much
| unless you are doing something hardcore like RDMA. I've always
| wanted to add an optional alternative API to Cap'n Proto that
| uses "plain old C structures" (or something close to it) with
| one-copy serialization (just like protobuf) for the use cases
| where zero-copy doesn't really matter. But haven't gotten
| around to it yet...
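|
| For the curious, the writing-side style in question (real capnp
| API; `Point` stands in for a hypothetical generated schema
| type):
|
|     #include <capnp/message.h>
|     #include <capnp/serialize-packed.h>
|
|     capnp::MallocMessageBuilder message;
|     auto point = message.initRoot<Point>();  // arena-allocated
|     point.setX(1.0);  // setters write straight into the
|     point.setY(2.0);  // wire-format buffer, hence the shape
|     capnp::writePackedMessageToFd(1, message);  // e.g. stdout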
|
| That said I personally have always been much more excited about
| the RPC protocol than the serialization. I think the RPC
| protocol is actually a paradigm shift for almost any non-
| trivial use case.
| foobiekr wrote:
| One thing about google proto is that, at least in many
| languages, every message throws off a ton of garbage that
| stresses the GC. On the send side, you can obviously re-use
| objects, but on the receive side you can't.
| haberman wrote:
| More and more languages are being built on top of the "upb"
| C library for protobuf
| (https://github.com/protocolbuffers/upb) which is designed
| around arenas to avoid this very problem.
|
| Currently Ruby, PHP, and Python are backed by upb.
|
| Disclosure: I work on the protobuf team, and created the
| upb library.
| tignaj wrote:
| This is also because Google's Protobuf implementations
| aren't doing a very good job of avoiding unnecessary
| allocations. Gogoproto is better, and it is possible to do
| better still; here is an example prototype I have put
| together for Go (even if you do not use the laziness part,
| it is still much faster than Google's implementation):
| https://github.com/splunk/exp-lazyproto
| cmrdporcupine wrote:
| I've always been excited about zero copy messages in the
| context of its potential in database systems; the thought of
| tuples working their way all the way from btree nodes in a
| pager, to query results on the network without copies seems
| fantastic.
|
| But every time I've tried to prototype or implement around
| this model I've run into conceptual blocks. It's a tricky
| paradigm to fully wrap one's head around, and to squeeze into
| existing toolsets.
| mgaunard wrote:
| You mean flatbuffers, not protobuf.
|
| It has established itself as the de-facto standard, with a few
| other places using SBE instead.
|
| In any case the main problems with binary serialization are:
|
| - schemas and message version management
|
| - delta-encoding
|
| If you ignore these, flat binary serialization is trivial.
|
| No library provides a good solution that covers the two points
| above.
| dtech wrote:
| Protobuf is very widely used, I just had to Google
| flatbuffers...
| kentonv wrote:
| What part of the industry are you in where flatbuffers is
| seen as the de facto standard? Personally I've never randomly
| encountered a project using flatbuffers. I see protobuf all
| the time.
|
| (I've randomly run into Cap'n Proto maybe 2-3 times but to be
| fair I'm probably more likely to notice that.)
| ynx wrote:
| As of the last time I was close, flatbuffer usage is or was
| near-ubiquitous in FB's (ha ha, go figure) mobile apps,
| across Android and iOS at least.
| cmrdporcupine wrote:
| Flatbuffers seems to have penetration in the games
| industry. And it sounds like from other posters that
| Facebook uses it.
|
| I recently started a job doing work on autonomy systems
| that run in tractors, and was surprised to see we use it
| (flatbuffers) in the messaging layer (in both C++ and Rust)
| Timothycquinn wrote:
| Congrats on 10 years! Question: Can Cap'n Proto be used as an
| alternative to Python Pickle library for serializing and de-
| serializing python object structures?
| kentonv wrote:
| If your goal is to serialize an arbitrary Python object, Pickle
| is the way to go. Cap'n Proto requires you to define a schema,
| in Cap'n Proto schema language, for whatever you want to
| serialize. It can't just take an arbitrary Python value.
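|
| For a sense of the difference, a minimal hypothetical schema:
|
|     struct Point {
|       x @0 :Float64;
|       y @1 :Float64;
|     }
|
| With pycapnp you'd then load it with something like
| capnp.load("point.capnp") and build messages against that
| schema, rather than pickling arbitrary objects.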
| jFriedensreich wrote:
| I love how the main reference for workerd can be just one capnp
| file.
|
| https://github.com/cloudflare/workerd/blob/main/src/workerd/...
|
| This changed how I think about computing on the web.
|
| If there were a JS library as good as the Lua one, you could
| directly send capnp messages to workerd instead of always going
| through files. I guess one day I'll have to relearn C++ and
| understand how the internals actually work.
| [deleted]
| maccam912 wrote:
| If any cloudflare employees end up here who helped decide on Capn
| Proto over other stuff (e.g. protobuf), what considerations went
| into that choice? I'm curious if the reasons will be things
| important to me, or things that you don't need to worry about
| unless you deal with huge scale.
| coolsunglasses wrote:
| I don't work at Cloudflare but follow their work and
| occasionally work on performance sensitive projects.
|
| If I had to guess, they looked at the landscape a bit like I do
| and regarded Cap'n Proto, flatbuffers, SBE, etc. as being in
| one category apart from other data formats like Avro, protobuf,
| and the like.
|
| So once you're committed to record'ish shaped (rather than
| columnar like Parquet) data that has an upfront parse time of
| zero (nominally, there could be marshalling if you transmogrify
| the field values on read), the list gets pretty short.
|
| https://capnproto.org/news/2014-06-17-capnproto-flatbuffers-...
| goes into some of the trade-offs here.
|
| Cap'n Proto was originally made for https://sandstorm.io/. That
| work (which Kenton has presumably done at Cloudflare since he's
| been employed there) eventually turned into Cloudflare workers.
|
| Another consideration:
| https://github.com/google/flatbuffers/issues/2#issuecomment-...
| hgsgm wrote:
| Aside from CF Workers using capn proto, how is it related to
| capn proto or sandstorm?
| kentonv wrote:
| They are all projects I started.
|
| But other than who worked on them, and sharing some
| technology choices under the hood, there's mostly no
| relationship between Workers and Sandstorm.
| hblanks wrote:
| To summarize something from a little over a year after I joined
| there: Cloudflare was building out a way to ship logs from its
| edge to a central point for customer analytics and serving logs
| to enterprise customers. As I understood it, the primary
| engineer who built all of that out, Albert Strasheim,
| benchmarked the most likely serialization options available and
| found Cap'n Proto to be appreciably faster than protobuf. It
| had a great C++ implementation (which we could use from nginx,
| IIRC with some lua involved) and while the Go implementation,
| which we used on the consuming side, had its warts, folks were
| able to fix the key parts that needed attention.
|
| Anyway. Cloudflare's always been pretty cost-efficient machine-
| wise, so it was a natural choice given the performance needs we
| had. In my time in the data team there, Cap'n Proto was always
| pretty easy to work with, and sharing proto definitions from a
| central schema repo worked pretty well, too. Thanks for your
| work, Kenton!
| kentonv wrote:
| Here's a blog post about Cloudflare's use of Cap'n Proto in
| 2014, three years before I joined:
| https://blog.cloudflare.com/introducing-lua-capnproto-better...
|
| To this day, Cloudflare's data pipeline (which produces logs
| and analytics from the edge) is largely based on Cap'n Proto
| serialization. I haven't personally been much involved with
| that project.
|
| As for Cloudflare Workers, of course, I started the project, so
| I used my stuff. Probably not the justification you're looking
| for. :)
|
| That said, I would argue the extreme expressiveness of Cap'n
| Proto's RPC protocol compared to alternatives has been a big
| help in implementing sandboxing in the Workers Runtime, as well
| as distributed systems features like Durable Objects.
| https://blog.cloudflare.com/introducing-workers-durable-obje...
| matlin wrote:
| The lead dev of Cloudflare workers is the creator of Cap'n
| Proto so that likely made it an easy choice
| mikesurowiec wrote:
| The post states "they used Cap'n Proto before they hired me"
| oxygen_crisis wrote:
| He helped build the Workers platform after they hired him.
| saghm wrote:
| The article says they were using it before hiring him though,
| so there must have been some prior motivation:
|
| > In fact, you are using Cap'n Proto right now, to view this
| site, which is served by Cloudflare, which uses Cap'n Proto
| extensively (and is also my employer, although they used
| Cap'n Proto before they hired me)
| batch12 wrote:
| I know this isn't new, but I wonder if the name is an intentional
| nod to Star Trek Voyager, or if there's another reference I'm
| not aware of.
|
| https://memory-alpha.fandom.com/wiki/Captain_Proton
| azornathogron wrote:
| Given that it's billed as a "cerealization protocol", I always
| assumed it was a reference to Cap'n Crunch cereal.
| [deleted]
| kentonv wrote:
| Huh, that reference actually never occurred to me.
|
| The name Cap'n Proto actually originally meant "Capabilities
| and Protobufs" -- it was a capability-based RPC protocol based
| on Protocol Buffers. However, early on I decided I wanted to
| try a whole different serialization format instead. "Proto"
| still makes sense, since it is a protocol, so I kept the name.
|
| The pun "cerealization protocol" is actually something someone
| else had to point out to me, but I promptly added it to the
| logo. :)
| mi_lk wrote:
| What does capacity-based mean in this context?
| kentonv wrote:
| Capability, not capacity.
|
| https://en.wikipedia.org/wiki/Capability-based_security
|
| https://capnproto.org/rpc.html#distributed-objects
|
| The idea really goes way beyond security and RPC. It's hard
| to explain concisely but it's sort of a way of thinking
| about software architecture.
| richardfey wrote:
| Amazing to see Cap'n Proto come this far! I wonder how easy it
| would be to swap it for gRPC, and still have advanced load
| balancing support for it.
| synthetigram wrote:
| After exploring a few constant access serialization formats, I
| had to pass on Capn Proto in favor of Apache Avro. Capn has a
| great experience for C++ users, but Java codegen ended up being
| too annoying to get started with. If Capn Proto improved the
| developer experience for the other languages people write, I
| think it would really help a lot.
| hiddencost wrote:
| For context: Kenton ran Google's in-house proto system for many
| years, before leaving and building his own open source version.
| AceJohnny2 wrote:
| I believe he was the one who open-sourced protobufs.
___________________________________________________________________
(page generated 2023-07-28 23:00 UTC)