[HN Gopher] Scuttlebutt Protocol Guide
___________________________________________________________________
Scuttlebutt Protocol Guide
Author : tosh
Score : 167 points
Date : 2021-12-24 10:51 UTC (12 hours ago)
(HTM) web link (ssbc.github.io)
(TXT) w3m dump (ssbc.github.io)
| formerly_proven wrote:
| > Before signing a message it must be serialized according to a
| specific canonical JSON format. This means for any given message
| there is exactly one way to serialize it as a sequence of bytes,
| which is necessary for signature verification to work.
|
| Design mistake. Don't sign abstract messages, sign bags of bytes.
| Doing otherwise (1) means you have to parse messages completely,
| even if their signatures are invalid, (2) requires canonical
| representations, a major PITA and source of bugs, and (3) is
| overall uglier to implement.
| Rygian wrote:
| On the contrary, lack of canonical format means plenty of room
| for implementations to fiddle with the format and introduce
| bugs and incompatibilities. The same bag of bytes ends up
| parsed as two different things on two different systems, which
| leads to SSRF-like vulnerabilities.
|
| Throw away anything at the first byte that breaks canonical
| representation, don't bother verifying its signature.
| yencabulator wrote:
| If you do that, then there's no point in using JSON -- you
| won't be able to use preexisting JSON parsers or writers
| anyway.
|
| Meanwhile, the sign-the-bytes crowd can use any preexisting
| format.
| mlyle wrote:
| > Throw away anything at the first byte that breaks canonical
| representation, don't bother verifying its signature.
|
| IMO, the opposite order is better because it means you verify
| that the message is sent by someone you minimally trust
| before exposing parsers, etc, to attack, and someone
| attempting attacks must sign the messages. Here, getting an
| identity is so easy that it doesn't make much of a
| difference.
|
| As far as attack surface, the cryptographic implementation is
| hopefully smaller than a higher level parser. The performance
| implications are unclear (both parsing and signature
| verification can be expensive).
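A minimal sketch of the "verify first, parse second" ordering discussed above, where the signature covers the raw bytes and the JSON parser only runs on payloads that passed the check. Python's standard library has no Ed25519, so HMAC-SHA256 with a shared key stands in for the public-key signature SSB actually uses; `sign_bytes`, `verify_then_parse`, and `DEMO_KEY` are hypothetical names for illustration.

```python
import hashlib
import hmac
import json

# Assumption: HMAC-SHA256 with a shared key stands in for a real
# public-key signature scheme (SSB itself uses Ed25519).
DEMO_KEY = b"demo-key-not-a-real-secret"


def sign_bytes(payload: bytes, key: bytes = DEMO_KEY) -> bytes:
    """Return a tag computed over the exact bytes that will be sent."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify_then_parse(payload: bytes, tag: bytes, key: bytes = DEMO_KEY):
    """Check the tag first; parse the JSON only if it matches."""
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return None  # rejected without ever running the parser
    return json.loads(payload)


payload = b'{"type": "post", "text": "hello"}'
tag = sign_bytes(payload)
accepted = verify_then_parse(payload, tag)
rejected = verify_then_parse(payload + b" ", tag)  # one extra space
```

Because the receiver never re-serializes the message, no canonical form is needed here: even a single added space changes the bytes and the check fails.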
| legutierr wrote:
| > Don't sign abstract messages, sign bags of bytes.
|
| Isn't the relevant question, "what are the bytes that are being
| signed?" That's what the specification seems to be answering
| here.
|
| I believe it's implicit that the signature is actually of a
| hash of the message itself, which is why they need to serialize
| the message in a deterministic fashion, because otherwise the
| hash would be different in different circumstances for the same
| message. If you want to validate the signature, you need to
| make sure that the hash being signed is of the same exact
| string that you are verifying.
|
| > requires canonical representations, a major PITA and source
| of bugs
|
| Yes, absolutely, especially with regards to JSON, which doesn't
| have a fully deterministic serialization standard. The
| scuttlebutt specification seems to assume that JSON.stringify()
| will produce a consistent output, but that is not really the
| case, per my understanding, at least not with regards to
| objects.
|
| From the Scuttlebutt specification:
|
| > The canonical format is defined by the ECMA-262 6th Edition
| section JSON.stringify. For an example, see how the above
| message is formatted.
|
| JSON object serialization in Javascript depends on the
| insertion order of the members of an object, so if you somehow
| change the ordering in which your keys are updated as the
| object is built, you will get a different string output, even
| if the data of the underlying object is the same. And this is
| just within the JavaScript runtime--if you are implementing
| this using another programming language, you can't just rely on
| the ECMA standard to determine the ordering of your object
| members at the time of serialization.
|
| If I recall properly, JSON serialization is ambiguous in the
| specification in a variety of ways, which results in different
| implementations outputting subtly different strings, even under
| ordinary circumstances.
|
| It seems weird that you would write a formal, generalized
| protocol specification relying on the idiosyncratic
| implementation details of JavaScript, for such an important
| thing as cryptographic signatures, as the Scuttlebutt
| specification seems to do here.
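The insertion-order point can be reproduced outside JavaScript too. The sketch below uses Python, whose `json.dumps` also follows dictionary insertion order, to show two equal objects hashing differently. The `sort_keys` variant is only one way to make the output order-independent; it is an assumption for illustration, not the Scuttlebutt canonical format.

```python
import hashlib
import json

# Same data, built with a different key insertion order.
a = {"author": "@alice", "sequence": 1}
b = {"sequence": 1, "author": "@alice"}


def digest(text: str) -> str:
    """Hex SHA-256 of the UTF-8 encoding of a serialized message."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Default serialization follows insertion order, so the strings differ.
plain_a = json.dumps(a)
plain_b = json.dumps(b)

# Sorting keys and fixing separators removes the order dependence.
sorted_a = json.dumps(a, sort_keys=True, separators=(",", ":"))
sorted_b = json.dumps(b, sort_keys=True, separators=(",", ":"))

same_data = a == b
plain_match = digest(plain_a) == digest(plain_b)
sorted_match = digest(sorted_a) == digest(sorted_b)
```

Here `same_data` is true while `plain_match` is false, which is exactly the failure mode for a signature computed over the serialized string.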
| progval wrote:
| Matrix does it like this as well. Servers have to serialize
| an object as JSON (specified as Python code), then add the
| signature to the object and serialize it again to send it on
| the network.
| https://spec.matrix.org/v1.1/appendices/#signing-json
| EGreg wrote:
| If you need determinism, I recommend avoiding objects and
| using pairs of arrays, plus they can generalize to matrices.
| Just use the same index in all.
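A rough sketch of the pairs-of-arrays idea, assuming a message is represented as one array of keys and one array of values that share the same index. The helper names are hypothetical. Array order is part of the JSON data model, so this avoids relying on object key order, though whitespace and number formatting still need to be pinned down separately.

```python
import json


def to_parallel_arrays(pairs):
    """Split (key, value) pairs into two arrays sharing the same index."""
    keys = [key for key, _ in pairs]
    values = [value for _, value in pairs]
    return [keys, values]


def from_parallel_arrays(arrays):
    """Rebuild the (key, value) pairs from the two arrays."""
    keys, values = arrays
    return list(zip(keys, values))


pairs = [("type", "post"), ("text", "hello"), ("sequence", 1)]
encoded = json.dumps(to_parallel_arrays(pairs), separators=(",", ":"))
decoded = from_parallel_arrays(json.loads(encoded))
```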
| sbazerque wrote:
| Exactly this. You need deterministic serialization, because
| you need to be sure that when the _same_ object is
| constructed in different settings, it is going to hash
| consistently. In Hyper Hyper Space [1], the set of basic
| types as well as the composition primitives used to construct
| all data structures have built-in deterministic
| serialization, just for this reason (e.g. a set will
| serialize into a deterministically ordered list, etc.)
|
| [1] https://www.hyperhyperspace.org
| formerly_proven wrote:
| > Exactly this. You need deterministic serialization,
| because you need to be sure that when the _same_ object is
| constructed in different settings, it is going to hash
| consistently.
|
| I can see how this might matter in some specific systems,
| but when we're talking about signatures only the signer
| constructs the object. Stuff like the "JWS/CT using JWS and
| JSON Canonicalization" recommended in a sibling comment is
| a complete misdesign for virtually all signing use cases.
| That's why "our signature scheme _requires_ canonical
| representations" is a red flag.
| sbazerque wrote:
| But "the signer" here is a cryptographic identity, that
| may be present in more than one device. So, even when
| conceptually it is just one entity, in practice it may be
| several computers doing something independently, and one
| may need the result to be the same given identical
| inputs.
| cel wrote:
| > It seems weird that you would write a formal, generalized
| protocol specification relying on the idiosyncratic
| implementation details of JavaScript, for such an important
| thing as cryptographic signatures, as the Scuttlebutt
| specification seems to do here.
|
| The Protocol Guide was created after the initial
| implementation and its protocol were already in wide use, and
| the quirks were discovered while re-implementing it. More
| info about implementations here (Node.js, Go, Rust x2, and
| Python; additionally there are implementations of varying
| states in Java, C, Haskell, Erlang and probably others):
| https://dev.scuttlebutt.nz/#/?id=implementations
|
| ---
|
| If making a new protocol using signatures over JSON objects,
| one might use this: JWS Clear Text JSON Signature Option
| (JWS/CT) https://datatracker.ietf.org/doc/html/draft-jordan-jws-ct
| JWS/CT uses JSON Web Signature (JWS) [RFC7515], JSON
| Canonicalization Scheme (JCS) [RFC8785], and I-JSON [RFC7493]
| (subset of JSON for interoperability)
|
| Or for signatures over JSON-LD objects / RDF datasets:
| https://w3c-ccg.github.io/ld-proofs/
| https://json-ld.github.io/rdf-dataset-canonicalization/spec/
| crypt0x wrote:
| Glad you picked up on it. It's not on the protocol guide yet
| but there are two or three new formats in discussion which all
| just sign opaque bytes. Wider adoption is pending, but meta
| feeds are already implemented in JS.
|
| Here is the most recent one and it links to the other two as
| well: https://github.com/ssb-ngi-pointer/bendy-butt-spec
| crypt0x wrote:
| ps: I did implement the v8 pretty-printer in go and it was a
| nightmare.. I'm sure it still has some corner cases that are
| not covered....
| joshuakelly wrote:
| SSB has a special place in my heart. I wrote this a year ago on
| HN and it seems truer than ever:
|
| > There's no global timeline -- just archipelagos. It assumes
| that network heterogeneity is the default, and is transmission-layer
| agnostic. Breakages will occur. Maybe you're living on a
| catamaran in the South Pacific and you only have connectivity
| once a month -- SSB will work even then.
|
| > Your own timeline is a sigchain -- a sequenced list of signed
| messages. You replicate the content in your network (2 hops
| away). Bridges between communities can be built or burned. Many
| islands can exist without needing to erase the others from even
| existing -- mutual separation is possible. Consensus is not
| necessary. #againstconsensus
|
| > Is global network culture still possible? If it is, in the
| midst of the national internets we now live inside of, I suspect
| it will look something like this. A little different from what we
| were promised, but maybe a little better too.
| southerntofu wrote:
| Gossip protocols are amazing! I wish there was less of a gap
| between federated and p2p protocols: SSB has "pubs" which are like
| "centralized" places of gossip (tongues untie after a beer!) but
| the other way around is really uncommon (federated protocols
| supporting gossip when your server is unreachable).
|
| The only problem with the gossip-first architecture is that every
| message needs to be public, which is not a tradeoff everybody's
| willing to make, if only because you need a lot of storage space
| for everyone's blabber :)
|
| On a more technical level, SSB protocol is really nice but part
| of me really hates that signatures are inlined in the JSON
| message. This means you have to do two levels of
| (de)serialization for every message?! It would be more
| resource-efficient to use a special prefix/suffix for signatures, like so:
| \EOFsigtype:signature\EOF { ... }
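One possible shape for such a prefix, sketched under assumptions: a single ASCII header line of the form `sigtype:hexsig` followed by the untouched payload bytes. This is not the exact `\EOF` syntax proposed above, and `frame` and `unframe` are hypothetical helpers. The point is that the receiver can split off the signature without parsing the JSON at all.

```python
import json


def frame(sig_type: str, signature: bytes, payload: bytes) -> bytes:
    """Prefix the payload with one header line of the form sigtype:hexsig."""
    header = f"{sig_type}:{signature.hex()}\n".encode("ascii")
    return header + payload


def unframe(data: bytes):
    """Split the header line from the payload without parsing the payload."""
    header, payload = data.split(b"\n", 1)
    sig_type, hex_sig = header.decode("ascii").split(":", 1)
    return sig_type, bytes.fromhex(hex_sig), payload


payload = b'{"type":"post","text":"hello"}'
fake_signature = bytes(range(8))  # placeholder bytes, not a real signature
wire = frame("ed25519", fake_signature, payload)

sig_type, signature, body = unframe(wire)
message = json.loads(body)  # the only JSON parse the receiver needs
```

Signature verification would sit between `unframe` and `json.loads`; it is left out here to keep the framing itself visible.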
| soapdog wrote:
| Pubs are being phased out in favour of rooms[1]. You can
| totally have SSB work without pubs and rooms, it just takes
| longer for you to see some messages depending on how far away
| you are from the gossip. Rooms are much simpler than pubs, they
| just provide a tunnel for peers to gossip as if they were on
| LAN. It is more of a convenience than a requirement.
|
| > every message needs to be public
|
| Not at all. SSB has multiple types of private messages. They
| can only be decrypted by the intended recipients since the key
| used to encrypt it is derived from a combination of the keys
| from the sender and the recipients [2]. There is even a new
| spec for private groups[3].
|
| [1]: https://ssb-ngi-pointer.github.io/rooms2
| [2]: https://ssbc.github.io/scuttlebutt-protocol-guide/#private-m...
| [3]: https://github.com/ssbc/ssb-tribes/
| southerntofu wrote:
| > Not at all. SSB has multiple types of private messages.
|
| I didn't mean that everything has to be plaintext, but has to
| be public in order for gossip to work. Or did i miss
| something? It's fine if metadata aggregation is not a concern
| of yours, and if you're rather certain your encryption
| algorithms won't be broken in the next decades.
|
| That rooms2 proposal looks very interesting (didn't read it
| all), but i'm curious if there's a state-of-the-art review
| you can link me to. I just don't understand the necessity to
| develop yet another protocol if the goal is just to break
| through NAT?
| soapdog wrote:
| I understand your concerns regarding metadata aggregation.
| SSB was not designed with those considerations. I think
| that with those concerns as a constraint you kinda need a
| different protocol.
|
| I think that rooms2 required a new protocol because of the
| handshake required for two peers to connect. I'm not an
| expert on protocol stuff, I write one of the clients but
| I'm usually working on higher levels than protocol work. I
| suspect that rooms2 is not just NAT breakage but it has
| some of the handshake and security involved in it.
|
| If you want to dig deeper, I suggest asking for links on
| their rooms 2.0 repo. I'm not aware of a review like that,
| but I'm usually working with other stuff. I know there were
| some papers published recently, maybe those will appeal to
| you. :-)
| olah_1 wrote:
| are there any ideas to introduce rotating keys?
|
| i dont like that the feeds are permanent and if you leak a
| key, people see the whole history.
| tgsovlerkhgsel wrote:
| Scuttlebutt sounded like exactly something I've always wanted,
| but I got a vague feeling of something about the project behind
| it being weird, disorganized, esoteric - basically a gut feeling
| making me worried about the protocol and whether it's soundly
| designed - and most of the documentation at that time seemed to
| focus on philosophy rather than the intricacies of the protocol.
|
| The 35C3 talk about it
| (https://media.ccc.de/v/35c3-9635-scuttlebutt) turned me off from
| it completely.
|
| I'm glad to see a concise, clean technical spec. Does the
| protocol have meaningful adoption anywhere?
| pabs3 wrote:
| I enjoyed the interview with Joey Hess about Scuttlebutt:
|
| https://librelounge.org/episodes/episode-14-secure-scuttlebu...
|
| Sounds like they made some interesting choices on the protocol.
| tgsovlerkhgsel wrote:
| Would you be willing to share a quick summary?
| gardnr wrote:
| I just updated the dependencies on this project. It helps find
| vanity keys that start with a particular string. E.g.
| @gardner1/lu4h
|
| https://github.com/gardner/vanityssb
| armchairhacker wrote:
| Scuttlebutt is a very interesting protocol, but I wonder how much
| benefit you actually get from decentralization. You could
| implement something similar with a central server and clients
| sending their locations. You could even implement E2E encryption
| for privacy.
|
| Lots of people argue for decentralized protocols (e.g.
| blockchain), but in practice centralization is usually fine. Just
| because you have a trusted central server, it doesn't have to be
| expensive and anti-privacy like Google or Facebook.
| soapdog wrote:
| > I wonder how much benefit you actually get from
| decentralization.
|
| I'm active in the SSB community. These are just some anecdotes
| that might amuse you and maybe help you glimpse how I see the
| protocol and ecosystem.
|
| I was on a transatlantic flight from Brazil to Paris. I had SSB
| on my computer and that gave me access to multiple years worth
| of messages by my friends and their friends. I spent time
| learning all sorts of cool stuff, and reading amazing convos. I
| could reply, like, interact with all of it, even without a
| network connection. Once I landed and my computer found a
| connection again, it gossiped all my changes.
|
| The Internet was not working well at a conference. All the
| decentralization workshops and talks were suffering because of
| it. The local venue firewall was preventing them from reaching
| their DHTs and known services. We didn't notice any of it as
| our machines were gossiping locally. Our workshop just moved
| on.
|
| Wanted to onboard a friend on SSB. We were at a beach house
| without decent internet. I was using macOS, he was using Linux.
| Some other people were using SSB as storage for NPM and Git
| artefacts. I managed to get the source from a client from the
| SSB feed, copied over to him, and onboarded him by doing git
| clone, npm install, all from SSB.
|
| A machine of mine had broken down -- my fault really, I learned
| the hard way that renaming the single admin user on Windows is
| not an easy task -- and I ended up having to reinstall
| everything. I reinstalled the SSB client, and copied over my
| keys. It restored all my data and feed by asking my friends for
| it.
|
| > in practice centralization is usually fine
|
| For many cases, yes. Centralization doesn't mean expensive or
| anti-privacy. I agree with you there. I don't think it is a
| binary situation. In many cases, centralization is the way to
| go.
|
| I do enjoy decentralization though, especially when it doesn't
| rely on blockchains and cryptocurrency. I want my
| decentralization without financial incentives and cheap to
| compute.
| folex wrote:
| how to use NPM through SSB? is there a doc or guide?
| honungsburk wrote:
| Often centralization is chosen specifically because it is
| easier to monetize. I love FOSS just as much as the next guy
| but it has its downsides too... Just look at the log4j bug and
| all the crap those developers get for work they've done in
| their free time. Sorry, but I get triggered any time anyone
| says they want software without paying for it.
| mlyle wrote:
| Some things are products. Some things are bits of
| infrastructure and agreed standards that products can run
| on top of, that can't necessarily be directly monetized and
| may be relatively open. The world is better for having
| both.
|
| Something like a gossip protocol is definitely more like
| one of those infrastructure pieces in today's world.
| joek1301 wrote:
| do you have any advice for "bootstrapping" your SSB
| experience as a new user? The original post inspired me to go
| download Patchwork and join a public pub, but I'm having
| difficulty finding real interesting conversations to join in
| on.
| zackmorris wrote:
| Anyone know why they used UTF-16 for the message type, instead of
| UTF-8?
|
| Also, I'm skeptical that two peers can exchange keys without a
| third party, without being susceptible to man-in-the-middle
| attacks. But maybe the paper linked in the article proves that
| it's possible?
|
| https://dominictarr.github.io/secret-handshake-paper/shs.pdf
|
| When I was playing with p2p 20 years ago, I found that NAT
| traversal was far harder than the messaging protocol (I just
| copied a few commands from IRC at the time). So hard in fact,
| that I failed to solve it after 2 years of effort. I realized
| later that NAT is part of a cluster of problems with networking,
| the main one being that TCP should have been a layer above UDP
| (not beside it) and that connectedness was never really a good
| concept (it should have used identifiers like this instead of IP
| addresses so the stream survives changing LAN/Wi-Fi/mobile
| networks). Unfortunately, the article doesn't talk about UDP much
| other than for broadcasting peer identifiers on the LAN. So I'm
| not sure how much use this would have in the real world for stuff
| like game networking, much less state transfer with something
| like a software transactional memory (STM). I wonder if anyone's
| made an STM that runs on a readonly transaction log like this,
| kind of like CouchDB, RethinkDB or Firebase maybe? But I digress.
|
| And I might have used the empty string "" to represent a null
| hash.. wait I misread that, they use actual null to represent the
| null pointer in the linked list of message hash addresses, which
| is great!
|
| Could this be used to build a web of trust? Or is it meant to be
| more transient, like maybe people broadcast on throwaway
| identities? Could we drop PGP into this?
|
| Maybe this is more like an RSS feed than something realtime like
| WebRTC?
|
| Other than that, this seems like a pretty decent protocol, these
| are just some thoughts/concerns that stood out for me.
| cel wrote:
| > Anyone know why they used UTF-16 for the message type,
| instead of UTF-8?
|
| It is an artefact of the original implementation, which was not
| discovered until the protocol was already in wide use and being
| independently implemented [1].
|
| > Also, I'm skeptical that two peers can exchange keys without
| a third party, without being susceptible to man-in-the-middle
| attacks. But maybe the paper linked in the article proves that
| it's possible?
|
| SSB uses the Secret Handshake (SHS) protocol in that paper
| (with some errata [2]). SHS is an authenticated key exchange
| (key agreement) protocol [3]. The two peers authenticate
| each other to their respective public keys, and establish a
| shared secret that is used to bulk-encrypt the rest of the
| session/connection. With SHS, the client (the peer that
| initiates the connection) must know the server's public key
| ahead of time. Both parties must know and previously agree on
| an additional network capability key (that is usually
| hard-coded to a specific value in the SSB implementations).
|
| It should be immune to MitM if the parties' private keys are
| kept secret. There are ephemeral keypairs involved, so if a
| later compromise occurs of the long-term (identity) private
| keys, that should not reveal previous/existing sessions. SHS
| has been verified using Tamarin [4].
|
| > I wonder if anyone's made an STM that runs on a readonly
| transaction log like this, kind of like CouchDB, RethinkDB or
| Firebase maybe?
|
| I'm not familiar with STM as such, but I am familiar with
| CouchDB. There are various ways of building mutable data structures on
| SSB. Typically messages are indexed, in general ways (e.g.
| message type, author, backlinks) and/or application-specific
| ways. Applications query the indexes to construct some result.
| Graph processing is often done to handle concurrent operations
| by different feeds (i.e. using CRDTs).
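As a rough illustration of the indexing described above, the sketch below builds three in-memory indexes (by type, by author, and backlinks) over a handful of messages. The message shape is a hypothetical simplification with made-up keys; real SSB messages carry more fields and real clients persist their indexes.

```python
from collections import defaultdict

# Hypothetical, simplified messages for illustration only.
messages = [
    {"key": "%m1", "author": "@alice", "type": "post", "root": None},
    {"key": "%m2", "author": "@bob", "type": "post", "root": "%m1"},
    {"key": "%m3", "author": "@alice", "type": "vote", "root": "%m1"},
]


def build_indexes(msgs):
    """Index message keys by type, by author, and by the message they link to."""
    by_type = defaultdict(list)
    by_author = defaultdict(list)
    backlinks = defaultdict(list)
    for msg in msgs:
        by_type[msg["type"]].append(msg["key"])
        by_author[msg["author"]].append(msg["key"])
        if msg["root"] is not None:
            backlinks[msg["root"]].append(msg["key"])
    return by_type, by_author, backlinks


by_type, by_author, backlinks = build_indexes(messages)
replies_to_m1 = backlinks["%m1"]  # everything that links back to %m1
```

An application would answer a query such as "show this thread" by reading `backlinks` rather than scanning every feed.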
|
| Here is a document describing threads, a common data structure
| on SSB: https://hackmd.io/GQ8aTw6STpuSFu6oH5Z63w
|
| > Could this be used to build a web of trust? Or is it meant to
| be more transient, like maybe people broadcast on throwaway
| identities? Could we drop PGP into this?
|
| Yes, the main SSB network constitutes a web of trust. People do
| create throwaway identities though, just trying it out and then
| not returning. But some persist, and people develop and express
| relationships. Some people share other public keys for PGP,
| OMEMO, Briar, RetroShare, Dat/Hyper, etc. PGP-signed messages
| have been published occasionally. Creating a temporary identity
| is not recommended for broadcasting, because it will not have
| much reach: message distribution and visibility depends on the
| social graph.
|
| > Maybe this is more like an RSS feed than something realtime
| like WebRTC?
|
| Yes. A SSB feed is identified by its public key, and contains
| an append-only list of messages. Each message is identified by
| its content hash. However, the RPC protocol used for SSB could
| be extended to support ephemeral content.
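The append-only structure can be sketched as a hash-linked list in which each message records the hash of the one before it and the first message uses null. This is a simplified model under stated assumptions: sorted-key JSON and hex SHA-256 stand in for SSB's actual encoding and ID format, signatures are omitted entirely, and the helper names are hypothetical.

```python
import hashlib
import json


def message_id(msg: dict) -> str:
    """Hash a message's serialized form (assumed encoding, not SSB's)."""
    data = json.dumps(msg, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def append(feed: list, author: str, content: dict) -> dict:
    """Append a message that links to the previous message's hash."""
    previous = message_id(feed[-1]) if feed else None  # null for the first
    msg = {
        "previous": previous,
        "author": author,
        "sequence": len(feed) + 1,
        "content": content,
    }
    feed.append(msg)
    return msg


def chain_is_valid(feed: list) -> bool:
    """Check that sequence numbers and previous-hash links are unbroken."""
    expected_previous = None
    for index, msg in enumerate(feed, start=1):
        if msg["sequence"] != index or msg["previous"] != expected_previous:
            return False
        expected_previous = message_id(msg)
    return True


feed = []
append(feed, "@alice", {"type": "post", "text": "first"})
append(feed, "@alice", {"type": "post", "text": "second"})
valid_before = chain_is_valid(feed)

feed[0]["content"]["text"] = "edited"  # tamper with history
valid_after = chain_is_valid(feed)
```

Editing an earlier message changes its hash, so the link stored in the next message no longer matches and the check fails.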
|
| WebRTC DataChannels could be used for gossip connections. But
| there is still the problem of needing to exchange messages to
| establish the WebRTC connection. Historically SSB has addressed
| P2P network architecture using Pubs [5], more recently with
| Rooms [6].
|
| [1] https://news.ycombinator.com/item?id=29675263
|
| [2] https://github.com/auditdrivencrypto/secret-handshake/issues...
|
| [3] https://en.wikipedia.org/wiki/Authenticated_Key_Exchange
|
| [4] https://github.com/keks/tamarin-shs
|
| [5] https://ssbc.github.io/scuttlebutt-protocol-guide/#pubs
|
| [6] https://ssb-ngi-pointer.github.io/rooms2/
___________________________________________________________________
(page generated 2021-12-24 23:01 UTC)