[HN Gopher] RaptorCast: Designing a Messaging Layer
       ___________________________________________________________________
        
       RaptorCast: Designing a Messaging Layer
        
       Author : wwolffrec
       Score  : 32 points
       Date   : 2025-06-23 06:46 UTC (16 hours ago)
        
 (HTM) web link (www.category.xyz)
 (TXT) w3m dump (www.category.xyz)
        
       | wwolffrec wrote:
       | Monad uses RaptorCast to send out block proposals quickly and
       | reliably to a global network of validators. At Category Labs,
       | designing an effective messaging protocol to meet Monad's high
       | performance requirements was challenging and educational. Read
       | more about the design in the link.
        
       | klabb3 wrote:
       | > Assuming zero network latency and a bandwidth of 1 Gbps, this
       | would still take around 16 seconds
       | 
       | I'm not seeing how the new design affects the throughput needs,
       | but I'll say this:
       | 
       | Except for highly controlled environments (OS, NICs etc), you
       | will run into perf issues with UDP-based protocols much sooner
       | than with TCP, even if you're just pushing zeroes. Packet
       | switching is much more difficult to optimize.
       | 
       | If you only use sporadic messages without backpressure, and
       | you're willing and able to handle out-of-order messages and
       | retransmission logic, by all means, use UDP. Like for realtime
       | multiplayer games, it makes sense.
       | 
       | For high throughput on diverse platforms and hardware, the story
       | is very different. Yes, even with Quic. I learnt this the hard
       | way.
       | 
       | All that said, I'm very curious what the results are. Is this
        | design fully deployed, and if so, in what kind of environment
       | and traffic patterns? Even better: benchmarks/stress tests would
       | be fantastic.
        
         | tklenze wrote:
         | Thanks for your insights! Yes, real-life behavior is indeed
         | interesting to look at, and for this purpose, two testnets are
         | running right now (https://www.gmonads.com).
         | 
         | RaptorCast uses erasure coding to break a block proposal into
         | smaller pieces with plenty of redundancy to allow for
         | omissions. This means that if you receive sufficiently many
         | chunks, you can decode the block proposal (no matter which of
         | the chunks you received). The redundancy factor can be tweaked,
         | but it'll likely be >2x, to allow for networking issues and
         | faulty/malicious nodes. Furthermore, the blockchain can make
         | progress as long as >2/3 of the validators receive the block
         | proposal and are honest. This means that at least in theory,
         | you should be able to tolerate a lot of packet losses.
         | 
         | Re throughput: Monad has 2 blocks / s, each 2MB in size. So
         | even with a redundancy factor of 3x, each validator only has to
         | send 12MB per second.
         | 
         | Re backpressure: Not really an option for blockchains. If you
         | have 100 peers and one of them is too slow, what are you going
          | to do? If you apply backpressure to slow down consensus, you
          | slow down the entire blockchain even though most peers are
          | fast.
         | There's a recent paper about this problem:
         | https://arxiv.org/abs/2410.22080.
         | 
         | What's important is that the amount of bandwidth required per
         | validator remains constant in RaptorCast, no matter how many
         | validators are part of the network. And you always just need
         | one round-trip to broadcast a block proposal, as opposed to
         | Gossip protocols that may involve more steps and have higher
         | latency.
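
The decode-threshold property and the bandwidth arithmetic above can be sketched numerically. This is a toy model, not Category Labs' implementation: it assumes an ideal erasure code where any k of the coded chunks suffice to decode (real Raptor codes need slightly more than k), and all names are illustrative.

```python
import random

def can_decode(k_needed, redundancy, loss_rate, rng):
    """Toy model of an ideal erasure code: a proposal split into
    k_needed source chunks is sent as k_needed * redundancy coded
    chunks; decoding succeeds iff at least k_needed survive,
    regardless of which ones they are."""
    sent = int(k_needed * redundancy)
    received = sum(1 for _ in range(sent) if rng.random() > loss_rate)
    return received >= k_needed

rng = random.Random(42)
trials = 10_000
# At 3x redundancy, even 50% independent packet loss leaves the
# receiver with ~150 of the 100 chunks it needs on average.
ok = sum(can_decode(100, 3.0, 0.5, rng) for _ in range(trials))
print(f"decode success rate at 50% loss: {ok / trials:.3f}")

# Bandwidth arithmetic from the comment: 2 blocks/s, 2 MB each, 3x redundancy.
blocks_per_s, block_mb, redundancy = 2, 2, 3
print(f"upload per validator: {blocks_per_s * block_mb * redundancy} MB/s")
```

With a fixed seed the simulation is deterministic; the point is only that the decode probability stays near 1 well past the loss rates one would normally worry about.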
        
           | lxgr wrote:
           | > The redundancy factor can be tweaked, but it'll likely be
           | >2x
           | 
           | If your packet loss is due to your traffic overwhelming a
           | queue at any intermediate hop, sending more redundant packets
           | would be aggravating the problem instead of solving it.
           | 
           | Are you running this on top of something providing congestion
           | control?
        
         | PhilipRoman wrote:
         | It's a shame because the datagram paradigm is so much more
          | elegant. In real-world cases you end up having to emulate it by
         | putting length prefixed data in TCP streams, reducing TCP
         | timeouts, constantly reconnecting sockets (with the latency
         | penalty), etc.
         | 
         | Really, the only thing that's missing from UDP is (optional)
         | backpressure.
         | 
          | A lot of software can handle out-of-order datagrams with no
          | performance penalty (like file uploads, etc.). TCP's strict
          | ordering is especially annoying when you're operating in an
          | environment with link aggregation, where the interface
          | insists on limiting your bandwidth to a single link.
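
The "length-prefixed data in TCP streams" workaround mentioned above can be sketched as follows. The `frame`/`deframe` helper names are hypothetical, and the 4-byte big-endian prefix is just one common convention.

```python
import struct

def frame(msg: bytes) -> bytes:
    """Prefix a datagram with a 4-byte big-endian length so it can
    travel over a TCP byte stream without losing its boundaries."""
    return struct.pack(">I", len(msg)) + msg

def deframe(buf: bytes):
    """Split a stream buffer back into complete datagrams; returns
    (messages, leftover bytes awaiting more stream data)."""
    msgs = []
    while len(buf) >= 4:
        (n,) = struct.unpack(">I", buf[:4])
        if len(buf) < 4 + n:
            break  # partial frame: wait for the rest of the stream
        msgs.append(buf[4:4 + n])
        buf = buf[4 + n:]
    return msgs, buf

# Simulate a read where the second datagram arrived only partially.
stream = frame(b"hello") + frame(b"world")[:5]
msgs, rest = deframe(stream)
print(msgs)  # [b'hello'] -- only the complete datagram is delivered
```

This restores message boundaries but not the rest of the datagram paradigm: delivery is still strictly ordered, and a stall on one frame blocks everything behind it.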
        
           | simcop2387 wrote:
            | This is one reason I'm still upset about the failure that
            | SCTP ended up being. It really did try to create a new
           | protocol for dealing with exactly all of these issues but
           | support and ossification basically meant it's a non-starter.
           | I'd have loved if it was a mandatory part of IPv6 so that
           | it'd eventually get useful support but I'm pretty sure that
           | would have made IPv6 adoption even worse.
        
             | lxgr wrote:
             | As long as you're fine with UDP encapsulation, you can
             | definitely use SCTP today! WebRTC data channels do, for
             | example.
        
             | Veserv wrote:
              | Well, we have QUIC now, which layers over UDP and is
              | functionally strictly superior to SCTP, as SCTP still
             | suffered from head-of-line blocking due to bad
             | acknowledgement design.
        
           | lxgr wrote:
           | > the only thing that's missing from UDP is (optional)
           | backpressure.
           | 
           | The lack of congestion control seems significant too. Most
           | message-oriented protocols layered on top of UDP end up
           | adding that back at the application layer as a consequence.
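
One minimal form of "adding that back at the application layer" is a token-bucket pacer on the sender. This is an illustrative sketch, not what QUIC or any particular UDP protocol mandates; real congestion controllers (e.g. CUBIC or BBR) also react to loss and RTT rather than pacing at a fixed rate.

```python
import time

class TokenBucket:
    """Minimal token-bucket pacer: a sender layered on UDP can use
    this to cap its transmit rate, a crude stand-in for the
    congestion control that TCP/QUIC provide and raw UDP does not."""
    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s
        self.capacity = burst_bytes
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def try_send(self, nbytes: int) -> bool:
        now = time.monotonic()
        # Refill tokens for the time elapsed, capped at the burst size.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= nbytes:
            self.tokens -= nbytes  # spend tokens; caller may transmit
            return True
        return False               # over budget; caller should delay or drop

bucket = TokenBucket(rate_bytes_per_s=1000, burst_bytes=1500)
# The 1500-byte burst admits one packet; at 1 kB/s the second must wait.
print(bucket.try_send(1200), bucket.try_send(1200))
```

The rate here is deliberately tiny so the second packet deterministically exceeds the remaining burst; a real sender would size rate and burst from measured network conditions.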
        
       | yangl1996 wrote:
        | Looks like there is no mention in the blog post of the paper
        | (poster) [1] in which the two-level broadcast idea was proposed.
       | 
       | [1] https://dl.acm.org/doi/pdf/10.1145/3548606.3563494
        
         | ethan_smith wrote:
         | The cited paper is indeed foundational, introducing not just
         | two-level broadcast but also optimizations for validator
         | selection and network topology that RaptorCast appears to build
         | upon.
        
         | compyman wrote:
          | It seems very similar to an earlier Paxos optimization called
          | 'PigPaxos': https://dl.acm.org/doi/10.1145/3448016.3452834
        
       ___________________________________________________________________
       (page generated 2025-06-23 23:01 UTC)