[HN Gopher] Show HN: Kafka 0.8.0 on Cloudflare Workers
       ___________________________________________________________________
        
       Show HN: Kafka 0.8.0 on Cloudflare Workers
        
       Author : maxwellpeterson
       Score  : 123 points
       Date   : 2022-10-05 12:42 UTC (10 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | losfair wrote:
       | The one feature I miss the most on Cloudflare Workers/Durable
       | Object is TrueTime.
       | 
       | Durable Objects are fundamentally replicated state machines with
       | a nice JavaScript API. You can build an automatically sharded
       | (yet correct) Kafka API implementation, or even an entire
       | Spanner-style distributed database on Workers, given the right
       | primitives (DO + TrueTime).
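One way to read "automatically sharded" above: each Durable Object is addressed by a deterministic name, so sharding falls out of the naming scheme. A hypothetical sketch (the helper below is invented for illustration; in Workers you would pass the name to `namespace.idFromName()`):

```typescript
// Hypothetical sketch (names invented): map each Kafka topic-partition
// to one Durable Object via a deterministic name. In Workers the name
// would be passed to namespace.idFromName(), which always resolves to
// the same single-threaded object instance.
function partitionObjectName(topic: string, partition: number): string {
  return `${topic}/${partition}`;
}

// Every producer and consumer computes the same name, so all traffic
// for a partition serializes through one object: per-partition ordering
// without any application-level leader election.
const owner1 = partitionObjectName("clicks", 0);
const owner2 = partitionObjectName("clicks", 0);
const other = partitionObjectName("clicks", 1);
console.log(owner1 === owner2, owner1 === other); // true false
```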
        
         | bastawhiz wrote:
         | DOs aren't distributed, they're guaranteed to be a single
         | instance of any DO running in exactly one data center, running
         | ~single threaded. So TrueTime doesn't seem helpful, since
         | Date.now() in the worker will always just be one time. TrueTime
         | would offer you a primitive to do the work that Cloudflare
         | gives you for free by virtue of using their infrastructure:
         | would you rather have the primitive and do it yourself? Or
         | would you rather just have the system do that for you so you
         | don't need to think about coordinating a distributed system?
        
           | losfair wrote:
           | TrueTime is an efficient primitive to ensure external
            | consistency _across multiple consensus groups_, not within
           | one group. The current DO infrastructure does not give the
           | TrueTime guarantees for free: you cannot do transactions
           | across two or more durable objects, and the max throughput
           | within a transaction domain is limited to what a single V8
           | isolate can handle sequentially.
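For readers unfamiliar with the primitive being asked for, here is a toy model of TrueTime's commit-wait rule. The epsilon and the fake clock are invented for illustration; real TrueTime derives its uncertainty bound from GPS and atomic clocks.

```typescript
// Toy model of TrueTime (all names and numbers invented). TT.now()
// returns an uncertainty interval; a writer picks its commit timestamp
// at the top of the interval, then waits until the timestamp is
// provably in the past on every clock. That waiting step is what makes
// timestamps comparable across independent consensus groups (e.g.
// across many Durable Objects) without any coordination between them.
interface TTInterval { earliest: number; latest: number; }

const EPSILON_MS = 7; // assumed worst-case clock uncertainty

function ttNow(wallClock: () => number): TTInterval {
  const t = wallClock();
  return { earliest: t - EPSILON_MS, latest: t + EPSILON_MS };
}

function commitWait(wallClock: () => number): number {
  const ts = ttNow(wallClock).latest; // chosen commit timestamp
  // Spin until every clock in the system must agree ts has passed.
  while (ttNow(wallClock).earliest <= ts) { /* wait out the uncertainty */ }
  return ts;
}

// Simulated clock that advances 5ms per read, so the example terminates
// immediately instead of busy-waiting on the real wall clock.
let fakeMs = 0;
const fakeClock = () => (fakeMs += 5);
const ts = commitWait(fakeClock);
console.log(ttNow(fakeClock).earliest > ts); // the wait did its job
```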
        
             | bastawhiz wrote:
             | DOs are arguably the wrong primitive if you're looking to
             | do transactions across more than one of them. They're
             | essentially partitions, which is how this project uses
             | them. If you do transactions across more than one, you are
             | still making blocking, synchronous requests across DO
             | instances which can be across N data centers. You can't
             | really implement locking for blocking writes if you want to
             | implement transactions at a layer _above_ DOs: short of
             | implementing a spin lock (very expensive with workers), you
              | don't have a way to wait for an ongoing transaction to
             | complete. Which is to say, if you had a way to implement
             | transactions with TrueTime in Workers, you could just use
             | the standard KV store and avoid DOs entirely, no? The great
             | part about workers is that you don't pay for idle time--if
             | you're implementing your own locking, you're not able to
             | yield the event loop back to CF. At that point, you've lost
             | most of the benefit of being at the edge (you're making
             | blocking cross-DC requests) and most of the cost benefits
             | of Workers.
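To make the cost argument concrete, here is roughly what the spin lock the comment warns about would look like, against a mock store with an invented compare-and-set (real Workers KV is eventually consistent and offers no CAS, which is part of the problem):

```typescript
// Mock store with an invented compare-and-set; Workers KV offers no
// such primitive, which is part of why locking above DOs is hard.
class MockStore {
  private data = new Map<string, string>();
  cas(key: string, expect: string | undefined, value: string): boolean {
    if (this.data.get(key) !== expect) return false; // lost the race
    this.data.set(key, value);
    return true;
  }
  del(key: string): void { this.data.delete(key); }
}

// Spin until the lock key is free or we give up. In a real Worker each
// failed attempt would be a blocking cross-DC round trip, paid for
// while the request is held open with no way to yield the event loop.
function acquireLock(store: MockStore, key: string, owner: string,
                     maxTries: number): boolean {
  for (let i = 0; i < maxTries; i++) {
    if (store.cas(key, undefined, owner)) return true;
  }
  return false; // gave up after maxTries round trips
}

const store = new MockStore();
console.log(acquireLock(store, "txn-lock", "worker-a", 3)); // free: acquired
console.log(acquireLock(store, "txn-lock", "worker-b", 3)); // held: spins, fails
store.del("txn-lock");
console.log(acquireLock(store, "txn-lock", "worker-b", 3)); // released: acquired
```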
        
       | nevon wrote:
       | I once made the NodeJS Kafka client KafkaJS run in a browser with
       | a websocket-to-tcp-socket shim. If we combine the two we can now
       | have a browser based client talk to a broker cluster running on
       | edge workers. Just gotta integrate it with Google Sheets and wait
       | for the VC money.
        
         | MuffinFlavored wrote:
         | might as well throw it into v86 or compile it to WASM too if
         | you really want the big investment bucks
        
         | m00dy wrote:
          | what are the advantages of having this design?
        
           | nevon wrote:
           | None, but it's really cool to see it working.
        
         | maxwellpeterson wrote:
         | Yep, one of the challenges here is that KafkaJS requires Kafka
          | 0.10+, as do several other popular clients. For
         | this broker implementation to be somewhat practical, it would
         | need to be extended to 0.10.0 at least. 0.8.0 was a nice,
         | simple starting point for a "proof of concept" project like
         | this.
        
       | teknopurge wrote:
       | this is cool. edge and decentralized applications are going to be
       | a topic of focus (IMO) in the next 5 years.
        
       | [deleted]
        
       | kabircf wrote:
       | This is so neat, great demonstration... good work Max!
        
       | darkwater wrote:
        | Off-topic: which software do you use to make those "hand-
        | drawn-style" diagrams?
        
         | rocmcd wrote:
         | It looks like these were made with https://excalidraw.com/
         | 
         | Highly recommended!
        
           | maxwellpeterson wrote:
           | Yep, all the diagrams were created with excalidraw. The
           | exported "source" for the diagrams is stored in the
           | diagrams.excalidraw file in the root of the repository.
        
           | darkwater wrote:
            | It saddens me (and my memory) that the link shows in the
            | "already visited" color, yet I didn't remember it. Thanks
            | for the tip anyway!
        
       | kentonv wrote:
       | This is really cool as a demonstration of how distributed systems
       | can be built on top of Durable Objects. In particular I was
       | really happy to see this:
       | 
       | > What about replication? What about leadership election?
       | 
       | > There is none, at least not in the application code. The point
       | of using Durable Objects here is that we can offload these
       | complexities onto the infrastructure layer, and keep our
       | application code focused and minimal.
       | 
       | Yes, exactly. With Durable Objects, these common problems are
       | offloaded to the underlying system, and you can instead focus on
       | the higher-level distributed systems algorithms problems that are
       | unique to the particular infrastructure you want to build.
       | 
       | (I'm the tech lead for Cloudflare Workers and helped design
       | Durable Objects.)
        
         | klabb3 wrote:
         | The abstraction is great, just a single threaded object with
         | strong consistency. It's an awesome building block for a
         | managed service.
         | 
         | I really wish DOs had bundled pricing, or something like it. I
         | found myself thinking about how to reduce # objects to be cost
         | effective :/
        
         | maxwellpeterson wrote:
         | Thanks Kenton!!
        
         | m00dy wrote:
         | What can I do with this ?
        
         | sammy2255 wrote:
          | Thank you lord Kenton ^_^ I'm honestly looking forward to
          | Cloudflare Queues, which is probably just built on top of
          | DOs, but it's sooo nice having it as a service
        
       | tommek4077 wrote:
       | But can it run Doom?
        
         | ignoramous wrote:
         | You kid, but https://archive.is/xwEEO /
         | https://twitter.com/OlokobaYusuf/status/1576750919303442433
        
         | maxwellpeterson wrote:
         | You might be interested in https://cloudflare.tv/event/LaBW1BZ6
        
       | citizenpaul wrote:
       | >What about replication? What about leadership election?
       | 
        | How can Kafka run on Cloudflare without state? Oh, we are not
       | going to tell you that part.
        
       | xani_ wrote:
       | Now someone needs to do that with QEMU and we can go back to
       | running VMs
        
       | greatNespresso wrote:
        | That's super cool! Also, with D1 coming, I can't wait to see an
       | alternative to BigQuery built on CF workers!
        
       | k__ wrote:
       | Now, do Elasticsearch, please.
        
         | danmcs wrote:
          | I thought about that a while back: how would I build an index on
         | top of Durable Objects?
         | 
         | For sorted indexes like a b-tree in a database, I think you
         | would partition into objects by value, so (extremely naive
         | example) values starting with a-m would be in one object, and
         | n-z in the second. You'd end up needing a metadata object to
         | track the partitions, and some reasonably complicated ability
         | to grow and split partitions as you add more data, but this is
         | a relatively mature and well-researched problem space,
         | databases do this for their indexes.
         | 
         | For full text search, particularly if you want to combine
         | terms, you might have to partition by document, though. So
         | you'd have N durable objects which comprise the full text
         | "index", and each would contain 1/N of the documents you're
         | indexing, and you'd build the full text index in each of those.
         | If you searched for docs containing the words "elasticsearch"
         | and "please" you would have to fan out to all the partitions
         | and then aggregate responses.
         | 
         | You could go the other way, and partition by value again, but
         | that makes ANDs (for example) more challenging, those would
         | have to happen at response aggregation time in some way.
         | 
         | You'd do the stemming at index time and at search time, like
         | Solr does.
         | 
         | I have no idea what the documents per partition would be; it
         | would probably depend on the size of the documents, and the
         | number of documents, and the amount you'll be searching them,
         | since each durable object is single-threaded. Adding right
         | truncation or left+right will blow up the index size, so that
         | would probably drive up the partition count. You might be
         | better off doing trigrams or something like it at that point
         | but I'm not as familiar with those.
         | 
         | This is where optimizing would be hard. I don't think you can
         | get from Durable Objects the kind of detailed CPU/disk IO stats
         | you really need to optimize this kind of search engine data
         | structure.
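The partition-by-document design described above can be sketched end to end. Everything here is invented for illustration: the toy suffix-stripping "stemmer" stands in for a real one, the hash router stands in for a metadata object, and each `IndexPartition` stands in for one Durable Object holding 1/N of the documents.

```typescript
// Toy stemmer applied at both index time and search time, as the
// comment describes; a stand-in for a real stemmer like Solr's.
function stem(word: string): string {
  return word.toLowerCase().replace(/(ing|ed|s)$/, "");
}

// One partition of the full-text "index" -- stands in for one Durable
// Object holding an inverted index over its share of the documents.
class IndexPartition {
  private postings = new Map<string, Set<string>>(); // term -> doc ids
  add(docId: string, text: string): void {
    for (const w of text.split(/\W+/).filter(Boolean)) {
      const t = stem(w);
      if (!this.postings.has(t)) this.postings.set(t, new Set());
      this.postings.get(t)!.add(docId);
    }
  }
  // AND of all terms, resolved locally: a doc's terms live together,
  // so intersections never cross partition boundaries.
  search(terms: string[]): string[] {
    const sets = terms.map((q) => this.postings.get(stem(q)) ?? new Set<string>());
    return [...sets[0]].filter((id) => sets.every((s) => s.has(id)));
  }
}

// Router: hash each document to one of N partitions; queries fan out
// to every partition and concatenate the per-partition results.
class FanOutIndex {
  private parts: IndexPartition[];
  constructor(n: number) {
    this.parts = Array.from({ length: n }, () => new IndexPartition());
  }
  private pick(docId: string): IndexPartition {
    let h = 0;
    for (const c of docId) h = (h * 31 + c.charCodeAt(0)) >>> 0;
    return this.parts[h % this.parts.length];
  }
  add(docId: string, text: string): void { this.pick(docId).add(docId, text); }
  search(terms: string[]): string[] {
    return this.parts.flatMap((p) => p.search(terms));
  }
}

const idx = new FanOutIndex(3);
idx.add("d1", "Now do Elasticsearch please");
idx.add("d2", "please wait");
console.log(idx.search(["elasticsearch", "please"])); // only d1 has both
```

The fan-out is the cost of this layout: every query touches all N partitions, which is why the partition count (and thus documents per partition) matters so much.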
        
           | yazaddaruvala wrote:
           | You're better off creating Lucene Segments in R2 and letting
            | Lucene access them remotely (if Lucene could run on a Worker
            | as WASM). Or something very like Lucene but compiled to WASM.
           | 
           | You'd also need to manage the Lucene Segments or
            | Solr/Elasticsearch shard metadata in Workers KV. You'd need
           | a pool of Workers that are Coordination Nodes, another pool
           | as "Data Nodes / Shards" and a non-Workers pool creating and
           | uploading Lucene segments to R2.
           | 
            | It shouldn't actually be so hard to do. Cloudflare would
            | need more granular knobs for customers to fine-tune R2
            | replication to be co-located with the Worker execution
            | locations, so it's really fast.
        
       ___________________________________________________________________
       (page generated 2022-10-05 23:01 UTC)