[HN Gopher] Ursa: A leaderless, object storage-based alternative...
___________________________________________________________________
Ursa: A leaderless, object storage-based alternative to Kafka
Author : netpaladinx
Score : 82 points
Date : 2025-07-31 15:01 UTC (8 hours ago)
(HTM) web link (streamnative.io)
(TXT) w3m dump (streamnative.io)
| netpaladinx wrote:
| Ursa published a blog post saying their leaderless, stateless,
| object storage-based Kafka replacement can reduce costs by up to
| 95%. Has anyone here tried Ursa in production? How much cost
| reduction have you actually seen compared to Kafka or MSK in real
| workloads?
| x0x0 wrote:
| As near as I can tell, the claims of huge cost savings derive
| from the difficulty of dynamically scaling Kafka and from
| improved multitenancy. So if different pieces of your company
| each have overprovisioned Kafka clusters, they could all move to
| Ursa and save all the overprovisioning.
|
| I have not tried it, and full disclosure, I really like Kafka:
| it's one of the pieces of software that has been rock solid for
| me. I built a project where it quietly ingested low gb/s of
| data with year-long uptimes.
| sijieg wrote:
| We also love Kafka as a protocol. However, the implementation
| can be evolved to take advantage of current cloud infrastructure
| and rethought around the modern lakehouse paradigm. That was
| one of the reasons we created Ursa.
| davidkj wrote:
| The bulk of the cost savings comes from the use of object
| storage rather than attached disks. This eliminates the
| inter-AZ networking costs associated with Kafka's replication
| mechanism.
|
| I break all of the costs down in the following e-book.
| https://streamnative.io/ebooks/reducing-kafka-costs-with-
| lea...
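|
| A rough sketch of the arithmetic behind that claim, assuming every
| follower replica sits in a different AZ from the partition leader.
| The function name, the replication factor of 3, and the 0.02 USD/GB
| cross-AZ transfer price are illustrative assumptions, not figures
| taken from the e-book:

```python
def inter_az_replication_cost(write_gb_per_s, seconds,
                              replication_factor=3,
                              usd_per_gb=0.02):
    """Estimate the cross-AZ transfer cost of leader-to-follower replication.

    Assumes each produced byte is copied to (replication_factor - 1)
    followers, all of them in AZs other than the leader's. Producer and
    consumer cross-AZ traffic is ignored. usd_per_gb is an assumed price.
    """
    replicated_gb = write_gb_per_s * seconds * (replication_factor - 1)
    return replicated_gb * usd_per_gb


if __name__ == "__main__":
    month = 30 * 24 * 3600  # seconds in a 30-day month
    cost = inter_az_replication_cost(0.1, month)
    print(f"{cost:,.0f} USD per month for replication traffic alone")
```

| Writing to object storage removes this term from the bill, because
| the brokers no longer copy data to each other across zones.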
| x0x0 wrote:
| So basically Kafka, to provide availability guarantees,
| requires multi-AZ and the inter-AZ replication gets
| expensive. And Ursa avoids that by using object storage and
| probably then just talking inter-AZ?
|
| And while I like Kafka, nobody would claim it likes being
| scaled up and down dynamically, so probably built-in
| tolerance for that as well? We ran Kafka on-prem so that
| wasn't an issue for us, and given the nature of the
| service, didn't have a lot of usage variance.
|
| This: https://www.youtube.com/watch?v=bb-_4r1N6eg was an
| interesting watch, btw.
| codeaether wrote:
| License? It doesn't seem to be open sourced.
| geodel wrote:
| It is more a cloud service than a technology, so I think the code
| is not the _most interesting_ part here.
| sijieg wrote:
| I am one of the co-founders of StreamNative.
|
| Currently Ursa is only available in our cloud service. But we
| do plan to open-source the core soon. Stay tuned.
| Imustaskforhelp wrote:
| Can't wait for you guys to open source this stuff. If I may
| ask, what license are you guys thinking of? I ask because I hope
| to someday make a living as a developer while working on open
| source too, but it's a tough line between getting no sponsors
| with an MIT license and being called non-FOSS and put on trial
| on HN because you used a license like SSPL or some custom
| license.
|
| The sad reality is that most people in open source want stuff
| for free and won't pay back and that sucks. So what are your
| thoughts on this? I am genuinely curious.
|
| Second, someone noted in another reply to the comment you are
| responding to that the code is not the most important part here.
| How much do you agree with that statement? To me, if I can
| self-host the open-source version directly on Amazon rather than
| using your cloud service, I do think that might be cheaper than
| using the cloud service.
| 2Elian wrote:
| Has anyone tried Ursa before? Curious to hear your thoughts!
| codelipenghui wrote:
| Just sharing a blog post published earlier, which compares the
| costs of running a 5 GB/s Kafka workload using Ursa, WarpStream,
| MSK, and Redpanda:
|
| https://streamnative.io/blog/how-we-run-a-5-gb-s-kafka-workl...
|
| And the test result was verified by Databricks:
| https://www.linkedin.com/posts/kramasamy_incredible-streamna...
|
| The analysis in the blog is based on two key assumptions:
|
| - Multi-zone deployment on AWS
| - Tiered storage is not enabled
|
| If you're looking to estimate costs with tiered storage, you can
| ignore the differences in storage costs mentioned in the post.
|
| One important point not covered in the blog is that Ursa compacts
| data directly into a Lakehouse (This is also the major
| differentiator from WarpStream). This means you maintain only a
| single copy of data, shared between both streaming reads and
| table queries. This significantly reduces costs related to:
|
| - Managing and maintaining connectors
| - Duplicated data across streaming and Lakehouse systems
| rohan_ wrote:
| Was the key unlock here the ability to append data to an object?
|
| (https://aws.amazon.com/about-aws/whats-new/2024/11/amazon-s3...)
| zinclozenge wrote:
| Having built a prototype of a system like Ursa myself, this
| isn't something that you need to use at all, especially because
| it seems like this is only available in S3 Express One Zone.
| sijieg wrote:
| Ursa is available across all major cloud providers (GCP,
| Azure, AWS). It also supports pluggable write-ahead log
| storage. For latency-relaxed workloads, we use object storage
| to get the cost down, so it works with AWS S3, GCP GCS, and
| Azure Blob Storage. For latency-sensitive workloads, we use
| Apache BookKeeper, which is a low-latency replicated log
| storage. This allows us to support workloads with latencies
| ranging from milliseconds to sub-second. You can tune it based
| on latency and cost requirements.
| sijieg wrote:
| There are a few things unlocked by Ursa:
|
| 1. It is leaderless by design. There is no single lead broker
| that traffic must be routed through, so you can eliminate the
| majority of the inter-zone traffic.
|
| 2. It is lakehouse-native by design. It not only uses object
| storage as the storage layer, but also uses open table formats
| for storing data. So streaming data can be made available in
| open table formats (Iceberg or Delta) after ingestion. One
| example is the integration with S3 Tables:
| https://aws.amazon.com/blogs/storage/seamless-streaming-to-a...
| This would simplify the Kafka-to-Iceberg integration.
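|
| A minimal sketch of why point 1 cuts inter-zone traffic, assuming
| that in a leaderless setup any broker can accept a write for any
| partition. `Broker`, `pick_broker`, and `crosses_zone` are
| hypothetical names for illustration only; this is not a
| description of Ursa's actual routing:

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Broker:
    name: str
    zone: str


def pick_broker(client_zone, brokers, leader=None):
    """Return the broker a producer sends a batch to.

    With a partition leader, the producer must use it regardless of zone.
    Without one (the leaderless assumption), any broker in the client's
    zone will do; fall back to any broker if the zone has none.
    """
    if leader is not None:
        return leader
    local = [b for b in brokers if b.zone == client_zone]
    return random.choice(local or brokers)


def crosses_zone(client_zone, broker):
    """True when the produce request leaves the client's zone."""
    return broker.zone != client_zone


if __name__ == "__main__":
    brokers = [Broker("b1", "az-a"), Broker("b2", "az-b"), Broker("b3", "az-c")]
    with_leader = pick_broker("az-b", brokers, leader=brokers[0])
    leaderless = pick_broker("az-b", brokers)
    print(crosses_zone("az-b", with_leader), crosses_zone("az-b", leaderless))
```

| With a leader pinned to one zone, producers in the other zones pay
| for a cross-zone hop on every write; without one, the hop can stay
| local.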
| Kinrany wrote:
| They were asking about changes that enabled Ursa itself.
| akshayshah wrote:
| No, it was S3 becoming strongly consistent in 2020:
| https://www.infoq.com/news/2020/12/aws-s3-strong-consistency...
| supermatt wrote:
| That's probably not as useful as you think. Unless things have
| changed more recently, you need to set the offset from which to
| append, which makes it near useless for most use cases where
| appending would actually be useful.
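|
| A toy in-memory model of that constraint, not the S3 API. The
| `AppendOnlyObject` class and its `append` method are hypothetical,
| and the sketch assumes the store rejects any append whose offset is
| not the current end of the object:

```python
class AppendOnlyObject:
    """Toy stand-in for an object that supports offset-based appends."""

    def __init__(self):
        self.data = b""

    def append(self, payload: bytes, write_offset: int) -> int:
        # The caller must name the exact current end of the object.
        if write_offset != len(self.data):
            raise ValueError(
                f"stale offset {write_offset}, object is {len(self.data)} bytes"
            )
        self.data += payload
        return len(self.data)


def demo():
    obj = AppendOnlyObject()
    # Two writers both observe the object as empty before writing.
    offset_seen_by_a = offset_seen_by_b = len(obj.data)
    obj.append(b"from-a", offset_seen_by_a)
    try:
        obj.append(b"from-b", offset_seen_by_b)
    except ValueError:
        return "writer B rejected"
    return "both appended"
```

| Every writer has to learn the current length first, so concurrent
| appenders need their own coordination, which is the case where a
| plain append would have been most useful.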
| oulipo wrote:
| If it's not open-source or at least self-hostable, I don't think
| it will be that useful.
| Imustaskforhelp wrote:
| I am not affiliated with Ursa, but I think they are hoping to do
| so in the future. Usefulness can come later; I am more than happy
| to wait in the meantime.
| geodel wrote:
| To me it seems Pulsar, a StreamNative-sponsored project, has not
| picked up. So a wrapper over Kafka/Pulsar with full Kafka
| compatibility, and perhaps Pulsar technology in a cloud streaming
| engine, is a good business play.
| sijieg wrote:
| There seems to be some confusion here.
|
| Pulsar has been widely adopted in many mission-critical
| business-facing systems like billing, payment, and transaction
| processing, or used as a unified platform that consolidates
| enterprises' diverse streaming & messaging use cases. It has
| seen a lot of adoption, from F500 companies and hyperscalers to
| startups.
|
| Kafka is used for data ingestion and streaming pipelines. The
| Kafka protocol itself is great. However, the implementation has
| its own challenges.
|
| Both Pulsar and Kafka are great open source projects and their
| protocols are designed for different use cases. We have seen
| many different companies use both technologies.
|
| Ursa is the underlying streaming engine that we re-implemented
| to be leaderless and lakehouse-native so that we can better
| leverage the current cloud infrastructure and natively
| integrate with the broader lakehouse ecosystem. It is the engine
| we use to support both protocols in our product offerings.
| zw17 wrote:
| Congrats on the launch! This is Zhenni from PuppyGraph. Shameless
| plug: we recently added support for Ursa, and here is the joint
| blog post showing how to integrate the Ursa engine with
| PuppyGraph to enable real-time graph analytics for a financial
| services use case with data stored in a lakehouse (not a graph
| DB):
| https://streamnative.io/blog/integrating-streamnatives-ursa-...
| Imustaskforhelp wrote:
| Let's hope that you guys open source this in a great manner and
| still make a really nice living.
|
| If I may ask a philosophical question, when would you consider
| your product to have "succeeded"? Would it be when someone uses
| it for something important, or some money-related benchmark, or
| what exactly?
|
| Wishing the Ursa team peace and success. Maybe don't ever
| enshittify your product as so many do. I will be watching from
| the sidelines since I don't even have a use for Kafka, but I
| would recommend having a Discord or some other way to actually
| form a community. I recommend Matrix, but there are folks who
| are on Discord too.
|
| Anyways, have fun building new things!
| _benedict wrote:
| Do you elaborate anywhere on what you mean by leaderless, and how
| this affects the semantics and guarantees you offer?
|
| So far as I understand, both Kafka and Pulsar use (leader-based)
| consensus protocols to deliver some of their features and
| guarantees, so to match these you must either have developed a
| leaderless consensus protocol, modified the guarantees you offer,
| or else still rely on a leader-based consensus protocol?
|
| In one of your other answers, you mention you rely on Apache
| BookKeeper, which appears to be leader-based?
|
| I ask because I am aware of only one industry leaderless
| consensus protocol under development (and I am working on it),
| and it is always fun to hear about related work.
| wmal wrote:
| How does it compare to AutoMQ? (https://github.com/AutoMQ/automq)
| jauntywundrkind wrote:
| AutoMQ looks so, so promising. Very happy to see the shift to the
| Apache 2.0 license a couple of months ago!! I do think it sounds
| like the most obvious comparison to Ursa: object-storage based,
| with a focus on removing inter-zone traffic. They also have a
| neat new Table Topics feature, which is super helpful.
| https://www.automq.com/docs/automq/eliminate-inter-zone-traf...
|
| There's an OK high level cruise, _WarpStream is dead, long live
| AutoMQ_ riffing off WarpStream doing similar against Kafka.
| While I loosely got the idea, I had to dig a lot deeper in docs
| for things to start to really click.
| https://github.com/AutoMQ/automq/wiki/WarpStream-is-dead,-lo...
|
| There may be reasons it's a bad fit, but I expect the
| object-storage database SlateDB will someday make a very fine
| streaming system too!! https://github.com/slatedb/slatedb
___________________________________________________________________
(page generated 2025-07-31 23:01 UTC)