[HN Gopher] What If We Could Rebuild Kafka from Scratch?
       ___________________________________________________________________
        
       What If We Could Rebuild Kafka from Scratch?
        
       Author : mpweiher
       Score  : 204 points
       Date   : 2025-04-25 05:34 UTC (17 hours ago)
        
 (HTM) web link (www.morling.dev)
 (TXT) w3m dump (www.morling.dev)
        
       | hardwaresofton wrote:
       | See also: Warpstream, which was so good it got acquired by
       | Confluent.
       | 
       | Feels like there is another squeeze in that idea if someone
       | "just" took all their docs and replicated the feature set. But
       | maybe that's what S2 is already aiming at.
       | 
       | Wonder how long warpstream docs, marketing materials and useful
       | blogs will stay up.
        
         | dangoodmanUT wrote:
          | I wouldn't say it was so good it got acquired by them; rather,
          | Confluent had no S3-backed play and it was easier for them to
          | acquire WarpStream than to add it to Kafka directly.
          | 
          | WarpStream has latency issues, which turn into cost issues
          | downstream.
        
           | hardwaresofton wrote:
           | That's a good point -- I assumed there were other choices,
           | but now that I look, warpstream may have been the only
           | already-kafka-compatible option.
           | 
           | That said, they were at least "good enough" to make "buy"
           | more appealing than "build"
        
           | jjk7 wrote:
           | Confluent has "Freight" which is their integrated s3-backed
           | play.
        
       | Ozzie_osman wrote:
       | I feel like everyone's journey with Kafka ends up being pretty
       | similar. Initially, you think "oh, an append-only log that can
        | scale, brilliant and simple," then you try it out and realize it
        | is far, far from being simple.
        
         | carlmr wrote:
         | I'm wondering how much of that is bad developer UX and
         | defaults, and how much of that is inherent complexity in the
         | problem space.
         | 
          | Like the article outlines, partitions are not that useful for
          | most people. Instead of removing them, how about putting them
          | behind a feature flag, i.e. off by default? That would ease
          | 99% of users' problems.
         | 
         | The next point in the article which to me resonates is the lack
         | of proper schema support. That's just bad UX again, not
         | inherent complexity of the problem space.
         | 
          | On the testing side, why do I need to spin up a Kafka
          | testcontainer? Why is there no in-memory Kafka server that I
          | can use for simple testing purposes?
        
           | ahoka wrote:
           | I think it's just horrible software built on great ideas sold
           | on a false premise (this is a generic message queue and if
           | you don't use this you cannot "scale").
        
             | carlmr wrote:
             | I kind of agree on the horrible software bit, but what do
             | you use instead? And can you convince your company to use
             | that, too?
        
               | buster wrote:
               | I find that many such systems really just need a scalable
               | messaging system. Use RabbitMQ, Nats, Pub/Sub, ... There
               | are plenty.
               | 
               | Confluent has rather good marketing and when you need
               | messaging but can also gain a persistent, super scalable
               | data store and more, why not use that instead? The
               | obvious answer is: Because there is no one-size-fits-all-
               | solution with no drawbacks.
        
             | mrkeen wrote:
             | It's not just about the scaling, it's about solving the
             | "doing two things" problem.
             | 
             | If you take action a, then action b, your system will throw
             | 500s fairly regularly between those two steps, leaving your
             | user in an inconsistent state. (a = pay money, b = receive
             | item). Re-ordering the steps will just make it break
             | differently.
             | 
             | If you stick both actions into a single event _({userid}
             | paid {money} for {item})_ then  "two things" has just
             | become "one thing" in your system. The user either paid
             | money for item, or didn't. Your warehouse team can read
             | this list of events to figure out which items to ship, and
             | your payments team can read this list of events to figure
             | out users' balances and owed taxes.
             | 
             | (You could do the one-thing-instead-of-two-things using a
             | DB instead of Kafka, but then you have to invent some kind
             | of pub-sub so that callers know when to check for new
             | events.)
             | 
             | Also it's silly waiting around to see exceptions build up
             | in your dev logs, or for angry customers to reach out via
             | support tickets. When your implementation depends on
             | publishing literal events of what happened, you can spin up
             | side-cars which verify properties of your system in (soft)
             | real-time. One side-car could just read all the _({userid}
             | paid {money} for {item})_ events and _({item} has been
              | shipped)_ events. It's a few lines of code to match those
             | together and all of a sudden you have a monitor of "Whose
             | items haven't been shipped?". Then you can debug-in-bulk
             | (before the customers get angry and reach out) rather than
             | scour the developer logs for individual userIds to try to
             | piece together what happened.
             | 
             | Also, read this thread
             | https://news.ycombinator.com/item?id=43776967 from a day
             | ago, and compare this approach to what's going on in there,
             | with audit trails, soft-deletes and updated_at fields.
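              | 
              | A toy version of that paid/shipped matcher, in Python (the
              | event shapes are invented for illustration, not a real
              | schema):

```python
from dataclasses import dataclass

# Hypothetical event shapes -- illustrative only, not a real schema.
@dataclass
class Paid:
    user_id: str
    item_id: str
    amount: int

@dataclass
class Shipped:
    item_id: str

def unshipped_items(events):
    """Fold over the event log: whose paid-for items haven't shipped?"""
    pending = {}  # item_id -> user_id
    for ev in events:
        if isinstance(ev, Paid):
            pending[ev.item_id] = ev.user_id
        elif isinstance(ev, Shipped):
            pending.pop(ev.item_id, None)
    return pending

log = [Paid("alice", "book-1", 20), Paid("bob", "mug-7", 9), Shipped("book-1")]
print(unshipped_items(log))  # {'mug-7': 'bob'}
```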
        
           | gunnarmorling wrote:
           | > why is there no in-memory kafka server that I can use for
           | simple testing purposes.
           | 
           | Take a look at Debezium's KafkaCluster, which is exactly
           | that:
           | https://github.com/debezium/debezium/blob/main/debezium-
           | core....
           | 
           | It's used within Debezium's test suite. Check out the test
           | for this class itself to see how it's being used:
           | https://github.com/debezium/debezium/blob/main/debezium-
           | core...
        
         | mrweasel wrote:
         | The worst part of Kafka, for me, is managing the cluster. I
         | don't really like the partitioning and the almost hopelessness
         | that ensues when something goes wrong. Recovery is really
         | tricky.
         | 
          | Granted, it doesn't happen often if you plan correctly, but the
          | possibility of something going wrong in partitioning and
          | replication makes updates and upgrades nightmare fuel.
        
           | EdwardDiego wrote:
           | Have a look at Strimzi, a K8s operator, gives you a mostly-
           | managed Kafka experience.
        
           | hinkley wrote:
            | There was an old design I encountered in my distributed
            | computing class, and noticed in the world having been primed
            | to look for it, where you break ties in distributed systems
            | with a supervisor whose only purpose is to break ties. In a
            | system that only needs 2 or 4 nodes to satisfy demand,
            | running a 3rd or 5th node only to break ties results in a
            | lot of operational cost. So you created a process that
            | understood the protocol but did not retain the data, whose
            | sole purpose was to break split-brain ties.
           | 
           | Then we settled into an era where server rooms grew and
           | workloads demanded horizontal scaling and for the high
           | profile users running an odd number of processes was a
           | rounding error and we just stopped doing it.
           | 
           | But we also see this issue re-emerge with dev sandboxes.
           | Running three copies of Kafka, Redis, Consul, Mongo, or god
           | forbid all four, is just a lot for one laptop, and 50% more
           | EC2 instances if you spin it up in the Cloud, one cluster per
           | dev.
           | 
           | I don't know much Kafka, so I'll stick with Consul as a
           | mental exercise. If you take something like consul, the
           | voting logic should be pretty well contained. It's the logic
           | for catching up a restarted node and serving the data that's
           | the complex part.
        
         | vkazanov wrote:
         | Yeah...
         | 
          | It took 4 years to properly integrate Kafka into our pipelines.
          | Everything, and I mean everything, is complicated with it:
          | cluster management, numerous semi-tested configurations, etc.
         | 
         | My final conclusion with it is that the project just doesn't
         | really know what it wants to be. Instead it tries to provide
         | everything for everybody, and ends up being an unbelievably
         | complicated mess.
         | 
         | You know, there are systems that know what they want to be
          | (Amazon S3, Postgres, etc), and then there are systems that try
         | to eat the world (Kafka, k8s, systemd).
        
           | 1oooqooq wrote:
            | systemd knows very well what it wants to be, they just don't
            | tell anyone.
            | 
            | Its real goal is to make Linux administration as useless as
            | Windows's so RH can sell certifications.
            | 
            | Tell me the output of systemctl is not as awful as opening
            | the Windows service panel.
        
             | hbogert wrote:
              | Tell me systemctl output isn't more beneficial than the
              | per-distro bash mess.
        
               | vkazanov wrote:
                | Well, systemd IS useful, the same way Kafka is. I don't
                | want to go back to crappy bash for service management,
                | and Kafka is a de facto standard event streaming
                | solution.
                | 
                | But both are kind of hard to understand end-to-end,
                | especially for an occasional user.
        
               | 1oooqooq wrote:
                | not really. both require that you know obscure and badly
                | documented stuff.
                | 
                | systemd's whole premise is "people will not read the
                | distro or bash scripting manual"...
                | 
                | then nobody reads systemd's either (you have even less
                | reason to, since it's badly written, ever changing in
                | conflicting ways, and a single-use tool)
                | 
                | so you went from complaining that your coworkers can't
                | write bash to complaining that they don't know they have
                | to use EXEC= EXEC=/bin/x
                | 
                | because random values, without any hint, are lists of
                | commands instead of string values.
        
             | chupasaurus wrote:
             | There are 2 service panels in Windows since 8 and they are
             | quite different...
        
               | 1oooqooq wrote:
               | both (and systemctl) will show two to three times the
               | same useless info on the screen. e.g. service x - start
               | the service x.
        
           | EdwardDiego wrote:
           | > that the project just doesn't really know what it wants to
           | be
           | 
           | It's a distributed log? What else is it trying to do?
        
           | pphysch wrote:
           | > You know, there are systems that know what they want to be
           | (Amazon S3, Postres, etc), and then there are systems that
           | try to eat the world (Kafka, k8s, systemd).
           | 
           | I am not sure about this taxonomy. K8s, systemd, and (I would
           | add) the Linux kernel are all taking on the ambitious task of
           | central, automatic orchestration of general purpose computing
           | systems. It's an extremely complex problem and I think all
           | those technologies have done a reasonably good job of
           | choosing the right abstractions to break down that (ever-
           | changing) mess.
           | 
           | People tend to criticize projects with huge scope because
           | they are obviously complex, and complexity is the enemy, but
           | most of the complexity is necessary in these cases.
           | 
           | If Kafka's goal is to be a general purpose "operating system"
           | for generic data systems, then that explains its complexity.
           | But it's less obvious to me that this premise is a good one.
        
         | munksbeer wrote:
         | I'm not a fan or an anti-fan of kafka, but I do wonder about
         | the hate it gets.
         | 
         | We use it for streaming tick data, system events, order events,
         | etc, into kdb. We write to kafka and forget. The messages are
         | persisted, and we don't have to worry if kdb has an issue. Out
         | of band consumers read from the topics and persist to kdb.
         | 
         | In several years of doing this we haven't really had any major
         | issues. It does the job we want. Of course, we use the aws
         | managed service, so that simplifies quite a few things.
         | 
         | I read all the hate comments and wonder what we're missing.
        
           | DrFalkyn wrote:
            | What happens if the Kafka node fails?
        
             | radiator wrote:
             | "the" node? Kafka is a cluster of multiple nodes.
        
               | nitwit005 wrote:
               | People do lose clusters occasionally.
        
           | benjaminwootton wrote:
           | That's my experience too. I've deployed it more than ten
           | times as a consultant and never really understood the
           | reputation for complexity. It "just works."
        
             | amazingman wrote:
             | I've deployed it a bunch of times and, crucially,
              | _maintained_ it thereafter. It's very complex, especially
             | when troubleshooting pathological behavior or recovering
             | from failures, and I don't see why anyone with significant
             | experience with Kafka could reasonably claim otherwise.
             | 
             | Kafka is perhaps the most aptly named software I've ever
             | used.
             | 
             | That said, it's rock solid and I continue to recommend it
             | for cases where it makes sense.
        
               | klooney wrote:
               | I know a team that had their Kafka cluster fall over, and
               | they couldn't get it to stay up until eventually their
               | LOB got shut down for being unreliable. I don't know if
               | they were especially bad at their jobs, or their volumes
               | were unreasonable, or what, but it seemed like a bad time
               | for all.
        
         | Hamuko wrote:
         | > _Initially, you think "oh, an append-only log that can scale,
         | brilliant and simple"_
         | 
         | Really? I got scared by Kafka by just reading through the
         | documentation.
        
         | hinkley wrote:
          | Once you pick an "as simple as possible, but no simpler"
          | solution, it triggers Dunning-Kruger in a lot of people who
          | think they can one-up you.
         | 
         | There was a time in my early to mid career when I had to defend
         | my designs a lot because people thought my solutions were
         | shallower than they were and didn't understand that the
         | "quirks" were covering unhappy paths. They were often load
         | bearing, 80/20 artifacts.
        
       | Spivak wrote:
        | Once you start asking to query the log by keys, multi-tenant
        | trees of topics, synchronous-ish commits, and schemas, aren't we
        | just in normal db territory, where the Kafka log becomes the
        | query log? I think you need to go backwards and ask what the
        | feature is that an rdbms/nosql db _can't_ do, and go from there.
        | Because the wishlist is looking like CQRS with the front queue
        | being durable but events removed once persisted in the backing
        | db, where the clients query events from the db.
        | 
        | The backing db in this wishlist would be something in the vein of
        | Aurora to achieve the storage/compute split.
        
       | peanut-walrus wrote:
       | Object storage for Kafka? Wouldn't this 10x the latency and cost?
       | 
        | I feel like Kafka is a victim of its own success: it's excellent
       | for what it was designed, but since the design is simple and
       | elegant, people have been using it for all sorts of things for
       | which it was not designed. And well, of course it's not perfect
       | for these use cases.
        
         | ako wrote:
         | Warpstream already does Kafka with object storage.
        
           | rad_gruchalski wrote:
           | WarpStream has been acquired by Confluent.
        
         | mrweasel wrote:
         | > people have been using it for all sorts of things for which
         | it was not designed
         | 
         | Kafka is misused for some weird stuff. I've seen it used as a
          | user database, which makes absolutely no sense. I've also seen
          | it used as a "key/value" store, which I can't imagine being
          | efficient, as you'd have to scan the entire log.
         | 
          | Part of it seems to stem from "We need somewhere to store X. We
          | already have Kafka, and requesting a database or key/value
          | store is just a bit too much work, so let's stuff it into
          | Kafka".
         | 
          | I had a client ask for a Kafka cluster; when queried about what
          | they'd need it for, we got "We don't know yet". Well, that's
          | going to make it a bit hard to dimension and tune it correctly.
         | Everyone else used Kafka, so they wanted to use it too.
        
         | biorach wrote:
         | > the design is simple and elegant
         | 
         | Kafka is simple and elegant?
        
           | EdwardDiego wrote:
           | The design is.
        
           | peanut-walrus wrote:
           | The core design for producer/broker/consumer certainly is.
           | All the logic is on the ends, broker just makes sure your
           | stream of bytes is available to the consumers. Reliable,
           | scales well, can be used for pretty much any data.
        
         | gunnarmorling wrote:
         | It can increase latency (which can be somewhat mitigated though
         | by having a write buffer e.g. on EBS volumes), but it
         | substantially _reduces_ cost: all cross-AZ traffic (which is
         | $$$) is handled by the object storage layer, where it doesn't
         | get charged. This architecture has been tremendously popular
         | recently, championed by Warpstream and also available by
         | Confluent (Freight clusters), AutoMQ, BufStream, etc. The KIP
         | mentioned in the post aims at bringing this back into the
         | upstream open-source Kafka project.
        
           | peanut-walrus wrote:
           | So it's cheaper *on AWS*. Any cloud provider where cross-AZ
           | traffic is not $$$, I can't imagine this architecture being
           | cheaper.
           | 
           | Engineering solutions which only exist because AWS pricing is
           | whack are...well, certainly a choice.
           | 
           | I can also think of lots of cases where whatever you're
           | running is fine to just run in a single AZ since it's not
           | critical.
        
             | nicksnyder wrote:
             | The other clouds have fees like this too.
             | 
             | Even if this were to change, using object storage results
             | in a lot of operational simplicity as well compared to
             | managing a bunch of disks. You can easily and quickly scale
             | to zero or scale up to handle bursts in traffic.
             | 
             | An architecture like this also makes it possible to achieve
             | a truly active-active multi-region Kafka cluster that has
             | real SLAs.
             | 
             | See: https://buf.build/blog/bufstream-multi-region
             | 
             | (disclosure: I work at Buf)
        
         | openplatypus wrote:
         | > Object storage for Kafka? Wouldn't this 10x the latency and
         | cost?
         | 
          | It will become slower. It will become costlier (to maintain).
          | And we will end up with local replicas for performance.
          | 
          | If only people looked outside the AWS bubble and realised they
          | are SEVERELY overcharged for storage, this would be a moot
          | point.
          | 
          | I would welcome getting partitions dropped in favour of multi-
          | tenancy ... but for my use cases this is often equivalent.
         | 
         | Storage is not the problem though.
        
         | bjornsing wrote:
         | The weird thing driving this thinking is that cross-AZ network
         | data transfer between EC2 instances on AWS is more expensive
         | than shuffling the same data through S3 (which has free data
         | transfer to/from EC2). It's just stupid, but that's how it is.
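          | 
          | A back-of-envelope illustration of that trade-off (the prices
          | are assumptions for the sketch -- roughly $0.01/GB per side
          | for cross-AZ transfer and $0.005 per 1k S3 PUTs; check
          | current AWS pricing):

```python
# Illustrative, not current AWS pricing: cross-AZ transfer billed on
# both sides (~$0.01/GB each), S3 transfer from EC2 in-region free,
# S3 PUTs billed per request.
GB = 1024 ** 3
msgs_per_sec = 50_000
msg_size = 1_000                      # bytes
replicas_in_other_azs = 2             # typical RF=3 across 3 AZs

bytes_per_day = msgs_per_sec * msg_size * 86_400
cross_az_gb = bytes_per_day * replicas_in_other_azs / GB
cross_az_cost = cross_az_gb * 0.02    # $/GB, both directions combined

batch_size = 8 * 1024 * 1024          # batch writes into 8 MiB objects
puts_per_day = bytes_per_day / batch_size
s3_put_cost = puts_per_day / 1000 * 0.005  # $ per 1k PUT requests

print(f"cross-AZ replication: ${cross_az_cost:,.0f}/day")
print(f"S3 PUT requests:      ${s3_put_cost:,.2f}/day")
```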
        
       | vim-guru wrote:
       | https://nats.io is easier to use than Kafka and already solves
       | several of the points in this post I believe, like removing
       | partitions, supporting key-based streams, and having flexible
       | topic hierarchies.
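        | 
        | The "flexible topic hierarchies" come from NATS subjects:
        | dot-separated tokens, where "*" matches exactly one token and
        | ">" matches everything after it. A minimal matcher sketch in
        | Python, just to illustrate the semantics (not the actual client
        | API):

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """NATS-style subject matching: '*' matches one token, '>' the rest."""
    ptoks, stoks = pattern.split("."), subject.split(".")
    for i, p in enumerate(ptoks):
        if p == ">":          # '>' must be last; matches 1+ trailing tokens
            return len(stoks) > i
        if i >= len(stoks):
            return False
        if p != "*" and p != stoks[i]:
            return False
    return len(ptoks) == len(stoks)

print(subject_matches("orders.*.created", "orders.eu.created"))  # True
print(subject_matches("orders.>", "orders.eu.created.v2"))       # True
print(subject_matches("orders.>", "orders"))                     # False
```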
        
         | fasteo wrote:
         | And JetStream[1] adds persistence to make it more comparable to
         | Kafka
         | 
         | [1] https://docs.nats.io/nats-concepts/jetstream
        
         | wvh wrote:
         | I came here to say just that. Nats solves a lot of those
         | challenges, like different ways to query and preserve messages,
         | hierarchical data, decent authn/authz options for multi-
         | tenancy, much lighter and easier to set up, etc. It has more of
         | a messaging and k/v store feel than the log Kafka is, so while
         | there's some overlap, I don't think they fit the exact same use
         | cases. Nats is fast, but I haven't seen any benchmarks for
         | specifically the bulk write-once append log situation Kafka is
         | usually used for.
         | 
         | Still, if a hypothetical new Kafka would incorporate some of
         | Nats' features, that would be a good thing.
        
         | boomskats wrote:
         | Also remember that NATS was donated to the CNCF a while back,
         | and as a result people built a huge ecosystem around it. Easy
         | to forget.
        
           | rmoff wrote:
           | Relevant: https://www.cncf.io/blog/2025/04/24/protecting-
           | nats-and-the-...
        
             | cmckn wrote:
             | Yikes, what a dastardly move from Synadia. This will only
             | have a chilling effect on the project, which has struggled
             | to reach critical mass.
        
             | Zambyte wrote:
             | Wow, what a poor move. Really surprising having worked with
             | the folks at Synadia directly.
        
             | openplatypus wrote:
             | How to kill a project and blemish your brand?
             | 
             | Ding Ding Ding
             | 
             | That's correct!
        
             | nchmy wrote:
             | Also related https://news.ycombinator.com/item?id=43783452
        
         | mountainriver wrote:
         | I greatly prefer redis streams. Not all the same features, but
         | if you just need basic streams, redis has the dead simple
         | implementation I always wanted.
         | 
         | Not to mention you then also have a KV store. Most problems can
         | be solved with redis + Postgres
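          | 
          | The mental model really is small. A toy in-memory rendition of
          | the core stream semantics in Python (the real interface is
          | XADD/XREAD against a Redis server; this just mimics their
          | shape):

```python
import itertools

class MiniStream:
    """Toy model of Redis stream semantics: append-only entries with
    monotonically increasing IDs, read from a given ID onwards."""
    def __init__(self):
        self.entries = []              # list of (id, fields)
        self._seq = itertools.count(1)

    def xadd(self, fields: dict) -> int:
        entry_id = next(self._seq)     # real Redis uses ms-timestamp-seq IDs
        self.entries.append((entry_id, fields))
        return entry_id

    def xread(self, last_id: int = 0):
        return [(i, f) for i, f in self.entries if i > last_id]

s = MiniStream()
s.xadd({"event": "pay", "user": "alice"})
s.xadd({"event": "ship", "item": "book-1"})
print(s.xread(last_id=1))  # [(2, {'event': 'ship', 'item': 'book-1'})]
```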
        
           | ketzo wrote:
           | Actually thinking about building something with Redis streams
           | next week. Any particular advice/sharp edges/etc?
        
             | mountainriver wrote:
             | Honestly no, which was my favorite part. So easy to spin
             | up, "just works" the way you expect. It's night and day
             | from a lot of the other solutions in the ecosystem.
        
             | immibis wrote:
             | Don't, under any circumstances, let it come into contact
             | with an untrusted network. Anyone who can connect to your
             | Redis service gets arbitrary remote code execution.
        
           | Zambyte wrote:
           | > Not to mention you then also have a KV store.
           | 
           | https://docs.nats.io/nats-concepts/jetstream/key-value-store
        
         | raverbashing wrote:
          | Honestly, that website has the lowest density of information
          | per amount of text I've seen on any website.
          | 
          | I had to really dig (outside of that website) to understand
          | even what NATS is and/or does.
          | 
          | It goes too hard on the keyword babbling and too little on the
          | "what does this _actually_ do".
         | 
         | > Services can live anywhere and are easily discoverable -
         | decentralized, zerotrust security
         | 
         | Ok cool, this tells me absolutely nothing. What service? Who to
         | whom? Discovering what?
        
           | nchmy wrote:
           | Agreed. It is amazing though, and actually so simple that you
           | don't need much docs. Their YouTube channel is excellent.
        
         | nchmy wrote:
          | NATS is also in the process of an open-source license rugpull...
         | 
         | https://news.ycombinator.com/item?id=43783452
        
       | nitwit005 wrote:
       | > When producing a record to a topic and then using that record
       | for materializing some derived data view on some downstream data
       | store, there's no way for the producer to know when it will be
       | able to "see" that downstream update. For certain use cases it
       | would be helpful to be able to guarantee that derived data views
       | have been updated when a produce request gets acknowledged,
       | allowing Kafka to act as a log for a true database with strong
       | read-your-own-writes semantics.
       | 
       | Just don't use Kafka.
       | 
       | Write to the downstream datastore directly. Then you know your
       | data is committed and you have a database to query.
        
         | menzoic wrote:
         | Writing directly to the datastore ignores the need for queuing
         | the writes. How do you solve for that need?
        
           | immibis wrote:
           | Why do you need to queue the writes?
        
             | smitty1e wrote:
             | A proficient coder can write a program to accomplish a task
             | in the singular.
             | 
             | In the plural, accomplishing that task in a performant way
             | at enterprise scale seems to involve turning every function
             | call into an asynchronous, queued service of some sort.
             | 
             | Which then begets additional deployment and monitoring
             | services.
             | 
             | A queued problem requires a cute solution, bringing acute
             | pain.
        
             | vergessenmir wrote:
              | some writes might fail, you may need to retry, the data
              | store may be temporarily unavailable, etc.
             | 
             | There may be many things that go wrong and how you handle
             | this depends on your data guarantees and consistency
             | requirements.
             | 
             | If you're not queuing what are you doing when a write
             | fails, throwing away the data?
        
               | bushbaba wrote:
               | Some Kafka writes might fail. Hence the Kafka client
               | having a queue with retries.
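                | 
                | The pattern the client implements internally can be
                | sketched in plain Python (the function and backoff
                | policy here are invented for illustration):

```python
import time

def send_with_retries(queue, send, retries=3, backoff=0.01):
    """Drain a client-side buffer, retrying transient failures per
    message -- roughly what Kafka producers do internally."""
    failed = []
    for msg in queue:
        for attempt in range(retries):
            try:
                send(msg)
                break
            except ConnectionError:
                time.sleep(backoff * 2 ** attempt)  # exponential backoff
        else:
            failed.append(msg)  # retries exhausted; surface to caller
    return failed

# A flaky sink that fails the first delivery attempt of each message.
seen, delivered = set(), []
def flaky_send(msg):
    if msg not in seen:
        seen.add(msg)
        raise ConnectionError("transient broker hiccup")
    delivered.append(msg)

leftover = send_with_retries(["a", "b", "c"], flaky_send)
print(leftover)   # [] -- everything delivered after one retry each
print(delivered)  # ['a', 'b', 'c']
```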
        
               | immibis wrote:
               | As bushbaba points out, the same things may happen with
               | Kafka.
               | 
               | The standard regex joke really works for all values of X.
               | Some people, when confronted with a problem, think "I
               | know - I'll use a queue!" Now they have two problems.
               | 
               | Adding a queue to the system does not make it faster or
               | more reliable. It makes it more asynchronous (because of
               | the queue), slower (because the computer has to do more
               | stuff) and less reliable (because there are more "moving
               | parts"). It's possible that a queue is a required
               | component of a design which is more reliable and faster,
               | but this can only be known about the specific design, not
               | the general case.
               | 
               | I'd start by asking why your data store is randomly
               | failing to write data. I've never encountered Postgres
               | randomly failing to write data. There are certainly
               | conditions that can cause Postgres to fail to write data,
               | but they aren't random and most of them are bad enough to
               | require an on-call engineer to come and fix the system
               | anyway - if the problem could be resolved automatically,
               | it would have been.
               | 
               | If you want to be resilient to events like the database
               | disk being full (maybe it's a separate analytics database
               | that's less important from the main transactional
               | database) then adding a queue (on a separate disk) can
               | make sense, so you can continue having accurate analytics
               | after upgrading the analytics database storage. In this
               | case you're using the queue to create a fault isolation
               | boundary. It just kicks the can down the road though,
               | since if the analytics _queue_ storage fills up, you
               | still have to either drop the analytics or fail the
                | client's request. You have the same problem but now for
               | the queue. Again, it could be a reasonable design, but
               | not by default and you'd have to evaluate the whole
               | design to see whether it's reasonable.
        
         | wvh wrote:
         | The problem is that you don't know who's listening. You don't
         | want all possible interested parties to hammer the database.
         | Hence the events in between. Arguably, I'd not use Kafka to
         | store actual data, just to notify in-flight.
        
           | Spivak wrote:
           | For read only queries, hammer away I can scale reads nigh
           | infinitely horizontally. There's no secret sauce that makes
           | it so that only kafka can do this.
        
             | throwaway7783 wrote:
              | It's not about scaling reads, but coordinating consumers
              | so that no more than one consumer processes the same
              | messages. That requires some kind of locking, which
              | means scaling issues.
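Concretely, Kafka sidesteps that per-message locking by making ownership static: a key hashes to one partition, and each partition is owned by exactly one consumer in a group. A toy sketch of that idea (illustrative only; Kafka's real default partitioner uses murmur2, not a byte sum, and group assignment is negotiated by the broker):

```python
# Sketch of lock-free exclusive consumption via static ownership.
# Every message with a given key lands in one partition, and one
# consumer in the group owns that partition, so no two consumers
# ever need to coordinate over the same message.

NUM_PARTITIONS = 6
consumers = ["c0", "c1", "c2"]

def partition_for(key: str) -> int:
    # stand-in hash; Kafka's default partitioner uses murmur2
    return sum(key.encode()) % NUM_PARTITIONS

def owner_of(partition: int) -> str:
    # simple round-robin partition assignment across the group
    return consumers[partition % len(consumers)]

p = partition_for("order-123")
print(p, owner_of(p))  # deterministic: same key -> same partition -> same consumer
```

The trade-off is exactly the head-of-line blocking discussed elsewhere in this thread: exclusivity is free, but a slow key stalls everything behind it in the same partition.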
        
           | immibis wrote:
           | But you do know who's listening, because you were the one who
           | installed all the listeners. ("you" can be plural.)
           | 
           | This reminds me of the OOP vs DOD debate again. OOP adherents
           | say they don't know all the types of data their code operates
           | on; DOD adherents say they actually do, since their program
           | contains a finite number of classes, and a finite subset of
           | those classes can be the ones called in this particular
           | virtual function call.
           | 
            | What you mean is that your system _is structured in such a
            | way that it's as if_ you don't know who's listening. Which
            | is okay, but you should be explicit that it's a design
            | _choice_, not a law of physics, so when that design choice
            | no longer serves you well, you have the right to change it.
           | 
           | (Sometimes you really don't know, because your code is a
           | library or has a plugin system. In such cases, this doesn't
           | apply.)
           | 
           | > Arguably, I'd not use Kafka to store actual data, just to
           | notify in-flight.
           | 
           | I believe people did this initially and then discovered the
           | non-Kafka copy of the data is redundant, so got rid of it, or
           | relegated it to the status of a cache. This type of design is
           | called Event Sourcing.
        
           | mike_hearn wrote:
           | In some databases that's not a problem. Oracle has a built in
           | horizontally scalable message queue engine that's
           | transactional with the rest of the database. You can register
           | a series of SELECT queries and be notified when the results
           | have (probably) changed, either via direct TCP server push or
           | via a queued message for pickup later. It's not polling
           | based, the transaction engine knows what query predicates to
           | keep an eye out for.
           | 
           | Disclosure: I work part time for Oracle Labs and know about
           | these features because I'm using them in a project at the
           | moment.
        
             | dheera wrote:
             | I know as an Oracle employee you don't want to hear this,
             | but part of the problem is that you are no longer database-
             | agnostic if you do this.
             | 
             | The messaging tech being separate from the database tech
             | means the architects can swap out the database if needed in
             | the future without needing to rewrite the producers and
             | consumers.
        
               | mike_hearn wrote:
               | I don't work on the database itself, so it's neither here
               | nor there to me. Still, the benefits of targeting the LCD
               | must be weighed against the costs. Not having scalable
               | transactions imposes a huge drain on the engineering org
               | that sucks up time and imposes opportunity costs.
        
         | thom wrote:
         | Of course, if you don't have separate downstream and upstream
         | datastores, you don't have anything to do in the first place.
        
         | rakoo wrote:
          | Alternatively, your write doesn't have to be fire-and-forget:
          | downstream datastores can also write to Kafka (this time
          | fire-and-forget) and the initial client can wait for that
          | event to acknowledge the initial write.
        
         | salomonk_mur wrote:
         | Yeah... Not happening when you have scores of clients running
         | down your database.
         | 
          | The reason message queue systems exist is scale. Good luck
          | sending a notification at 9am to your 3 million users and
          | keeping your database alive under the sudden influx of
          | activity. You need to queue that load.
        
           | nitwit005 wrote:
           | Kafka is itself a database. Sending a message requires what
           | is essentially a database insert. You're still doing a DB
           | commit either way.
        
             | briankelly wrote:
             | It's more of a commit log/write-ahead log/replication
             | stream than a DBMS - consider that DBMSs typically include
             | these in addition to their primary storage.
        
       | YetAnotherNick wrote:
        | I wish there were a global file system with node-local disks
        | and rule-driven affinity of data to nodes. We have two
        | extremes: EFS or S3 Express, which have no affinity to the
        | processing system, and what Kafka etc. do, where tightly
        | integrated placement logic makes the systems more complicated.
        
         | XorNot wrote:
         | I might be misunderstanding but isn't this like, literally what
         | GlusterFS used to do?
         | 
          | Like, I distinctly recall running it at home as a global
          | unioning filesystem where content had to fully fit on the
          | specific device it was targeted at (rather than being
          | striped or whatever).
        
       | imcritic wrote:
       | Since we are dreaming - add ETL there as well!
        
       | vermon wrote:
       | Interesting, if partitioning is not a useful concept of Kafka,
       | what are some of the better alternatives for controlling consumer
       | concurrency?
        
         | galeaspablo wrote:
         | It is useful, but it is not generally applicable.
         | 
         | Given an arbitrary causality graph between n messages, it would
         | be ideal if you could consume your messages in topological
         | order. And that you could do so in O(n log n).
         | 
         | No queuing system in the world does arbitrary causality graphs
         | without O(n^2) costs. I dream of the day where this changes.
         | 
         | And because of this, we've adapted our message causality
         | topologies to cope with the consuming mechanisms of Kafka et al
         | 
         | To make this less abstract, imagine you have two bank accounts,
         | each with a stream. MoneyOut in Bob's account should come
         | BEFORE MoneyIn when he transfers to Alice's account, despite
         | each bank account having different partition keys.
        
           | dpflan wrote:
           | Can you elaborate on how you have "adapted...message
           | causality topologies to cope with consuming mechanisms" in
           | relation to the example of a bank account? The causality
            | topology being what here? Couldn't one say MoneyIn should
            | come first, else there can be no true MoneyOut?
        
             | galeaspablo wrote:
             | Right on, great question. Some examples:
             | 
             | Example Option 1
             | 
             | You give up on the guarantees across partition keys (bank
             | accounts), and you accept that balances will not reflect a
             | causally consistent state of the past.
             | 
             | E.g., Bob deposits 100, Bob sends 50 to Alice.
             | 
              | Balances:
              | 
              |   Bob 0, Alice 50    # the source system was never in this state
              |   Bob 100, Alice 50  # the source system was never in this state
              |   Bob 50, Alice 50   # eventually consistent final state
             | 
             | Example Option 2
             | 
             | You give up on parallelism, and consume in total order
             | (i.e., one single partition / unit of parallelism - e.g.,
             | in Kafka set a partitioner that always hashes to the same
             | value).
             | 
             | Example Option 3
             | 
             | In the consumer you "wait" whenever you get a message that
             | violates causal order.
             | 
              | E.g., Bob deposits 100; Bob sends 50 to Alice (Bob-MoneyOut
             | 50 -> Alice-MoneyIn 50).
             | 
             | If we attempt to consume Alice-MoneyIn before Bob-MoneyOut,
             | we exponentially back off from the partition containing
             | Alice-MoneyIn.
             | 
             | (Option 3 is terrible because of O(n^2) processing times in
             | the worst case and the possibility for deadlocks (two
             | partitions are waiting for one another))
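Option 1's anomaly can be shown in a few lines of Python (my own illustration using the account names and amounts from the example above; this is a simulation, not Kafka code). Each account's stream is consumed independently, so the observed global state can be one the source system was never in:

```python
# Two per-account event streams consumed at independent rates.
# `progress` says how many events of each stream have been applied.

def replay(events, progress):
    """Apply the first progress[account] events of each account's stream."""
    balances = {account: 0 for account in events}
    for account, stream in events.items():
        for delta in stream[:progress[account]]:
            balances[account] += delta
    return balances

events = {
    "bob":   [+100, -50],  # deposit 100, then send 50 to Alice
    "alice": [+50],        # receive 50 from Bob
}

# Alice's partition is consumed ahead of Bob's:
print(replay(events, {"bob": 0, "alice": 1}))
# {'bob': 0, 'alice': 50} -- the source was never in this state

# Eventually consistent final state once both streams catch up:
print(replay(events, {"bob": 2, "alice": 1}))
# {'bob': 50, 'alice': 50}
```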
        
               | dpflan wrote:
                | Thanks. Given these examples of how messages appear in
                | time and in _physical_ location in Kafka, how have you
                | adapted your consumers? Which scenario / architectural
                | decision (one of the examples?) did you move forward
                | with to yield your desired causality handling?
        
       | Mistletoe wrote:
       | I know it's not what the article is about but I really wish we
       | could rebuild Franz Kafka and hear what he thought about the tech
       | dystopia we are in.
       | 
       | >I cannot make you understand. I cannot make anyone understand
       | what is happening inside me. I cannot even explain it to myself.
       | -Franz Kafka, The Metamorphosis
        
         | tannhaeuser wrote:
         | They named it like that for a reason ;)
        
         | gherkinnn wrote:
         | Both der Process and das Schloss apply remarkably well to our
         | times. No extra steps necessary.
        
       | elvircrn wrote:
       | Surprised there's no mention of Redpanda here.
        
         | axegon_ wrote:
         | Having used both Kafka and Redpanda on several occasions, I'd
         | pick Redpanda any day of the week without a second thought.
          | Easier to set up, easier to maintain, a lot less finicky,
          | and it uses a fraction of the resources.
        
           | enether wrote:
           | In what way is it materially easier to maintain and less
           | finicky? I read a lot about this but I haven't seen a concise
           | bullet point list of why, which leads me to naturally
            | distrust such claims. Ditto for the resources - Kafka is
            | usually bottlenecked on disk/network, and whether it's C++
            | or not doesn't solve that.
        
             | SkipperCat wrote:
             | I've been using Redpanda for a few years now and here's
             | what I've noticed.
             | 
              | It's a compiled binary - no JVM to manage. Java apps have
              | always been a headache for me. Plus, no ZooKeeper - one
              | less thing to break.
              | 
              | The biggest benefit I've seen is the performance. Redpanda
              | just outperforms Apache Kafka on similar hardware. It's
              | also Kafka-compliant in every way I've noticed, so all my
              | favorite tools that interact with Kafka work the same
              | with Redpanda.
              | 
              | Redpanda, like Kafka, writes to disk, so you'll always be
              | limited by your hardware no matter what you use (but NVMe
              | drives are fast and affordable).
              | 
              | YMMV, but it's been a good experience for me.
        
           | dangoodmanUT wrote:
           | same
        
         | EdwardDiego wrote:
         | Redpanda isn't FOSS. Apache Kafka is.
        
           | SkipperCat wrote:
           | Redpanda community edition operates under the BSL license -
           | https://www.redpanda.com/blog/open-source
           | 
           | IANAL, but it looks pretty open source to me.
        
       | fintler wrote:
       | Keep an eye out for Northguard. It's the name of LinkedIn's
       | rewrite of Kafka that was announced at a stream processing meetup
       | about a week ago.
        
         | selkin wrote:
          | It solves some issues and creates others, since Northguard
          | isn't compatible with the current Kafka ecosystem.
         | 
         | As such, you can no longer use existing software that is built
         | on Kafka as-is. It may not be a grave concern for LinkedIn, but
         | it could be for others that currently benefit from using the
         | existing Kafka ecosystem.
        
           | fintler wrote:
           | Yeah, it's definitely a significant shift. The Xinfra
           | component helps with Kafka compatibility, but that still has
           | quite a bit of complexity to it. Also, it's written in C++,
           | so that requires a different mindset to operate.
        
             | selkin wrote:
             | I understood Xinfra to not use the Kafka protocol as its
             | API.
             | 
             | As such, even with Xinfra deployed, you have to rewrite all
             | the software that connects to Kafka, regardless of
             | programming language.
        
       | ghuntley wrote:
       | I'm going to get downvoted for this, but you can literally
       | rebuild Kafka via AI right now in record time using the steps
       | detailed at https://ghuntley.com/z80.
       | 
       | I'm currently building a full workload scheduler/orchestrator.
       | I'm sick of Kubernetes. The world needs better ->
       | https://x.com/GeoffreyHuntley/status/1915677858867105862
        
         | EdwardDiego wrote:
         | > but you can literally rebuild Kafka via AI right now in
         | record time
         | 
         | Go on then. Post the repo when you have :)
        
       | lewdwig wrote:
        | Ah, the siren call of the ground-up rewrite. I hadn't realized
        | how deeply the assumption of hard disks underpinning
        | everything is baked into Kafka's design.
       | 
       | But don't public cloud providers already all have cloud-native
       | event sourcing? If that's what you need, just use that instead of
       | Kafka.
        
       | frklem wrote:
       | "Faced with such a marked defensive negative attitude on the part
       | of a biased culture, men who have knowledge of technical objects
       | and appreciate their significance try to justify their judgment
       | by giving to the technical object the only status that today has
       | any stability apart from that granted to aesthetic objects, the
       | status of something sacred. This, of course, gives rise to an
       | intemperate technicism that is nothing other than idolatry of the
       | machine and, through such idolatry, by way of identification, it
       | leads to a technocratic yearning for unconditional power. The
       | desire for power confirms the machine as a way to supremacy and
       | makes of it the modern philtre (love-potion)." Gilbert Simondon,
       | On the mode of existence of technical objects.
       | 
        | This is exactly what I interpret from these kinds of articles:
       | engineering just for the cause of engineering. I am not saying we
       | should not investigate on how to improve our engineered
       | artifacts, or that we should not improve them. But I see a
       | generalized lack of reflection on why we should do it, and I
       | think it is related to a detachment from the domains we create
        | software for. The article suggests uses of the technology that
        | come from such different ways of using it that it loses
        | coherence as a technical item.
        
         | gunnarmorling wrote:
         | For each of the items discussed I explicitly mention why they
         | would be desirable to have. How is this engineering for the
         | sake of engineering?
        
           | frklem wrote:
           | True, for each of the points discussed, there is an explicit
           | mention on why it is desirable. But those are technical
           | solutions, to technical problems. There is nothing wrong with
           | that. The issue is, that the whole article is about
           | technicalities _because of_ technicalities, hence the
           | 'engineering for the cause of engineering' (which is
           | different from '.. for the sake of...'). It is at this point
           | that the 'idea of rebuilding Kafka' becomes a pure technical
           | one, detached from the intention of having something _like_
            | Kafka. Other commenters in the thread also pointed out
            | that Kafka does not have a clear intention. I agree that a
            | lot of software nowadays suffers from the same problem.
        
       | supermatt wrote:
       | > "Do away with partitions"
       | 
       | > "Key-level streams (... of events)"
       | 
        | When you are leaning on the storage backend for physical
        | partitioning (as per the cloud example, where they would
        | literally partition based on keys), doesn't this effectively
        | just boil down to renaming partitions to keys, and keys to
        | events?
        
         | gunnarmorling wrote:
         | That's one way to look at this, yes. The difference being that
         | keys actually have a meaning to clients (as providers of
         | ordering and also a failure domain), whereas partitions in
         | their current form don't.
        
       | olavgg wrote:
        | How many of the Apache Kafka issues are addressed by switching to
       | Apache Pulsar?
       | 
       | I skipped learning Kafka, and jumped right into Pulsar. It works
       | great for our use case. No complaints. But I wonder why so few
       | use it?
        
         | EdwardDiego wrote:
         | Some, but then Pulsar brings its own issues.
        
         | enether wrote:
          | There are inherently a lot of path-dependent network effects
          | in open source software.
         | 
         | Just because something is 10-30% better in certain cases almost
         | never warrants its adoption, if on the other side you get much
         | less human expertise, documentation/resources and battle tested
         | testimonies.
         | 
         | This, imo, is the story of most Kafka competitors
        
         | jsumrall wrote:
         | I've been down this path, and if my experience is more common,
         | then it really boils down to the classic "Nobody gets fired for
         | buying IBM", and here IBM -> Confluent.
         | 
         | StreamNative seems like an excellent team, and I hope they
          | succeed. But as another comment has written, something
          | (Pulsar) being better (than Kafka) has to either be adopted
          | from the start, or be a big enough improvement to justify a
          | change -- and as difficult and feature-poor as Kafka is, it
          | still gets the job done.
         | 
          | I can rant longer about this topic, but Pulsar _should_ be
          | more popular; unfortunately, Confluent has dominated here
          | and is rent-seeking this field into the ground.
        
       | mgaunard wrote:
       | I can't count the number of bad message queues and buses I've
       | seen in my career.
       | 
       | While it would be useful to just blame Kafka for being bad
       | technology it seems many other people get it wrong, too.
        
         | mgaunard wrote:
         | s/useful/easy/
         | 
         | (can't edit)
        
       | 0x445442 wrote:
       | How about logging the logs so I can shell into the server to
       | search the messages.
        
         | EdwardDiego wrote:
         | What messages? They're just blobs of bytes.
        
         | jjk7 wrote:
         | You can do this with Confluent
        
       | tyingq wrote:
       | He mentions Automq right in the opener. And if I follow the link,
       | they pitch it in a way that sounds very "too good to be true".
       | 
       | Anyone here have some real world experience with it?
        
       | galeaspablo wrote:
        | Agreed. The head-of-line problem is worth solving for certain
        | use cases.
       | 
       | But today, all streaming systems (or workarounds) with per
       | message key acknowledgements incur O(n^2) costs in either
       | computation, bandwidth, or storage per n messages. This applies
       | to Pulsar for example, which is often used for this feature.
       | 
       | Now, now, this degenerate time/space complexity might not show up
       | every day, but when it does, you're toast, and you have to wait
       | it out.
       | 
       | My colleagues and I have studied this problem in depth for years,
       | and our conclusion is that a fundamental architectural change is
       | needed to support scalable per message key acknowledgements.
        | Furthermore, the architecture will fundamentally require a
        | sorted index, meaning that any such queuing / streaming system
        | will process n messages in O(n log n).
       | 
       | We've wanted to blog about this for a while, but never found the
       | time. I hope this comment helps out if you're thinking of relying
       | on per message key acknowledgments; you should expect sporadic
       | outages / delays.
        
         | cogman10 wrote:
          | > streaming system will process n messages in O(n log n)
         | 
         | I'm guessing this is mostly around how backed up the stream is.
         | n isn't the total number of messages but rather the current
         | number of unacked messages.
         | 
         | Would a radix structure work better here? If you throw
         | something like a UUID7 on the messages and store them in a
          | radix structure, you should be able to get O(n) performance
          | here, correct? Or am I not understanding the problem well?
        
           | bjornsing wrote:
           | I think the problem is that if you want quick access to all
           | messages with a particular key then you have to maintain some
           | kind of index over all persisted messages. So n would be
           | total number of persisted messages as I read it, which can be
           | quite large. But even storing them in the first place is
           | O(n), so O(n log n) might not be so bad.
        
             | galeaspablo wrote:
             | That's correct. And keep in mind that you might have new
             | consumers starting from the beginning come into play, so
             | you have to permanently retain the indexes.
             | 
              | And yes, O(n log n) is not bad at all. Sorted database
             | indexes (whether SQL, NoSQL, or AcmeVendorSQL, etc.)
             | already take O(n log n) to insert n elements into data
             | storage or to read n elements from data storage.
        
         | singron wrote:
         | Check out the parallel consumer:
         | https://github.com/confluentinc/parallel-consumer
         | 
         | It processes unrelated keys in parallel within a partition. It
         | has to track what offsets have been processed between the last
         | committed offset of the partition and the tip (i.e. only what's
            | currently processed out of order). When it commits, it
            | saves this state, highly compressed, in the commit
            | metadata.
         | 
         | Most of the time, it was only processing a small number of
         | records out of order so this bookkeeping was insignificant, but
         | if one key gets stuck, it would scale to at least 100,000
         | offsets ahead, at which point enough alarms would go off that
            | we would do something. That's definitely a huge
            | improvement over head-of-line blocking.
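The bookkeeping described above can be sketched in a few lines (my own illustration, not the parallel consumer's actual code): records complete out of order, only the highest contiguous prefix of offsets is committable, and everything completed beyond that frontier must be remembered individually:

```python
# Track out-of-order completions; commit only the contiguous prefix.

class OffsetTracker:
    def __init__(self):
        self.committed = -1  # last offset safe to commit
        self.done = set()    # completed offsets beyond `committed`

    def complete(self, offset):
        self.done.add(offset)
        # advance the contiguous frontier as far as possible
        while self.committed + 1 in self.done:
            self.committed += 1
            self.done.remove(self.committed)

    def in_flight_gap(self):
        """Completed-but-uncommittable offsets we must keep state for."""
        return len(self.done)

t = OffsetTracker()
for off in [1, 2, 3]:  # offset 0 is stuck on a slow key
    t.complete(off)
print(t.committed, t.in_flight_gap())  # -1 3  (nothing committable yet)

t.complete(0)          # the stuck record finally finishes
print(t.committed, t.in_flight_gap())  # 3 0
```

This is why one stuck key scaling to ~100,000 offsets ahead is the alarm condition: the `done` set (what the parallel consumer compresses into commit metadata) grows without bound until the head record completes.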
        
           | galeaspablo wrote:
           | Disclosure (given this is from Confluent): I'm ex MSK
           | (Managed Streaming for Kafka at AWS) and my current company
           | was competing with Confluent before we pivoted.
           | 
           | Yup, this is one more example, just like Pulsar. There are
           | definitely great optimizations to be made on the average
           | case. In the case of parallel consumer, if you'd like to keep
           | ordering guarantees, you retain O(n^2) processing time in the
           | worst case.
           | 
           | The issues arise when you try to traverse arbitrary
           | dependency topologies in your messages. So you're left with
           | two options:
           | 
            | 1. Make damn sure that causal dependencies don't exhibit
            | O(n^2) behavior, which requires formal models to be 100%
            | sure.
            | 
            | 2. Give up ordering or make some other nasty tradeoff.
           | 
           | At a high level the problem boils down to traversing a DAG in
           | topological order. From computer science theory, we know that
           | this requires a sorted index. And if you're implementing an
           | index on top of Kafka, you might as well embed your data into
           | and consume directly from the index. Of course, this is
           | easier said than done, and that's why no one has cracked this
           | problem yet. We were going to try, but alas we pivoted :)
           | 
            | Edit: Topological sort does not require a sorted index (or
           | similar) if you don't care about concurrency. But then you've
           | lost the advantages of your queue.
        
             | taurath wrote:
             | > traverse arbitrary dependency topologies
             | 
             | Is there another way to state this? It's very difficult for
             | me to grok.
             | 
             | > DAG
             | 
             | Directed acyclic graph right?
        
               | galeaspablo wrote:
               | Apologies, we've been so deep into this problem that we
               | take our slang for granted :)
               | 
                | A graphical representation might be worth a thousand
                | words, keeping in mind it's just one example. Imagine
                | you're traversing the following (each "v" is a causal
                | edge between rows):
                | 
                |   A1 -> A2 -> A3...
                |   |
                |   v
                |   B1 -> B2 -> B3...
                |   |
                |   v
                |   C1 -> C2 -> C3...
                |   |
                |   v
                |   D1 -> D2 -> D3...
                |   |
                |   v
                |   E1 -> E2 -> E3...
                |   |
                |   v
                |   F1 -> F2 -> F3...
                |   ...
               | 
               | Efficient concurrent consumption of these messages (while
               | respecting causal dependency) would take O(w + h), where
               | w = the _width_ (left to right) of the longest sequence,
               | and h = the _height_ (top to bottom of the first column)
               | 
                | But Pulsar, Kafka + parallel consumer, et al. would
                | take O(n^2) either in processing time or in space
                | complexity. This is because at a fundamental level,
                | the underlying data storage looks like this:
                | 
                |   A1 -> A2 -> A3...
                |   B1 -> B2 -> B3...
                |   C1 -> C2 -> C3...
                |   D1 -> D2 -> D3...
                |   E1 -> E2 -> E3...
                |   F1 -> F2 -> F3...
               | 
               | Notice that the underlying data storage loses information
               | about nodes with multiple children (e.g., A1 previously
               | parented both A2 and B1)
               | 
               | If we want to respect order, the consumer will be
               | responsible for declining to process messages that don't
               | respect causal order. E.g., attempting to process F1
               | before E1. Thus we could get into a situation where we
               | try to process F1, then E1, then D1, then C1, then B1,
                | then A1. Now that A1 is processed, Kafka tries again, but
               | it tries F1, then E1, then D1, then C1, then B1... And so
               | on and so forth. This is O(n^2) behavior.
               | 
               | Without changing the underlying data storage
               | architecture, you will either:
               | 
               | 1. Incur O(n^2) space or time complexity
               | 
               | 2. Reimplement the queuing mechanism at the consumer
               | level, but then you might as well not even use Kafka (or
               | others) at all. In practice this is not practical (my
               | evidence being that no one has pulled it off).
               | 
               | 3. Face other nasty issues (e.g., in Kafka parallel
               | consumer you can run out of memory or your processing
               | time can become O(n^2)).
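The retry blow-up described above can be modeled directly (a toy simulation I wrote for illustration, not Kafka code): n partition heads arrive in reverse causal order, and a consumer that declines out-of-order messages re-polls the remainder after each success, giving n(n+1)/2 total attempts:

```python
# Model: pending[i] causally depends on pending[i+1] (reverse arrival
# order, e.g. F1, E1, ..., A1). A message is declined until its parent
# has been processed, and declined messages are re-polled each round.

def consume_reverse_chain(n):
    pending = list(range(n, 0, -1))  # heads in reverse causal order
    processed = set()
    attempts = 0
    while pending:
        still_blocked = []
        for msg in pending:
            attempts += 1
            # msg k can only be processed after its parent k-1
            if msg - 1 == 0 or msg - 1 in processed:
                processed.add(msg)
            else:
                still_blocked.append(msg)
        pending = still_blocked
    return attempts

print(consume_reverse_chain(6))    # 21 == 6*7/2
print(consume_reverse_chain(100))  # 5050 -- quadratic growth
```

Each round retires exactly one message, so round k re-attempts n-k+1 heads; summing gives the O(n^2) behavior the comment describes.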
        
               | singron wrote:
               | Do you have an example use case for this? This does seem
                | like something unsuited to Kafka, but I'm having a hard
               | time imagining why you would structure something like
               | this.
        
           | Misdicorl wrote:
           | I suppose it depends on your message volume. To me,
           | processing 100k messages and then getting a page however long
           | later as the broker (or whatever) falls apart sounds much
           | worse than head of line blocking and seeing the problem
           | directly in my consumer. If I need to not do head of line
           | blocking, I can build whatever failsafe mechanisms I need for
           | the problematic data and defer to some other queueing system
           | (typically, just add an attempt counter and replay the
           | message to the same kafka topic and then if attempts > X,
           | send it off to wherever)
           | 
           | I'd rather debug a worker problem than an infra scaling
           | problem every day of the week and twice on Sundays.
        
             | Misdicorl wrote:
              | Follow on: If you're using Kafka to publish messages to
              | _multiple_ consumers, this is even worse, as now you're
              | infecting every consumer with data processing issues
              | from every other consumer. Bad juju.
        
             | singron wrote:
             | It's interesting you say that, since this turned an infra
             | scaling problem into a worker problem for us. Previously,
             | we would get terrible head-of-line throughput issues, so we
             | would use an egregious number of partitions to try to
             | alleviate that. Lots of partitions is hard to manage since
             | resizing topics is operationally tedious and it puts a lot
             | of strain on brokers. But no matter how many partitions you
             | have, the head-of-line still blocks. Even cases where
             | certain keys had slightly slower throughput would clog up
             | the whole partition with normal consumers.
             | 
             | The parallel consumer nearly entirely solved this problem.
             | Only the most egregious cases where keys were ~3000 times
             | slower than other keys would cause an issue, and then you
             | could solve it by disabling that key for a while.
        
               | Misdicorl wrote:
               | Yeah I'd say kafka is not a great technology if your
               | median and 99ths (or 999ths if volume is large enough)
               | are wildly different which sounds like your situation. I
               | use kafka in contexts where 99ths going awry usually
               | aren't key dependent so I don't have the issues you see.
               | 
               | I tend to prefer other queueing mechanisms in those
               | cases, although I still work hard to make 99ths and
               | medians align as it can still cause issues (especially
               | for monitoring)
        
       | rvz wrote:
       | Every time another startup falls for the Java + Kafka arguments,
       | it keeps the AWS consultants happier.
       | 
       | Fast forward into 2025, there are many performant, efficient and
       | less complex alternatives to Kafka that save you money, instead
       | of burning millions in operational costs "to scale".
       | 
       | Unless you are at a hundred million dollar revenue company,
        | choosing Kafka in 2025 doesn't make sense anymore.
        
         | mrkeen wrote:
         | When I pitched Kafka to my backend team in 2018, I got pushback
         | from the ops team who wanted me to use something AWS-native
         | instead, i.e. Kinesis. A big part of why I doubled-down on
         | Kafka is because it's vendor neutral and I can just run the
         | actual thing locally.
        
         | EdwardDiego wrote:
         | Wut.
         | 
         | Kafka shouldn't be used for low dataflow systems, true, but you
         | can scale a long way with a simple 3 node cluster.
        
         | enether wrote:
         | Can you share three? Without concrete suggestions, this is just
         | a disparaging narrative (that a lot of vendors use)
        
       | bionhoward wrote:
       | step 1: don't use the JVM
        
         | nthingtohide wrote:
         | do you think Grafana made the right choice by using Golang ?
        
         | openplatypus wrote:
         | Don't use best platform engineered and perfected over decades?
         | 
         | Why?
        
         | Sohcahtoa82 wrote:
         | It's not the 90s anymore. The JVM is very performant.
        
       | dangoodmanUT wrote:
       | I think this is missing a key point about partitions: Write
       | visibility ordering
       | 
       | The problem with guaranteed order is that you have to have some
       | agreed upon counter/clock for ordering, otherwise a slow write
       | from one producer to S3 could result in consumers already passing
       | that offset before it was written, thus the write is effectively
       | lost unless the consumers wind-back.
       | 
       | Having partitions means we can assign a dedicated writer for that
       | partition that guarantees that writes are in order. With
       | s3-direct writing, you lose that benefit, even with a
       | timestamp/offset oracle. You'd need some metadata system that can
       | do serializable isolation to guarantee that segments are written
       | (visible to consumers) in the order of their timestamps. Doing
       | that transactional system directly against S3 would be super slow
       | (and you still need either bounded-error clocks, or a timestamp
       | oracle service).
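The visibility-ordering requirement in the comment above can be illustrated with a toy commit point (hypothetical names; a real design needs a serializable metadata store, not in-process state):

```python
class SegmentIndex:
    """Toy metadata commit point that makes segments visible only in
    timestamp order. A slow writer whose timestamp is at or below the
    high-water mark is rejected, because consumers may already have
    read past that point; it must retry with a fresh timestamp."""
    def __init__(self):
        self.high_water_mark = -1

    def try_publish(self, segment_ts: int) -> bool:
        if segment_ts <= self.high_water_mark:
            return False  # late write: effectively lost unless retried
        self.high_water_mark = segment_ts
        return True
```

This is exactly the serializable-isolation check the comment says would be slow to run directly against S3, and why a dedicated per-partition writer sidesteps the problem entirely.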
        
       | iwontberude wrote:
       | We could (and will repeatedly) rebuild Kafka from scratch. Solved
       | the question and I didn't even need to read the article.
        
       | bjornsing wrote:
       | > Key-centric access: instead of partition-based access,
       | efficient access and replay of all the messages with one and the
       | same key would be desirable.
       | 
       | I've been working on a datastore that's perfect for this [1], but
       | I'm getting very little traction. Does anyone have any ideas why
       | that is? Is my marketing just bad, or is this feature just not
       | very useful after all?
       | 
       | 1. https://www.haystackdb.dev/
        
         | galeaspablo wrote:
         | Some input from previously working on a superset of this
         | problem. And being in a similar position.
         | 
          | Mature projects have too much bureaucracy, and even spending
         | time talking to you = opportunity cost. So making a case for
         | why you're going to solve a problem for them is tough.
         | 
         | New projects (whether at big companies or small companies) have
         | 20 other things to worry about, so the problem isn't big
         | enough.
         | 
         | I wrote about this in our blog if you're curious:
         | https://ambar.cloud/blog/a-new-path-for-ambar
        
           | bjornsing wrote:
           | Thanks. Interesting read, and an interesting product /
           | service. Have been thinking about the same approach myself...
        
         | notfed wrote:
         | The website seems very vapid. Extraordinary claims require
         | extraordinary evidence. Personally I see a lack of evidence
         | here (that this vendor-locked product is better than existing
         | freeware) and I'm going to move on.
        
           | bjornsing wrote:
           | Thanks for the honest assessment!
        
         | tinix wrote:
         | what can you do that redis can't?
         | 
         | I'm also skeptical of the graph on your front page that claims
         | S3 cost as much as DynamoDB.
         | 
         | that alone makes it look like total nonsense.
         | 
         | as someone else said, extraordinary claims require
         | extraordinary evidence.
        
           | bjornsing wrote:
           | > what can you do that redis can't?
           | 
           | Keep the data in S3 for 0.023 USD per GB-month. If you have a
           | billion keys that can be useful.
           | 
           | > I'm also skeptical of the graph on your front page that
           | claims S3 cost as much as DynamoDB.
           | 
           | Good point. Could have put a bit more work into that.
        
         | thesimon wrote:
         | > HaystackDB is accessed through a RESTful HTTPS API. No client
         | library necessary.
         | 
          | That's cool, but I would prefer to not reinvent the wheel.
         | If you have a simple library, that would already be useful.
         | 
         | Some simple code or request examples would be convenient as
         | well. I really don't know how easy or difficult your interface
         | design is. It would be cool to see the API docs.
        
           | bjornsing wrote:
           | Yeah, it's a bit of a chicken and egg problem. Since I don't
           | have a way to find potential customers I feel it's too risky
           | investing in stuff like client libraries and good API docs.
           | But I can definitely understand you'd like to see more.
        
       | eluusive wrote:
       | This is basically NATS.io
        
       | derefr wrote:
       | > You either want to have global ordering of all messages on a
       | given topic, or (more commonly) ordering of all messages with the
       | same key. In contrast, defined ordering of otherwise unrelated
       | messages whose key happens to yield the same partition after
       | hashing isn't that valuable, so there's not much point in
       | exposing partitions as a concept to users.
       | 
       | The user gets global ordering when
       | 
       | 1. you-the-MQ assign both messages _and partitions_ stable +
       | unique + order + user-exposed identifiers;
       | 
       | 2. the user constructs a "globally-collatable ID" from the
       | (perPartitionMsgSequenceNumber, partitionID) tuple;
       | 
       | 3. the user does a client-side streaming merge-sort of messages
       | received by the partitions, sorting by this collation ID. (Where
       | even in an ACK-on-receive design, messages don't actually get
       | ACKed until they exit the client-side per-partition sort buffer
       | and enter the linearized stream.)
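Step 3 above is essentially a k-way streaming merge keyed on the collation ID from step 2. A minimal sketch (assuming each per-partition stream yields `(seq, partition_id, payload)` tuples already sorted by `seq`, which holds because sequence numbers increase within a partition):

```python
import heapq

def linearize(*partition_streams):
    """Client-side streaming merge-sort over per-partition message
    streams, popping the globally smallest (seq, partition_id)
    collation ID next. heapq.merge requires each input to be sorted
    by the merge key, which the per-partition streams are."""
    yield from heapq.merge(*partition_streams,
                           key=lambda msg: (msg[0], msg[1]))
```

This sketch omits the ACK bookkeeping the comment mentions (holding off ACKs until a message exits the sort buffer), but shows why the merge stays cheap: only one buffered head message per partition is needed at a time.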
       | 
       | The definition of "exposed to users" is a bit interesting here,
       | as you _might_ think you could do this merge-sort on the backend,
       | just exposing a pre-linearized stream to the client.
       | 
       | But one of the key points/benefits of Kafka-like systems, under
       | high throughput load (which is their domain of comparative
       | advantage, and so should be assumed to be the deployed use-case),
       | is that you can parallelize consumption cheaply, by just
       | assigning your consumer-workers partitions of the topic to
       | consume.
       | 
       | And this _still works_ under global ordering, under some
       | provisos:
       | 
       | * your workload can be structured as a map/reduce, and you don't
       | need global ordering for the _map_ step, only the _reduce_ step;
       | 
       | * it's not impractical for you to materialize+embed the original
       | intended input collation-ordering into the transform workers'
       | output (because otherwise it will be lost in all but very
       | specific situations.)
       | 
       | Plenty of systems fit these constraints, and happily rely on
       | doing this kind of post-linearized map/reduce parallelized Kafka
       | partition consumption.
       | 
       | And if you "hide" this on an API level, this parallelization
       | becomes impossible.
       | 
       | Note, however, that "on an API level" bit. This is only a problem
       | insofar as your system design is _protocol-centered_ , with the
       | expectation of "cheap, easy" third-party client implementations.
       | 
       | If your MQ is not just a backend, but _also_ a fat client SDK
       | library -- then you can put the partition-collation into the fat
       | client, and it will still end up being  "transparent" to the
       | user. (Save for the user possibly wondering why the client
       | library opens O(K) TCP connections to the broker to consume
       | certain topics under certain configurations.)
       | 
       | See also: why Google's Colossus has a fat client SDK library.
        
       | debadyutirc wrote:
       | This is a question we asked 6 years ago.
       | 
        | What if we wrote it in Rust and leveraged WASM?
       | 
       | We have been at it for the past 6 years.
       | https://github.com/infinyon/fluvio
       | 
       | For the past 2 years we have also been building Flink using Rust
       | and WASM. https://github.com/infinyon/stateful-dataflow-examples/
        
         | gregfurman wrote:
         | Fluvio looks awesome!
         | 
         | Any chance you're going to be reviving support for the Kafka
         | wire protocol?
         | 
         | https://github.com/infinyon/fluvio/issues/4259
        
         | Micoloth wrote:
         | Interesting!
         | 
         | How would you say your project compares to Arroyo?
        
       | oulipo wrote:
       | There are a few interesting projects to replace Kafka: Redpanda /
       | Pulsar / AutoMQ
       | 
       | have some of you some experience with those and able to give
       | pros/cons?
        
         | SkipperCat wrote:
         | I've used Redpanda and I like it. Its Kafka compliant so all
         | the tools that work with Apache Kafka also work with Redpanda.
         | Plus, no JVM and no zookeeper.
         | 
          | Redpanda most importantly is faster than Apache Kafka. We were
         | able to get a lot more throughput. Its also stable, especially
         | compared to dealing with anything that requires a JVM.
        
       | gitroom wrote:
       | honestly, kafka always felt like way more moving parts than my
       | brain wants to track, but at the same time, its kinda impressive
       | how the ecosystem just keeps growing - you think the reason
       | people stick to it is hype, laziness, or just not enough real
       | pain yet to switch?
        
       | selkin wrote:
       | This is a useful Gedankenexperiment, but I think the replies
       | suggesting that the conclusion is that we should replace Kafka
       | with something new are quiet about what seems obvious to me:
       | 
       | Kafka's biggest strength is the wide and useful ecosystem built
       | on top of it.
       | 
        | It is also a weakness, as we have to keep some (but not all of)
       | the design decisions we wouldn't have made had we started from
       | scratch today. Or we could drop backwards compatibility, at the
       | cost of having to recreate the ecosystem we already have.
        
       | tezza wrote:
       | Now that really would be _Kafka-esque_
        
         | rhet0rica wrote:
         | _One morning, when Hacker News woke from troubled dreams, it
         | found itself transformed in its bed into a horrible bikeshed._
        
       ___________________________________________________________________
       (page generated 2025-04-25 23:01 UTC)