[HN Gopher] Launch HN: Svix (YC W21) - Webhooks as a Service
       ___________________________________________________________________
        
       Launch HN: Svix (YC W21) - Webhooks as a Service
        
       Hey everyone, my name is Tom, and I'm the founder of Svix
       (https://www.svix.com) - previously known as Diahook. Svix makes it
       easy for developers to send webhooks from their service using a
       simple API. Think Twilio or SendGrid but for webhooks.

       Webhooks are how servers notify each other of events, so they are a
       key component of many APIs such as Stripe, Shopify, Slack, Dropbox
       and GitHub. They look easy to implement (just a POST request), but
       they come with a variety of challenges. For example, customer
       endpoints fail or hang much more often than you would think, so you
       would need to implement retries. You need to make sure such
       failures don't clog your send queue or the rest of your system. The
       webhook system is an additional system separate from your normal
       web server that needs to be scaled and monitored separately. There
       are also a variety of security implications such as SSRF, replay
       attacks, and attackers sending fake webhooks to your customers (so
       make sure to sign the payload and make it easy to verify!). You
       also want to avoid overloading your users' endpoints, so you want
       to automatically rate-limit webhook sending, as well as disabling
       failing ones, and notifying your users when you do.

       I encountered these challenges at my previous company. Our users
       were constantly asking us for webhooks, but we kept deferring
       building them because we weren't willing to commit the engineering
       time, resources, and ongoing maintenance required of a webhook
       delivery system. This was the seed for Svix, but it's only after a
       friend of mine asked about adding webhooks to her own product that
       I realized "Oh, there's maybe a business here".

       The idea behind Svix is to make it very easy for everyone to send
       webhooks. Developers make one API call and we take care of
       deliverability, monitoring, and retries. We also have a pre-built
       management UI that our customers can offer their users to manage
       their webhook endpoints, as well as inspect, debug, and replay
       failures. This is in addition to a variety of tools, libraries, and
       tutorials to make both sending and consuming webhooks easy.

       We have previously done a Show HN
       (https://news.ycombinator.com/item?id=26399672) and got a lot of
       great feedback from the community. A lot has changed since then,
       for example, we now have libraries for Python, JavaScript
       (TypeScript), Java and Go; a first version of the Ruby and PHP
       libraries, and a CLI for interacting with the service. We have
       improved the management UI, made it easy to embed it in an iframe,
       and improved the onboarding and documentation to make it even
       easier to get started. And finally, we have scaled the backend to
       keep up with the growing needs of our customers. We have a lot more
       planned for the coming months, and we've grown the team so
       improvements are going to come at an even higher pace.

       One of the common questions from our Show HN was: "Don't developers
       need to handle deliverability and retries to Svix?" Deliverability
       to user endpoints (servers) is very different to deliverability to
       Svix. User endpoints fail all the time and for various reasons, and
       each of them can fail independently. This means developers need a
       robust and scalable delivery system that can deal with failures on
       an ongoing basis. With Svix, outages are rare, and are dealt with
       as incidents, the same way you would with SendGrid, Twilio and
       other API providers.

       Our goal with Svix is to make it easier for developers to add
       webhooks to their service. Webhooks make APIs that much more useful
       and enable a lot of automations and integrations which benefit both
       the products offering them, and the communities around them. Just
       think of all the cool Slack bots made possible thanks to webhooks.
       I'd really love to see every service out there offering a great
       API!

       I'd love to hear about your experience building (or using) webhook
       systems. What's a must have? Any war stories to share? Got any
       questions? Suggestions? Please let me know!

       Docs: https://docs.svix.com/
       Docs for consuming webhooks:
       https://docs.svix.com/receiving/introduction
       API viewer (and OpenAPI specs): https://api.svix.com/docs/
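
The advice in the post to sign the payload and guard against replay attacks can be illustrated with a minimal sketch. This is not Svix's actual signature scheme: the `sign` and `verify` helpers, the `<timestamp>.<body>` layout, and the five-minute tolerance are all assumptions chosen for illustration.

```python
import hashlib
import hmac
import time
from typing import Optional

# Assumed replay window for this sketch; not a value taken from Svix.
TOLERANCE_SECONDS = 300


def sign(secret: bytes, timestamp: int, body: bytes) -> str:
    """Return a hex HMAC-SHA256 over b"<timestamp>.<body>"."""
    signed = str(timestamp).encode() + b"." + body
    return hmac.new(secret, signed, hashlib.sha256).hexdigest()


def verify(
    secret: bytes,
    timestamp: int,
    body: bytes,
    signature: str,
    now: Optional[float] = None,
) -> bool:
    """Reject stale timestamps (replays) and mismatched signatures (fakes)."""
    now = time.time() if now is None else now
    if abs(now - timestamp) > TOLERANCE_SECONDS:
        return False
    expected = sign(secret, timestamp, body)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

Because the timestamp is part of the signed data, an attacker cannot refresh an old message's timestamp without invalidating its signature.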
        
       Author : tasn
       Score  : 83 points
       Date   : 2021-06-16 13:19 UTC (9 hours ago)
        
       | candiddevmike wrote:
       | You've basically built a message queue as a service using HTTP. I
       | personally don't see the innovation here or the kind of moat you
       | expect to build, but I wish you the best of luck in your
       | endeavor.
        
         | codegeek wrote:
         | This comment reminds me of the classic "Dropbox is basically a
         | hosted FTP service" comment :).
        
         | fasteo wrote:
         | >>> basically
         | 
         | The devil is in the details
        
           | candiddevmike wrote:
           | Not really, I don't see how this would be more difficult than
           | any other kind of queue most programs already have--email
           | notification retries especially. Unlike email though, this is
           | HTTP and has better status codes. Also, you'd technically
           | still have to implement a queue to send to svix, no?
           | Otherwise if they go down you lose critical messages.
        
             | atombender wrote:
             | It's potentially a bit different from normal queues in that
             | while you scale up your own queue processing, you can't
             | scale up the webhook receiver. And unlike something like
             | newsletter emailing, you probably care very much about
             | latency.
             | 
             | This means that in a naive implementation, unless you run
             | as many parallel workers as there are messages in the
             | queue, someone will block someone else from delivering.
             | Depending on your latency requirements, this might not be
             | acceptable.
             | 
             | Making delivery truly parallel -- that is, each distinct
             | receiver should not block anyone else, no matter how slow
             | or failure-prone they are -- and low latency is a bit more
             | tricky, essentially requiring one logical queue per
             | webhook.
             | 
             | You can solve it in various ways, depending on what
             | solution (Kafka, NATS/JetStream, Pulsar, Google Pub/Sub,
             | etc.) you choose, but as far as I know, nobody provides
             | this out of the box. In particular, one-queue-per-webhook
             | requires worker coordination in a way that classical
             | pub/sub doesn't -- after all, you don't want to run one
             | worker per webhook if they're not all full of pending
             | messages -- and some systems don't scale to many queues
             | very well (e.g. Google Pub/Sub has a hard limit of 10,000
             | topics per project).
             | 
             | Retrying can also pose some challenges. What if the webhook
             | has been down for days? Do you still keep messages in the
             | queue, or do you throw them away? If the webhook comes up,
             | do you prioritize new deliveries or do you mix in the old
             | ones? How do you keep track of this so that you can alert the
             | webhook owner about the flakiness?
             | 
             | As the other poster says, the devil is in the details. It's
             | all solvable, but nine times out of ten, I personally
             | prefer having something off-the-shelf that's been built
             | once, rather than building it from scratch every time.
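
The "one logical queue per webhook" idea from the comment above can be sketched in a few lines. This is a toy in-process illustration, not how Svix or any of the named brokers work: the `PerEndpointDispatcher` class and the `send` callback are hypothetical, and retries and error handling are left out (the sketch assumes `send` does not raise).

```python
import asyncio
from collections import defaultdict, deque
from typing import Awaitable, Callable, Deque, Dict

# Hypothetical delivery callback: (endpoint, message) -> None.
Send = Callable[[str, str], Awaitable[None]]


class PerEndpointDispatcher:
    """One FIFO queue per endpoint; a worker exists only while its queue is non-empty."""

    def __init__(self, send: Send) -> None:
        self._send = send
        self._queues: Dict[str, Deque[str]] = defaultdict(deque)
        self._workers: Dict[str, "asyncio.Task[None]"] = {}

    def enqueue(self, endpoint: str, message: str) -> None:
        """Queue a message; must be called from inside a running event loop."""
        self._queues[endpoint].append(message)
        if endpoint not in self._workers:
            # Start a worker lazily, so idle endpoints cost nothing.
            self._workers[endpoint] = asyncio.create_task(self._drain(endpoint))

    async def _drain(self, endpoint: str) -> None:
        queue = self._queues[endpoint]
        while queue:
            # A slow endpoint only delays its own queue, never another one.
            await self._send(endpoint, queue.popleft())
        del self._workers[endpoint]

    async def join(self) -> None:
        """Wait until every queue has been drained."""
        while self._workers:
            await asyncio.gather(*list(self._workers.values()))
```

With this shape, per-endpoint ordering is preserved while a hanging receiver cannot hold up deliveries to anyone else. A real system would still need persistence, retries, and a cap on concurrent workers.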
        
               | SahAssar wrote:
               | I might be missing something but it seems like all of
               | your details are either things you would need to
               | configure anyways in Svix (not all services should have
               | the same retry/expiry) or things that are not solved by
               | this service. This service takes HTTP as input and
               | output, so you wouldn't need a worker per topic anyway,
               | right? The workload is http-in, http-out, with a failure
               | condition for retry.
               | 
               | If I already have a queue of http messages (which I need
               | to have to protect from Svix downtime) configured with
               | their policies for retry/expiry (which I need to
               | configure since it's not the same for all) then what does
               | this service do that is not basically a curl loop with an
               | error check?
        
             | tasn wrote:
             | We commented about the "queue to send to svix" in the post
             | above.
             | 
             | > Deliverability to user endpoints (servers) is very
             | different to deliverability to Svix. User endpoints fail
             | all the time and for various reasons, and each of them can
             | fail independently. This means developers need a robust and
             | scalable delivery system that can deal with failures on an
             | ongoing basis. While with Svix, outages are rare, and are
             | dealt with as incidents. The same way you would with
             | SendGrid, Twilio and other API providers.
        
               | notyourday wrote:
               | It is not very difficult to have nearly no outages when
               | the service has nearly no traffic.
        
               | tasn wrote:
               | What makes you think we have nearly no traffic?
               | 
               | Anyhow, this was not a comment about our uptime against
               | any particular service, but rather about our uptime against
               | the collective (so how often any of those fail) - because
               | that's what matters here.
               | 
               | Though there's definitely a big difference in uptime
               | between a service that has SLAs and random user-endpoints
               | that don't necessarily promise the same.
        
             | mrkurt wrote:
             | Webhooks are easy-ish to send and retry. Building the UX to
             | help users successfully use webhooks is not simple. You
             | need debugging tools, retry handling, notifications when
             | they break (but not the first time they break, when they
             | break repeatedly), etc.
             | 
             | You're conflating low level plumbing with a ready-to-go,
             | multi tenant feature.
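
The point about notifying users only when an endpoint breaks repeatedly, not on the first failure, can be sketched as a small counter. The `EndpointHealth` class and the threshold of five consecutive failures are assumptions for illustration, not values from Svix or from the comment.

```python
from dataclasses import dataclass

# Assumed threshold for this sketch: notify after 5 failures in a row.
FAILURE_THRESHOLD = 5


@dataclass
class EndpointHealth:
    consecutive_failures: int = 0
    notified: bool = False

    def record(self, success: bool) -> bool:
        """Record one delivery attempt; return True when the owner should be notified."""
        if success:
            # Any success resets the streak and re-arms the notification.
            self.consecutive_failures = 0
            self.notified = False
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures >= FAILURE_THRESHOLD and not self.notified:
            # Notify once per outage rather than on every failed attempt.
            self.notified = True
            return True
        return False
```

The same counter could also drive automatic disabling of a failing endpoint, which the original post mentions alongside notification.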
        
               | wferrell wrote:
               | Agree
        
             | fasteo wrote:
             | This [1] is a good read about design decisions and
             | potential problems you face with a service like this at
             | scale.
             | 
             | [1] https://segment.com/blog/introducing-centrifuge/
        
       | ezekg wrote:
       | Once again, I'm a big fan of this idea. I wish this service
       | existed 4 years ago when I built my own. Lots of time has been
       | spent there. Probably bad timing, but I actually *just* finished
       | a blog post on how I built my webhook system and posted it to HN
       | [0]. I gave Svix a shout out. :)
       | 
       | Anyways -- I think really nailing API uptime is going to be
       | critical for a service like this, to reduce the chance of having
       | to do queuing on the customer's end to replay failed requests to
       | Svix. That's one of my big concerns at first glance. Some of the
       | webhooks I send are table-stakes for many customers.
       | 
       | [0]: https://news.ycombinator.com/item?id=27528212
        
         | tasn wrote:
         | Hehe, perfect timing with your post, and thanks for the
         | shout-out! :)
         | 
         | Yeah, API uptime is crucial (and we spend a lot of time on just
         | that). Though I think it's not just for us, but rather for
         | almost all of the external APIs and services out there!
        
       | dmytton wrote:
       | The story behind Svix and how the idea developed following Tom's
       | work on the end-to-end encrypted backend product EteSync and
       | Etebase is interesting, and something I covered with him in an
       | interview last month:
       | 
       | https://console.dev/interviews/svix-tom-hacohen/
        
       | edoceo wrote:
       | We had a similar issue. We built out a small tool in Go we call
       | WHOMP (WebHOok Management Proxy). Our app code pumps to that and
       | it handles all the rest of the delivery and security parts (eg:
       | malicious hooks). This single go-binary can easily handle 1000+
       | RPS.
       | 
       | IMHO you're competing with some fairly simple code built on
       | fairly simple queues that can handle loads of traffic at $5/mo
       | after the build (it took us like 24 hours) and it's tightly
       | integrated to our own app - so those retries/errors are visible.
       | 
       | I'm not clear on where the value is here. I drop my home brew
       | then rework systems to your APIs to get back to baseline. Then
       | what? I hit monthly cost parity at 15k messages a month. But at
       | 20k messages it's $10/mo and goes up - it looks like I'd be over
       | $50/mo on the platform quickly. Maybe I just don't see the Killer
       | Feature here.
        
         | dubcanada wrote:
         | You already answered your question.
         | 
         | 1. You spent hours building a small Go tool. Assuming your hour
         |    is worth at least $100 (certainly more) and say you spent 5
         |    hours, that's $500.
         | 2. You deployed it on a VM, that's $5 a month.
         | 3. You need to keep it updated and manage it, so let's say you
         |    spend 1 hour a month. That's $100 a month.
         | 
         | So right now you spent $500 upfront, and $105 a month to keep
         | it going, and that's assuming 1 hour a month which is crazy
         | low.
         | 
         | You can replace that entire cost with a static Y x 0.001 + what
         | ever up front cost to convert the system to use Svix, which can
         | be factored into the system cost. I have 50,000 users, each
         | user has 1 webhook I need 50,000 x average number of messages a
         | month. Just add that on top of the system monthly cost.
         | 
         | It's a lot easier to manage from a business/accounting point of
         | view a SaaS than your own homebrew.
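
The arithmetic in the comment above can be written out as a short sketch. All figures are the commenter's hypotheticals ($100/hour, 5 build hours, a $5 VM, 1 maintenance hour a month, $0.001 per message); none is confirmed pricing, and the one-off migration cost the comment mentions is left out because no number is given for it.

```python
# Figures taken from the comment; all of them are hypothetical.
HOURLY_RATE_USD = 100
UPFRONT_BUILD_USD = 5 * HOURLY_RATE_USD          # 5 hours of build time
HOMEBREW_MONTHLY_USD = 5 + 1 * HOURLY_RATE_USD   # VM plus 1 hour of upkeep
MESSAGES_PER_USD = 1000                          # i.e. $0.001 per message


def hosted_monthly_usd(messages: int) -> float:
    """Monthly cost of the hosted option at the assumed per-message price."""
    return messages / MESSAGES_PER_USD


def break_even_messages() -> int:
    """Monthly volume at which hosted cost equals the homebrew running cost."""
    return HOMEBREW_MONTHLY_USD * MESSAGES_PER_USD
```

Under these assumptions the homebrew tool costs $105 a month to run, so the hosted option is cheaper below 105,000 messages a month, before counting the $500 build cost.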
        
         | unamashana wrote:
         | I'd say that the main complexity comes from exposing the
         | logs/analytics to your end-users. I am hoping Svix will help
         | companies implement Webhooks as well as companies like Stripe
         | do.
        
         | wfleming wrote:
         | For a lot of services like this I think the value prop is not
         | monetary cost but more "this is no longer your responsibility".
         | Since you already have a custom-built solution that is stable &
         | operating efficiently, switching doesn't sound worth it to you.
         | For a team that's looking to add webhook functionality for the
         | first time, not needing to go through the implementation work
         | themselves or need to deal with any maintenance longer-term
         | could sound very appealing. I think it's akin to the rise of
         | interest in user-management-as-a-service products, etc.
         | 
         | I'm not, to be clear, expressing an opinion that the business
         | will be successful with that strategy or that I think it's a
         | good trend. On the contrary, this overall trend of feature-x-
         | as-a-service products depresses me a bit. We've gone from the
         | old days when you had to write pretty much everything yourself
         | (which sucked), to the days when there were a ton of feature-x-
         | as-library choices (which was way better but led to complaints
         | of web programming becoming a chore of just bolting libraries
         | together), to the current trend of feature-x-as-service
         | products.
         | 
         | It's a logical evolution in some ways - similar to libraries
         | over roll-your-own, it reduces your own set of
         | responsibilities. But it also reduces your ability to learn
         | (e.g. from reading source) or grow beyond the provided service
         | (much harder to swap out providers or roll-your-own when the
         | current solution is a third party black box). It also feels
         | depressing in that we've gone from a thriving ecosystem of
         | software based on open source to a much more capitalism-first
         | ecosystem of "this could be a library but then how would I make
         | money off of it". (I realize open source funding & maintainer
         | compensation/sanity is its own set of problems. I just wish we
         | could work on those issues without turning everything into a
         | product.)
        
           | akudha wrote:
           | Yup. My teammates joke that the best problem to have is "not
           | my problem".
           | 
           | This might be a simple service, but this is one less thing to
           | worry about
        
           | wdb wrote:
           | Yes, but there is no word about any compliance regarding
           | data protection that includes the webhooks. Also, the company is
           | based in the US so that causes Privacy Shield issues.
           | 
           | Not easy to use this service when you need to follow all kinds
           | of regulation and the GDPR
        
             | tasn wrote:
             | All of our servers are in Europe, and we are soon going to
             | tackle getting compliance certifications (as users have
             | been asking for this).
        
               | ezekg wrote:
               | Do you find that unnecessarily adds latency for webhook
               | event ingestion from the US? Seems like that would add a
               | point of failure that makes me even more nervous here --
               | instead of a quick request to us-east-2, I have to send
               | data across the pond.
        
               | wdb wrote:
               | Doesn't matter; the terms suggest the company is based in
               | Delaware:
               | 
               | This agreement will be governed by the laws of the
               | Delaware, USA. The courts of Delaware have exclusive
               | jurisdiction to settle any dispute arising out of or in
               | connection with this agreement
        
               | tasn wrote:
               | IANAL, though this is for disputes. Not for compliance
               | with international laws which we can do anyway. The
               | question is: can a US company with servers solely in
               | Europe comply with European legislation? Based
               | on my understanding the answer is yes, though I'll double
               | check with the experts. :)
        
       | knoebber wrote:
       | Congrats on the launch! The new site looks good. Looking forward
       | to seeing where this goes.
        
         | tasn wrote:
         | Thank you!
        
       | iomcr wrote:
       | This is awesome, I'll probably try it out soon.
       | 
       | My primary use case is that I don't want my end user(s) to know
       | the IP address of my core application.
       | 
       | But, the option to pay someone other than AWS/GoogleCloud and
       | then have to write my own lambdas is also a plus.
       | 
       | Something similar, but for "fetch metadata on this URL" (tags,
       | description, title, etc), version 2 of that being support for
       | when the end target is an SPA, or also a "take a screenshot of
       | this URL" would also be nice.
        
       | asdev wrote:
       | I think the issue here is skilled developers can build this
       | themselves and the product doesn't appeal to non-technical folks,
       | so it's kind of in no man's land.
        
       | pgt wrote:
       | Svix the name.
        
       | dylanbfox wrote:
       | Love this idea. Had this on my list of side projects to build for
       | a while - definitely see the use for this. It's one less thing
       | for a dev team to maintain in-house.
       | 
       | What does latency look like on delivering webhooks? From the time
       | your service is hit, to the time when the webhook is sent?
        
         | tasn wrote:
         | Median is 55ms, though it can probably be improved as we
         | haven't spent any time optimizing this...
        
           | tasn wrote:
           | I thought about it a bit more, and I have a few ideas on how
           | to reduce it by a lot.
        
       | debarshri wrote:
       | The thing is, if a developer whose core task is not to build
       | these services can build it (and not just build it, but build it
       | robustly), I am guessing that the threshold for people to start
       | replicating your service is not that high. In that case, I would
       | question whether it is a product or a feature. Because the
       | threshold to build something like this is low, your focus would be
       | more on customer acquisition. You might even acquire a lot of
       | customers and tell me you have traction, but I would question what
       | true value it really adds and then I would question the long-term
       | viability of this company.
       | 
       | On a side note, I have lately started seeing various companies
       | graduating from YC that make me question whether they are product
       | or feature. Not sure what the strategy behind them is. Either
       | they end up going on Product Hunt, get a decent number of upvotes
       | and then kind of disappear, or they are in a space that slowly
       | starts seeing other people replicate that.
       | 
       | In the devtool space, I have seen YC invest in companies that
       | pretty much do the same thing. In my opinion, it is not fair for
       | the operators of a startup if you have a backer that backs your
       | competitor too, as it dilutes your org's value.
        
       | jph wrote:
       | Great idea. Multiple teams of mine have experienced this exact
       | pain point with webhook retries, monitoring, caching, idempotent
       | commands, etc. If you consider adding Elixir and/or Rust to your
       | library roadmap, please let me know.
        
         | tasn wrote:
         | Thanks! I personally love Rust, though the only reason we
         | haven't done the libraries yet is that I feel not that many
         | people use Rust in production web services just yet. I may be
         | wrong though...
        
       ___________________________________________________________________
       (page generated 2021-06-16 23:01 UTC)