[HN Gopher] Launch HN: Relvy (YC F24) - On-call runbooks, automated
       ___________________________________________________________________
        
       Launch HN: Relvy (YC F24) - On-call runbooks, automated
        
       Hey HN! We are Bharath and Simranjit from Relvy AI
       (https://www.relvy.ai). Relvy automates on-call runbooks for
       software engineering teams. It is an AI agent equipped with tools
       that can analyze telemetry data and code at scale, helping teams
       debug and resolve production issues in minutes. Here's a video:
       [[[https://www.youtube.com/watch?v=BXr4_XlWXc0]]]  A lot of teams
       are using AI in some form to reduce their on-call burden. You may
       be pasting logs into Cursor, or using Claude Code with Datadog's
       MCP server to help debug. What we've seen is that autonomous root
       cause analysis is a hard problem for AI. This shows up in
       benchmarks - Claude Opus 4.6 is currently at 36% accuracy on the
       OpenRCA dataset, in contrast to its much stronger results on coding
       tasks.  There are three main
       reasons for this: (1) Telemetry data volume can drown the model in
       noise; (2) Data interpretation / reasoning is enterprise context
       dependent; (3) On-call is a time-constrained, high-stakes problem,
       with little room for AI to explore during investigation time.
       Errors that send the user down the wrong path are not easily
       forgiven.  At Relvy, we are tackling these problems by building
       specialized tools for telemetry data analysis. Our tools can detect
       anomalies and identify problem slices from dense time series data,
       do log pattern search, and reason about span trees, all without
       overwhelming the agent context.  Anchoring the agent around
       runbooks leads to less agentic exploration and more deterministic
       steps that mirror what an experienced engineer would do. That
       results in faster analysis and less cognitive load on engineers
       who have to review and understand what the AI
       did.  How it works: install Relvy on a local machine via
       docker-compose (or via helm charts, or sign up on our cloud),
       connect your stack (observability and code), create your first
       runbook, and have Relvy investigate a recent alert.  Each
       investigation is presented as a notebook in our web UI, with data
       visualizations that help engineers verify the findings and build
       trust in the AI. From there on, Relvy can be configured to
       automatically respond to alerts from Slack.  Some example runbook
       steps that Relvy automates:
       - Check so-and-so dashboard, see if the errors are isolated to a
         specific shard.
       - Check if there's a throughput surge on the APM page, and if so,
         is it from a few IPs?
       - Check recent commits to see if anything changed for this
         endpoint.
       You can also configure AWS CLI commands that Relvy can run to
       automate mitigation actions, with human approval.  A little bit
       about us -
       We did YC back in fall 2024. We started our journey experimenting
       with continuous log monitoring with small language models - that
       was too slow. We then invested deeply into solving root cause
       analysis effectively, and our product today is the result of about
       a year of work with our early customers.  Give us a try today.
       Happy to hear feedback, or about how you are tackling on-call
       burden at your company. Appreciate any comments or suggestions!
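
       To make the "problem slices" idea above concrete, here is a
       minimal sketch of the kind of check a runbook step such as "see
       if the errors are isolated to a specific shard" implies. It is an
       illustration only, not Relvy's implementation: the function name
       summarize_slices, the z-score threshold, and the sample shard data
       are all assumptions. The point is that the agent gets a short
       ranked summary instead of the raw time series.

```python
from statistics import mean, pstdev


def summarize_slices(series_by_slice, baseline_len, z_threshold=3.0):
    """Rank slices whose recent values deviate strongly from their baseline.

    series_by_slice maps a slice label (e.g. a shard name) to a list of
    per-interval error counts, oldest first. The first baseline_len points
    form the baseline; the remaining points form the recent window.
    (Hypothetical helper for illustration, not a real Relvy API.)
    """
    findings = []
    for label, values in series_by_slice.items():
        baseline, recent = values[:baseline_len], values[baseline_len:]
        if not baseline or not recent:
            continue  # not enough data to compare
        mu = mean(baseline)
        sigma = pstdev(baseline) or 1.0  # fall back to 1.0 on a flat baseline
        z = (mean(recent) - mu) / sigma
        if z >= z_threshold:
            findings.append({
                "slice": label,
                "baseline_mean": round(mu, 2),
                "recent_mean": round(mean(recent), 2),
                "z_score": round(z, 1),
            })
    # Most anomalous slice first, so the summary stays short and ordered.
    findings.sort(key=lambda f: f["z_score"], reverse=True)
    return findings


# Made-up per-shard error counts: six baseline intervals, two recent ones.
errors = {
    "shard-1": [2, 3, 2, 3, 2, 3, 2, 3],
    "shard-2": [2, 3, 2, 3, 2, 3, 40, 44],
    "shard-3": [1, 1, 1, 1, 1, 1, 1, 2],
}
summary = summarize_slices(errors, baseline_len=6)
for finding in summary:
    print(finding)
```

       With this sample data only shard-2 is reported, so a few lines of
       summary stand in for every raw data point. A real tool would need
       a more robust baseline (seasonality, sparse series) than this
       z-score.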
        
       Author : behat
       Score  : 25 points
       Date   : 2026-04-09 12:11 UTC (4 hours ago)
        
        
       | ramon156 wrote:
       | Congrats on the launch! I dig the concept, seems like a good tool
       | :)
        
         | behat wrote:
         | Thank you :)
        
       | hrimfaxi wrote:
       | How does this differ from cursor cloud agents where I can hook up
       | MCPs, etc and even launch the agent in my own cloud to connect
       | directly to internal hosts like dbs?
        
         | behat wrote:
         | Thanks. Yeah, Cursor / Claude Code + MCP is powerful. We
         | differentiate on two fronts, mainly:
         | 
         | 1) Greater accuracy with our specialized tools: Most MCP tools
         | allow agents to query data, or run *ql queries - this
         | overwhelms context windows given the scale of telemetry data.
         | Raw data is also not great for reasoning - we've designed our
         | tools to ensure that models get data in the right format,
         | enriched with statistical summaries, baselines, and correlation
         | data, so LLMs can focus on reasoning.
         | 
         | 2) Product UX: You'll also find that text based outputs from
         | general purpose agents are not sufficient for this task - our
         | notebook UX offers a great way to visualize the underlying data
         | so you can review and build trust with the AI.
        
           | hrimfaxi wrote:
           | To be clear, are the main differentiators basically better
           | built-in MCPs and better UX? Not knocking just trying to
           | understand the differences.
           | 
           | I have had incredible success debugging issues by just
           | hooking up Datadog MCP and giving agents access to it.
           | Claude/cursor don't seem to have any issues pulling in the
           | raw data they need in amounts that don't overload their
           | context.
           | 
           | Do you consider this a tool to be used in addition to
           | something like cursor cloud agents or to replace it?
        
             | behat wrote:
             | For the debugging workflow you described, we would be a
             | standalone replacement for Cursor or other agents. We don't
             | yet write code, so we can't replace your Cursor agents
             | entirely.
             | 
             | Re: differentiation - yes, faster, more accurate and more
             | consistent. Partially because of better tools and UX, and
             | partially because we anchor on runbooks. On-call engineers
             | can quickly map out that the AI ran so-and-so steps, and
             | here's what it found for each, and here's the time series
             | graph that supports this.
             | 
             | Interesting that you have had great success with Datadog
             | MCP. Do you mainly look at logs?
        
               | verdverm wrote:
               | > For the X workflow, we would be a standalone
               | replacement for other agents.
               | 
               | Imo, this is not what users want. They want extension to
               | their agent. If a project tells me I have to use their
               | interface or agentic setup, it's 95% not going to happen.
               | Consider how many SaaS tools we already have to deal
               | with, that many agents is not desirable, they all have
               | their little quirks and take time to "get to know"
               | 
               | Instead, build extensions, skills, and subagents that fit
               | into my agentic workflow and tooling. This will also
               | simplify what you need to do, so you can focus on your
               | core competency. For example, you should be able to
               | create a chat participant in VS Code / Copilot, and take
               | advantage of the native notebook and diff rendering,
               | sharing the MCPs (et al) the user already has for their
               | agents for their internal systems.
        
               | behat wrote:
               | > They want extension to their agent. If a project tells
               | me I have to use their interface or agentic setup, it's
               | 95% not going to happen
               | 
               | Yes, there's definitely friction there. It may be that
               | the right form factor is that you trigger Relvy's
               | debugging agent via Claude Code / Cursor.
               | 
               | Our early users are heavy on needing to look at the raw
               | data to be able to review the AI RCA, so a standalone
               | setup makes sense. Also, the dominant usage pattern is
               | background agentic execution triggered by alerts, and not
               | manual.
        
               | verdverm wrote:
               | Yup, we are moving up the ladders of abstraction and will
               | have our agentic team interfaces that include agents
               | triggered outside of human input. It does not change
               | things. As soon as I need to go into the code or to the
               | agent to fix the problem, I'm back to copying and pasting,
               | or switching views between multiple interfaces. That's
               | the kind of stuff we loathe
               | 
               | Runbooks are great and all, but actions need to be taken
               | and I'm not going to give all the vendor interfaces to
               | the internal systems. They can be subagents in my system
               | which already has the tools and permission gates needed,
               | access to code and git for IaC changes, etc...
               | 
               | It seems like the way to go now, it's easier to get
               | moving and show off an experience and the vision, but
               | it's definitely not the operational way in prod for a lot
               | of reasons, security being a paramount one.
               | 
               | I also do not discount that your SaaS can be easily
               | replaced by an open sourced subagent team in the next
               | couple of years.
        
         | esafak wrote:
         | They claim a 12-point lead (from 36% to 48%) over Opus 4.6 in an RCA
         | benchmark: https://www.relvy.ai/blog/relvy-improves-claude-
         | accuracy-by-...
        
           | behat wrote:
           | heh, I was just about to post the following on your previous
           | comment re: reproducible benchmark results. Thanks for
           | posting the blog.
           | 
           | With the docker images that we offer, in theory, people can
           | re-run the benchmark themselves with our agent. But we should
           | document and make that easier.
           | 
           | At the end of it, you really would have to evaluate on your
           | own production alerts. Hopefully the easy install + setup
           | helps.
        
       | rishav wrote:
       | Woohoo!!! Congrats on the big launch y'all
        
       | Harnoor_Kaur wrote:
       | This is a big one!! Congratulations guys :) Rooting for you!
        
       | willchen wrote:
       | Interesting! tbh, we don't have any runbooks and pretty minimal
       | telemetry set up (we're a very small team :), do you have any
       | recommendations on which telemetry service to use to get started?
       | right now, our services run on a combination of GCP Cloud Run +
       | Vercel
        
         | behat wrote:
         | Nice to see you here, Will! I'd generally recommend using
         | OpenTelemetry for instrumentation so that you keep the option of
         | switching between telemetry vendors.
         | 
         | Re: runbooks, yeah even larger teams don't have good ones to
         | begin with. Relvy helps debug without runbooks as well - it
         | might take longer to explore, but once you are happy with a
         | particular investigation path the AI took, you can save it as a
         | runbook for more deterministic future executions.
        
       ___________________________________________________________________
       (page generated 2026-04-09 17:00 UTC)