[HN Gopher] Launch HN: Relace (YC W23) - Models for fast and reliable codegen
___________________________________________________________________
Launch HN: Relace (YC W23) - Models for fast and reliable codegen
Hey HN community! We're Preston and Eitan, and we're building
Relace (https://relace.ai). We're trying to make building code
agents easy and cheap. Here's an example of our apply model vs.
whole-file edits: https://youtu.be/J0-oYyozUZw

Building reliable code agents is hard. Beyond simple prototypes,
any app with code generation in production quickly runs into two
problems -- how do you reliably apply diffs, and how do you manage
codebase context? We're focused on solving these two problems at
order-of-magnitude lower price and latency.

Our first model, released in February, is the Fast Apply model --
it merges code snippets with files at 4,300 tok/s and is more
reliable (in terms of merge errors) than Sonnet, Qwen, Llama, or
any other model at this task. Each file takes ~900ms, which feels
instantaneous to the user, while saving ~40% on Claude 4 output
tokens.
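
To give a concrete sense of the integration, here's a rough sketch
of calling the Apply endpoint from Python. The endpoint path and
field names below are illustrative placeholders, not the exact
request schema -- see https://docs.relace.ai for the real API:

    import requests

    # Placeholder URL and field names, for illustration only; check the
    # docs for the actual endpoint and request schema.
    API_URL = "https://api.relace.ai/v1/apply"
    API_KEY = "YOUR_RELACE_API_KEY"

    initial_code = open("src/App.tsx").read()

    # Edit snippet produced by a frontier model, with unchanged regions
    # elided instead of re-emitted (this is where the output-token
    # savings come from).
    edit_snippet = """
    // ... existing code ...
    function handleSubmit() {
      setSubmitting(true);
      // ... existing code ...
    }
    """

    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"initial_code": initial_code, "edit_snippet": edit_snippet},
        timeout=30,
    )
    resp.raise_for_status()
    merged = resp.json()["merged_code"]  # placeholder field name
    open("src/App.tsx", "w").write(merged)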

Our second model focuses on retrieval. For both vibe-coded and
enterprise codebases, retrieving only the files relevant to a user
request both saves on SoTA-model input token cost and reduces the
number of times code agents need to view files. Our reranker (evals
below) can scan a million-line codebase in ~1-2s, and our embedding
model outperforms any other embedding model for retrieval as
evaluated on a corpus of TypeScript/React repositories.

There are many different ways to build coding agents, but being
able to edit code reliably and retrieve the most relevant parts of
the codebase is going to be a foundational issue. We're excited to
be building ways to make coding agents more accessible to the
millions of users who don't want to spend $$$ on Claude.

These models are used in production, millions of times per week. If
you've used Lovable, Create.xyz, Magic Patterns, Codebuff, or Tempo
Labs, then you've used us! Here's a link to try it out:
https://app.relace.ai, and here are our docs:
https://docs.relace.ai.

We've opened up free access for prototyping on our website to
everyone, and the limits should be enough for personal coding use
and building small projects (correct us if they're not). We also
integrate directly with open-source IDEs like Continue.dev. Please
try us out, we'd love to hear your feedback!
Author : eborgnia
Score : 68 points
Date : 2025-05-27 15:59 UTC (7 hours ago)
| bigyabai wrote:
| > We're trying to make building code agents easy and cheap.
|
| What is your plan to beat the performance and cost of first-party
| models like Claude and GPT?
| eborgnia wrote:
| Hey -- good question! We're focused on a narrower task right
| now that aims to save frontier tokens (both input & output).
| Our merge + retrieval models are simply smaller LLMs that save
| you from passing in too much context to Sonnet, and allow you
| to output fewer tokens. These are cheap for us to run while
| still maintaining or improving accuracy.
| ramoz wrote:
| I can import my entire codebase to Gemini and get more than a
| nuanced similarity score in terms of agent guidance.
|
| What's the differentiator or plan for arbitrary query
| matching?
|
| Latency? If you think about it - not really a huge issue.
| Spend 20s-1M mapping an entire plan with Gemini for a
| feature.
|
| Pass that to Claude Code.
|
| At this point you want non-disruptive context moving forward
| and presumably any new findings would only be redundant with
| what is in long context already.
|
| Agentic discovery is fairly powerful even without any
| augmentations. I think Claude Code devs abandoned early
| embedding architectures.
| eborgnia wrote:
| Hey, these are really interesting points. The question of
| agentic discovery vs. one-shot retrieval is really
| dependent on the type of product.
|
| For Cline or Claude Code where there's a dev in the loop,
| it makes sense to spend more money on Gemini ranking or
| more latency on agentic discovery. Prompt-to-app companies
| (like Lovable) have a flood of impatient non-technical
| users coming in, so latency and cost become a big
| consideration.
|
| That's when using a more traditional retrieval approach can
| be relevant. Our retrieval models are meant to work really
| well with non-technical queries on these vibe-coded
| codebases. They are more of a supplement to the agentic
| discovery approaches, and we're still figuring out how to
| integrate them in a sensible way.
| mercurialsolo wrote:
| Good job on the launch - will give it a spin for our coding
| agent. Having worked a bunch with agents, I see the following as
| the next evolution or leap in agents.
|
| I see 2 big factors to improve the ability of coding agents today:
|
| - on-device models
|
| - context (or understanding of modules): not only retrieving the
|   relevant sections of the codebase but creating a version
|   (transforming it) which is readily consumable by a model and
|   used to focus on the problem at hand.
|
| This requires both a macro global context of the codebase and the
| ability to retrieve the local context of the problem being
| solved.
|
| Augment, e.g., does a fairly good job of context compression and
| retrieval among coding agents. Fast indexing & retrieval is a
| good step forward to enable open context compression.
| eborgnia wrote:
| Thank you :)
|
| Please do reach out, we love talking to builders in this space
| & would love to share notes & give you free access.
| eborgnia@relace.ai
| jacktheturtle wrote:
| looks very promising! this space is such a cool domain.
| compression algos v2
| nico wrote:
| Very interesting. Can these models be used in editors/agents like
| aider or roo? I can also see a use case for some sort of
| plugin or browser extension, to easily apply the patches provided
| by GPT/Claude on their web interfaces (without having to
| copy/paste and manually edit the files in the editor)
|
| Also, would love to see more concrete examples of using the Apply
| model
|
| Reading here: https://docs.relace.ai/docs/instant-apply/quickstart
|
| Is it correct that first I need to: 1) have some code, 2) create
| a patch of the code with the changes I want, 3) call the Apply
| model with the full code + patch to make the changes and provide
| the result?
|
| Do you have metrics to compare that workflow with just passing
| the code from 1) with a prompt for the changes to something like
| gpt/claude?
| pfunctional wrote:
| (Preston, other guy on the team)
|
| Yes, they can -- I actually tried a semantic edit
| implementation in Aider. It got the "correct edit format"
| percentage to 100%, but didn't really budge the overall percent
| correct on SOTA models. I should push it sometime, since it
| really helps the reliability of these local models like Qwen3.
| If you reach out to me, I can try to share some of this code
| with you as well (it needs to be cleaned up).
|
| But yes, 1. have some code, 2. create a patch (semantic, diff,
| or udiff formats all work), and 3. apply will return it to you
| very fast. When we last benchmarked using Claude 3.7 Sonnet to
| create diff patches, there was roughly a 10-15% merge error rate;
| with us it was 4%. You can also use Apply as a backup if the
| merge fails.
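|
| A minimal sketch of that backup flow in Python (apply_search_replace
| is a toy local merge, and relace_apply is a hypothetical stand-in
| for an Apply API call, not actual SDK code):
|
|     def apply_search_replace(original: str, search: str, replace: str) -> str:
|         """Cheap local merge: exact search/replace, raising if the hunk misses."""
|         if search not in original:
|             raise ValueError("merge failed: search block not found")
|         return original.replace(search, replace, 1)
|
|     def apply_edit(original: str, search: str, replace: str) -> str:
|         try:
|             return apply_search_replace(original, search, replace)
|         except ValueError:
|             # Fall back to the Apply model when the cheap merge fails.
|             return relace_apply(initial_code=original, edit_snippet=replace)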
| conartist6 wrote:
| What's the semantic diff format?
| HyprMusic wrote:
| Looks great. Before I get too excited, do you plan to release a
| per-token paid API, or is your target audience bigger companies
| who negotiate proper contracts?
| pfunctional wrote:
| I think we have one on the site right now -- it's roughly
| 4.1-mini pricing. We're not aiming to make money off of
| individual users, which is why we're trialing a free thing (and
| trying to partner with open-source frameworks). Our bread and
| butter is more companies doing this at scale & licensing.
| jumploops wrote:
| We looked into many different diff/merge strategies[0] before
| finding Relace.
|
| Their apply model was a simple drop-in that reduced the latency
| of our UX substantially, while keeping error rates low.
|
| Great work Preston and Eitan!
|
| [0] https://aider.chat/docs/more/edit-formats.html
| eborgnia wrote:
| Thanks for the support!
| jadbox wrote:
| Benchmarks?
| eborgnia wrote:
| We have a few benchmarks on docs.relace.ai in the model
| overview sections. Any ideas on other benchmarks you'd like to
| see are welcome, though.
| bradly wrote:
| Great job. I think this is a great area to focus on.
|
| I am a solo developer who, after trying to run local LLM models
| for code and not being satisfied with the results, is back to
| copy/pasting from browser tabs. I use vim, so getting LLM/LSP
| integration working reliably has felt questionable and not
| something I enjoy tinkering with. I tried aider with Google's
| Gemini models, but I never got the IAM accounts, billing quotas,
| and ACLs properly configured to get things to just work. I
| thought it would be fairly straightforward to build a local
| model based on my Gemfile, codebase, and whatever else, and have
| a local LLM be both a better and cheaper experience than Claude
| Code, which I blew through $5 on for results that weren't usable
| or didn't save time.
|
| The sign-up experience was really smooth. Like anything else, it
| is so easy to overcomplicate or be too clever, so I commend you
| for having the discipline to keep it straightforward and to the
| point.
|
| After account verification I didn't feel I understood what to do
| when landing on the Add Code Playground experience. It took me a
| while to grok what the three editors were doing and why there was
| JavaScript on the left and Python on the right, but with an
| option for JavaScript. I found
| https://docs.relace.ai/docs/instant-apply/quickstart in the docs,
| and that itself would be a better place to land after signup. I'd
| even recommend having tabs on those snippets so I could just grab
| a curl command and dip my toe in.
|
| I think my biggest miss was my own assumption that a custom model
| was going to be a local model. Not that it was represented that
| way, but my brain was lumping those things together prematurely.
| eborgnia wrote:
| Hey, really appreciate the detailed sign up journey here!
| Getting the simplest flow is hard, and it's something we obsess
| over. The docs have been a work in progress for the past couple
| of months, but now that they are getting better I think it's a
| good idea to make them more front and center for new users.
|
| We are trying to make this as accessible as possible to the
| open-source community, with our free tier, but feel free to
| reach out if you need expanded rate limits. Cheers :)
| diggan wrote:
| Looks interesting and useful if the accuracy numbers are as
| stated. Kind of sad it's only available via a remote API though;
| it makes the product more like a traditional SaaS API. The
| marketing keeps talking about "models", yet the actual thing you
| use is only the API -- it would have been nice to be able to run
| locally. Although I do understand that it's harder to make money
| in that case.
|
| I got curious about what datasets you used for training the
| models? Figured the easiest would be to scrape git repositories
| for commits from there, but seems there are also quality issues
| with an approach like that.
| eborgnia wrote:
| Open source git repos are a really good place to get data -- it
| requires a lot of munging to get it into a useful format, but
| that's the name of the game with model training.
|
| It's on the roadmap to make public evals people can use to
| compare their options. A lot of the current benchmarks aren't
| really specialized for these prompt-to-app use cases.
| piterrro wrote:
| How does it differ from the Cline VS Code extension? It already
| uses diff apply, which makes edits to bigger files much faster.
| eborgnia wrote:
| Cline orchestrates all the models under the hood; you could use
| our apply model with Cline. Not sure what model they are using
| for that feature right now.
| KaoruAoiShiho wrote:
| Does this work on any language or text?
| eborgnia wrote:
| We trained it on over a dozen languages, with a bias towards
| TypeScript and Python. We've seen it work on Markdown pretty
| well, but you could try it on plaintext too -- curious to hear
| how that goes.
| bcyn wrote:
| Very interested to see what the next steps are to evolve the
| "retrieval" model - I strongly believe that this is where we'll
| see the next stepwise improvement in coding models.
|
| Just thinking about how a human engineer approaches a problem.
| You don't just ingest entire relevant source files into your
| head's "context" -- well, maybe if your code is broken into very
| granular files, but often files contain a lot of irrelevant
| context.
|
| Between architecture diagrams, class relationship diagrams, ASTs,
| and tracing codepaths through a codebase, there should
| intuitively be some model of "all relevant context needed to make
| a code change" - exciting that you all are searching for it.
| ankit219 wrote:
| I have a different POV on retrieval. It's a hard problem to
| solve in a generalizable way with embeddings. I believe this
| can be solved at the model level, where it's used to fix an
| issue. With the model providers (OAI, Anthropic) going full
| stack, there is a possibility they solve it at the
| reinforcement learning level. E.g.: when you teach a model to
| solve issues in a codebase, the first step is literally getting
| the right files. Here basic search (with grep) would work very
| well, as with enough training you want the model to have an
| instinct about what to search given a problem, similar to how
| an experienced dev has that instinct about a given issue. (This
| might be what tools like Cursor are also looking at.) (Nothing
| against anyone, just sharing a POV, I might be wrong.)
|
| However, the fast apply model is a thing of beauty. Aider uses
| it and it's just super accurate and very fast.
| bcyn wrote:
| Definitely agree with you that it's a problem that will be
| hard to generalize a solution for, and that the eventual
| solution is likely not embeddings (at least not alone).
| eborgnia wrote:
| Adding extra structural information about the codebase is an
| avenue we're actively exploring. Agentic exploration is a
| structure-aware system where you're using a frontier model
| (Claude 4 Sonnet or equivalent) that gives you an implicit
| binary relevance score based on whatever you're putting into
| context -- filenames, graph structures, etc.
|
| If a file is "relevant" the agent looks at it and decides if it
| should keep it in context or not. This process repeats until
| there's satisfactory context to make changes to the codebase.
|
| The question is whether we actually need a 200b+ parameter
| model to do this or if we can distill the functionality onto a
| much smaller, more economical model. A lot of people are
| already choosing to do it with Gemini (due to the 1m context
| window), and they write the code with Claude 4 Sonnet.
|
| Ideally, we want to be able to run this process cheaply in
| parallel to get really fast generations. That's the ultimate
| goal we're aiming towards.
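|
| To make the "cheaply in parallel" idea concrete, a rough sketch
| (score_relevance is a hypothetical stand-in for a small reranker
| call, not our API):
|
|     from concurrent.futures import ThreadPoolExecutor
|
|     def gather_context(query, files, threshold=0.5):
|         """Score every file against the request in parallel; keep the relevant ones."""
|         with ThreadPoolExecutor(max_workers=16) as pool:
|             scores = pool.map(lambda item: (item[0], score_relevance(query, item[1])),
|                               files.items())
|         return {path: files[path] for path, score in scores if score >= threshold}
|
|     # The surviving files become the context handed to the frontier model.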
| harrisreynolds wrote:
| Nice! I am currently writing a new version of my no-code
| platform, WeBase [1], to use AI to generate and edit
| applications.
|
| Currently just using foundation models from OpenAI and Gemini but
| will be very interested to try this out.
|
| My current approach is to just completely overwrite files with a
| new updated version, but I am guessing using something like
| Relace will make the whole process more efficient... is that
| correct?
|
| I'll watch your video later but I would love to learn more about
| common use cases. It could even be fun to write a post for your
| blog comparing my "brute force" approach to something more
| intelligent using Relace.
|
| [1] https://www.webase.com (still points to the old "manual"
| version)
| diggan wrote:
| > My current approach is to just completely overwrite files
| with new updated version
|
| Overwriting full files works great up to ~100 lines or so, but
| once you want to be able to edit files above that, it gets very
| slow (and costly if using paid APIs), so using some sort of
| "patch format" makes a lot of sense.
| eborgnia wrote:
| Happy to collaborate, shoot us an email at info@relace.ai :)
| max_on_hn wrote:
| I will have to try out Relace for CheepCode[0], my cloud-based AI
| coding agent :) Right now I'm using something I hacked together,
| but this looks quite slick!
|
| [0] https://cheepcode.com
___________________________________________________________________
(page generated 2025-05-27 23:00 UTC)