[HN Gopher] Show HN: Any-LLM - Lightweight router to access any ...
___________________________________________________________________
Show HN: Any-LLM - Lightweight router to access any LLM Provider
We built any-llm because we needed a lightweight router for LLM
providers with minimal overhead. Switching between models is just a
string change: update "openai/gpt-4" to "anthropic/claude-3" and
you're done. It uses official provider SDKs when available, which
helps since providers handle their own compatibility updates. No
proxy or gateway service is needed either, so getting started is
straightforward - just pip install and import. It currently supports
20+ providers, including OpenAI, Anthropic, Google, Mistral, and AWS
Bedrock. Would love to hear what you think!
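
For illustration, a minimal sketch of the intended usage (the
completion() helper and exact model IDs are assumptions based on the
description above, not a guaranteed API):

    from any_llm import completion

    # Same call shape for every provider; switching models is a
    # string change.
    response = completion(
        model="openai/gpt-4",  # or "anthropic/claude-3"
        messages=[{"role": "user", "content": "Hello!"}],
    )
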
Author : AMeckes
Score : 88 points
Date : 2025-07-22 17:40 UTC (5 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| sparacha wrote:
| There are liteLLM, OpenRouter, Arch (although that's an
| edge/service proxy for agents), and now this. We all need a new
| problem to solve.
| CuriouslyC wrote:
| LiteLLM is kind of a mess TBH. I guess it's ok if you just want
| a docker container to proxy to for personal projects, but
| actually using it in production isn't great.
| dlojudice wrote:
| > but actually using it in production isn't great.
|
| I only use it in development. Could you elaborate on why you
| don't recommend using it in production?
| honorable_coder wrote:
| the people behind envoy proxy built
| https://github.com/katanemo/archgw - it has the learnings of
| Envoy but is natively designed to process/route prompts to
| agents and LLMs. Would be curious about your thoughts
| tom_usher wrote:
| I definitely appreciate all the work that has gone into
| LiteLLM, but it doesn't take much browsing through the 7000+
| line `utils.py` to see where using it could become problematic
| (https://github.com/BerriAI/litellm/blob/main/litellm/utils.p...)
| swyx wrote:
| can you double click a little bit? many files in
| professional repos are 1000s of lines. LoC in itself is
| not a code smell.
| otabdeveloper4 wrote:
| LiteLLM is the worst code I have ever read in my life.
| Quite an accomplishment, lol.
| swyx wrote:
| ok still not helpful in giving substantial criticism
| ieuanking wrote:
| we are trying to apply model-routing to academic work and pdf
| chat with ubik.studio -- def lmk what you think
| swyx wrote:
| portkey as well which is both js and open source
| https://www.latent.space/p/gateway
| pzo wrote:
| why provide a link if there isn't a single mention of
| portkey there?
| swyx wrote:
| it's my interview w/ the portkey folks, which has more thoughts
| on the category
| wongarsu wrote:
| And all of them exist despite 80% of model providers offering an
| OpenAI-compatible endpoint
| dlojudice wrote:
| I use LiteLLM Proxy, even in a dev environment via Docker,
| because the Usage and Logs feature provides great visibility
| into LLM usage, and the Caching functionality greatly reduces
| costs for repetitive testing.
| weinzierl wrote:
| Not to be confused with AnythingLLM.
| honorable_coder wrote:
| a proxy means you offload observability, filtering, caching
| rules, and global rate limiters to a specialized piece of
| software - pushing this into application code means you _cannot_
| do things centrally, and it doesn't scale as more copies of your
| application code get deployed. You can bounce a single proxy
| server neatly vs. updating a fleet of application servers just
| to monkey-patch some proxy functionality.
| RussianCow wrote:
| You can do all of that without a proxy. Just store the current
| state in your database or a Redis instance.
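|
| Rough sketch of the Redis approach (assumes redis-py and a shared
| Redis instance; the key scheme and limits are illustrative):
|
|     import time
|     import redis
|
|     r = redis.Redis(host="localhost", port=6379)
|
|     def allow_request(user_id, limit=60, window_s=60):
|         # Fixed-window counter shared by every app server.
|         key = f"llm_rate:{user_id}:{int(time.time() // window_s)}"
|         count = r.incr(key)          # atomic across processes
|         if count == 1:
|             r.expire(key, window_s)  # window cleans itself up
|         return count <= limit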
| honorable_coder wrote:
| and managed from among the application servers that are
| greedily trying to store/retrieve this state? Not to mention
| you'll have to be in the business of defining, updating, and
| managing the schema, ensuring that upgrades to the db don't
| break the application servers, etc. The proxy server is the
| right design decision if you are truly trying to build
| something production-worthy and you want it to scale.
| AMeckes wrote:
| Good points! any-llm handles the LLM routing, but you can still
| put it behind your own proxy for centralized control. We just
| don't force that architectural decision on you. Think of it as
| composable: use any-llm for provider switching, add
| nginx/envoy/whatever for rate limiting if you need it.
| honorable_coder wrote:
| How do I put this behind a proxy? You mean run the module as
| a containerized service?
|
| But provider switching is built into some of these - and the
| folks behind envoy built: https://github.com/katanemo/archgw
| - developers can use an OpenAI client to call any model, it
| offers preference-aligned intelligent routing to LLMs based
| on usage scenarios that developers can define, and it acts as
| an edge proxy too.
| AMeckes wrote:
| To clarify: any-llm is just a Python library you import,
| not a service to run. When I said "put it behind a proxy,"
| I meant your app (which imports any-llm) can run behind a
| normal proxy setup.
|
| You're right that archgw handles routing at the
| infrastructure level, which is perfect for centralized
| control. any-llm simply gives you the option to handle
| routing in your application code when that makes sense (for
| example, premium users get Opus-4). We leave the
| architectural choice to you, whether that's adding a proxy,
| keeping routing in your app, using both, or just using
| any-llm directly.
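|
| As a rough sketch of that app-level routing (the completion()
| helper and model names are illustrative assumptions, not the
| exact API):
|
|     from any_llm import completion
|
|     def answer(user, messages):
|         # Premium users get the larger model; everyone else a
|         # cheaper default. Both IDs are placeholders.
|         if user.is_premium:
|             model = "anthropic/claude-opus-4"
|         else:
|             model = "openai/gpt-4o-mini"
|         return completion(model=model, messages=messages)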
| sparacha wrote:
| But you can also use tokens to implement routing
| decisions in a proxy. You can make RBAC natively
| available to all agents outside code. The incremental
| feature work in code vs. an out-of-process server is the
| trade-off. One gets you going super fast; the other offers
| a design choice that (I think) scales a lot better.
| swyx wrote:
| > LiteLLM: While popular, it reimplements provider interfaces
| rather than leveraging official SDKs, which can lead to
| compatibility issues and unexpected behavior modifications
|
| with no vested interest in litellm, i'll challenge you on this
| one. what compatibility issues have come up? (i expect text to
| have the least, and probably voice etc have more but for text
| i've had no issues)
|
| you -want- to reimplement interfaces because you have to
| normalize APIs. in fact without looking at any-llm code deeply i
| question how you do ANY router without reimplementing
| interfaces. that's basically the whole job of the router.
| chuckhend wrote:
| LiteLLM is quite battle tested at this point as well.
|
| > it reimplements provider interfaces rather than leveraging
| official SDKs, which can lead to compatibility issues and
| unexpected behavior modifications
|
| Leveraging official SDKs also does not solve compatibility
| issues. any_llm would still need to maintain compatibility with
| those official SDKs. I don't think one way is clearly better
| than the other here.
| amanda99 wrote:
| Being battle tested is the only good thing I can say about
| LiteLLM.
| scosman wrote:
| You can add that it's still 10x better than LangChain
| AMeckes wrote:
| That's true. We traded API compatibility work for SDK
| compatibility work. Our bet is that providers are better at
| maintaining their own SDKs than we are at reimplementing
| their APIs. SDKs break less often and more predictably than
| APIs, plus we get provider-implemented features (retries,
| auth refresh, etc) "for free." Not zero maintenance, but
| definitely less. We use this in production at Mozilla.ai, so
| it'll stay actively maintained.
| scosman wrote:
| Yeah, official SDKs are sometimes a problem too. Together's SDK
| included Apache Arrow, a ~60MB dependency, for a single feature
| (I patched it to make it optional). If they ever lock dependency
| versions it could conflict with your project.
|
| I'd rather have a library that just used OpenAPI/REST than one
| that pulls in a ton of dependencies.
| delijati wrote:
| there is nothing lite in litellm ... i was experimenting (using
| it as a lib) but ended up using
| https://llm.datasette.io/en/stable/index.html btw. thanks
| @simonw for llm
| Szpadel wrote:
| I use litellm as my personal AI gateway, and from a user's
| point of view there is no difference whether the proxy uses the
| official SDK or not; that might be a benefit for the proxy's
| developers.
|
| but I can give you one example: litellm recently had an issue
| with handling deepseek reasoning. they broke the implementation,
| and for a while reasoning was missing from both sync and
| streaming responses.
| AMeckes wrote:
| Both approaches work well for standard text completion. Issues
| tend to be around edge cases like streaming behavior, timeout
| handling, or new features rolling out.
|
| You're absolutely right that any router reimplements interfaces
| for normalization. The difference is what layer we reimplement
| at. We use SDKs where available for HTTP/auth/retries and
| reimplement normalization.
|
| Bottom line is we both reimplement interfaces, just at
| different layers. Our bet on SDKs is mostly about maintenance
| preferences, not some fundamental flaw in LiteLLM's approach.
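|
| Roughly, the layering looks like this (a sketch: the OpenAI SDK
| call is the standard openai>=1.0 interface, but the normalized
| ChatResponse shape is illustrative, not any-llm's actual type):
|
|     from dataclasses import dataclass
|     from openai import OpenAI
|
|     @dataclass
|     class ChatResponse:
|         text: str
|         model: str
|
|     def openai_completion(model, messages):
|         # Official SDK handles HTTP, auth, and retries.
|         client = OpenAI()  # reads OPENAI_API_KEY from the env
|         resp = client.chat.completions.create(
|             model=model, messages=messages
|         )
|         # Thin normalization layer over the provider response.
|         return ChatResponse(
|             text=resp.choices[0].message.content,
|             model=resp.model,
|         )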
| renewiltord wrote:
| In truth it wasn't that hard for me to ask Claude Code to just
| implement the text completion API so routing wasn't that much of
| a problem.
| piker wrote:
| This looks awesome.
|
| Why Python? Probably because most of the SDKs are Python, but
| something that could be ported across languages without
| requiring an interpreter would have been really amazing.
| pzo wrote:
| for js/ts you have vercel aisdk [0], for c++ you have [1], for
| flutter/reactnative/kotlin there is [2]
|
| [0] https://github.com/vercel/ai
|
| [1] https://github.com/ClickHouse/ai-sdk-cpp
|
| [2] https://github.com/cactus-compute/cactus
| retrovrv wrote:
| we essentially built the gateway as a service rather than an
| SDK: https://github.com/portkey-AI/gateway
| Shark1n4Suit wrote:
| That's the key question. It feels like many of these tools are
| trying to solve a systems-level problem (cross-language model
| execution) at the application layer (with a Python library).
|
| A truly universal solution would likely need to exist at a
| lower level of abstraction, completely decoupling the
| application's language from the model's runtime. It's a much
| harder problem to solve there, but it would be a huge step
| forward.
| mkw5053 wrote:
| Interesting timing. Projects like Any-LLM or LiteLLM solve
| backend routing well but still involve server-side code. I've
| been tackling this from a different angle with Airbolt [1], which
| completely abstracts backend setup. Curious how others see the
| trade-offs between routing-focused tools and fully hosted
| backends like this.
|
| [1] https://github.com/Airbolt-AI/airbolt
| swyx wrote:
| (retracted after GP edited their comment)
| qntmfred wrote:
| don't you post links to your own stuff all the time? i don't
| think their comment was out of line.
| mkw5053 wrote:
| I didn't intend my original comment to be overly promotional
| or irrelevant. I'm genuinely curious about the trade-offs
| between different LLM API routing solutions, most acutely as
| a consumer.
| amanda99 wrote:
| I'm excited to see this. Have been using LiteLLM but it's
| honestly a huge mess once you peek under the hood, and it's being
| developed very iteratively and not very carefully. For example,
| for several months recently (haven't checked in ~a month
| though), their Ollama structured outputs were completely botched
| and just straight up broken. Docs are a hot mess, etc.
| nexarithm wrote:
| I have also been working on a very similar open source project
| for a Python LLM abstraction layer. I needed one for my research
| job, took inspiration from that, and created one for more
| generic usage.
|
| Github: https://github.com/proxai/proxai
|
| Website: https://proxai.co/
| nodesocket wrote:
| This is awesome, will give it a try tonight.
|
| I've been looking for something a bit different though, related
| to Ollama. I'd like a load-balancing reverse proxy that supports
| queuing requests to multiple Ollama servers and sending requests
| only when an Ollama server is up and idle (not processing).
| Does anything like that exist?
| t_minus_100 wrote:
| https://xkcd.com/927/. LiteLLM rocks!
| AMeckes wrote:
| I didn't even need to click the link to know what this comic
| was. LiteLLM is great, we just needed something slightly
| different for our use case.
| klntsky wrote:
| Anything like this, but in TypeScript?
| AMeckes wrote:
| Python only for now. Most providers have official TypeScript
| SDKs though, so the same approach (wrapping official SDKs)
| would work well in TS too.
| funerr wrote:
| ai-sdk by vercel?
| retrovrv wrote:
| there's portkey that we've been working on:
| https://github.com/portkey-AI/gateway
| pglevy wrote:
| How does this differ from this project?
| https://github.com/simonw/llm
| omneity wrote:
| Crazy timing!
|
| I shipped a similar abstraction for llms a bit over a week ago:
|
| https://github.com/omarkamali/borgllm
|
| pip install borgllm
|
| I focused on making it LangChain-compatible so you could drop
| it in as a replacement. And it offers virtual providers for
| automatic fallback when you reach rate limits and so on.
| bdhcuidbebe wrote:
| What is mozilla-ai?
|
| Seems like reputation parasitism.
| daveguy wrote:
| It is an official Mozilla Foundation subsidiary. Their website
| is here: https://www.mozilla.ai/
| bdhcuidbebe wrote:
| Interesting. I made my comment after visiting their repo and
| website. Didn't see a pixel's worth of the Mozilla brand there,
| hence my comment.
|
| On a second visit I noticed a link to mozilla.org in their
| footer.
|
| It still doesn't ring official to me, as a veteran Mozilla
| user (Netscape, MDN, Firefox), but ok, thanks for the
| explanation.
| daveguy wrote:
| I agree it's not very clear. They would do well to mention
| it somewhere besides the main site footer because it would
| probably help adoption / community / testing too. That
| said, any company with a lawyer wouldn't let that stand as
| a name-squat for long.
| JohnPDickerson wrote:
| Common question, thanks for asking! We're a public benefit
| corporation focused on democratizing access to AI tech, on
| enabling non-AI experts to benefit from and control their own
| AI tools, and on empowering the open source AI ecosystem. Our
| majority shareholder is the Mozilla Foundation - the other
| shareholders being our employees, soon :). As access to
| knowledge and people shifts due to AI, we're working to make
| sure people retain choice, ownership, privacy, and dignity.
|
| We're very small compared to the Mozilla mothership, but moving
| quickly to support open source AI in any way we can.
___________________________________________________________________
(page generated 2025-07-22 23:01 UTC)