[HN Gopher] Use the Gemini API with OpenAI Fallback in TypeScript
       ___________________________________________________________________
        
       Use the Gemini API with OpenAI Fallback in TypeScript
        
       Author : l5870uoo9y
       Score  : 68 points
       Date   : 2025-04-04 09:41 UTC (3 days ago)
        
 (HTM) web link (sometechblog.com)
 (TXT) w3m dump (sometechblog.com)
        
       | bearjaws wrote:
        | The Vercel AI SDK abstracts over all the major LLMs, including
        | locally running ones. It even handles file attachments well,
        | which is something people are using more and more.
       | 
       | https://sdk.vercel.ai/docs/introduction
       | 
        | It uses zod for types and validation. I've loved using it to
        | make my apps swap between models easily.
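        | 
        | A rough sketch of the swap (assuming the @ai-sdk/openai and
        | @ai-sdk/google provider packages; model ids are illustrative):
        | 
        |     import { generateText } from "ai";
        |     import { openai } from "@ai-sdk/openai";
        |     import { google } from "@ai-sdk/google";
        | 
        |     // Same call site; only the provider/model value changes.
        |     const model = process.env.USE_GEMINI
        |       ? google("gemini-1.5-flash")
        |       : openai("gpt-4o-mini");
        | 
        |     const { text } = await generateText({
        |       model,
        |       prompt: "Summarize this in one sentence: ...",
        |     });
        |     console.log(text);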
        
         | urbandw311er wrote:
         | That's a good spot. Is it open source or is it paid software?
         | I've been using Braintrust Proxy for this until now.
        
           | bearjaws wrote:
           | https://github.com/vercel/ai
           | 
            | It's Apache 2.0 licensed.
        
         | refulgentis wrote:
         | Locally running, like, llama.cpp? Or Python?
         | 
         | Either way I guess.
         | 
         | I would have thought this was impossible, I contribute to
         | llama.cpp and there's an awful lot of per-model ugliness to
         | make things work, even just in terms of "get it the tool calls
         | in the form it expects."
         | 
          |  _cries at the Phi-4 PR in the other window that I'm still
          | working on, and discovering new things, 4 weeks later_
        
           | bearjaws wrote:
           | I am not sure about llama.cpp but it works with Ollama.
           | 
            | That's not to say you won't need to tweak things when you
            | cut down to smaller models; there are always trade-offs
            | when swapping models.
        
           | dragonwriter wrote:
           | > Locally running, like, llama.cpp? Or Python?
           | 
            | I'd guess it supports a small set of popular HTTP APIs. In
            | particular, it's very common for self-hosting LLM toolkits
            | (as opposed to the per-model reference implementations or
            | low-level console frontends) to present an OpenAI-compatible
            | API, so you could support a very wide range of local models,
            | through a variety of toolkits, just by supporting the OpenAI
            | API with configurable endpoint addresses.
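            | 
            | For example, a sketch assuming Ollama's OpenAI-compatible
            | endpoint on its default port:
            | 
            |     import OpenAI from "openai";
            | 
            |     // Same client, just a different base URL.
            |     const client = new OpenAI({
            |       baseURL: "http://localhost:11434/v1",
            |       apiKey: "ollama", // ignored by Ollama, required by the SDK
            |     });
            | 
            |     const res = await client.chat.completions.create({
            |       model: "llama3.1",
            |       messages: [{ role: "user", content: "Hello" }],
            |     });
            |     console.log(res.choices[0].message.content);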
        
           | BoorishBears wrote:
           | At some point someone tried a toy direct integration, but the
           | only actually supported way is via a Python library that
           | wraps llama.cpp in an OpenAI compatible API endpoint.
        
         | hu3 wrote:
         | > npm install ai
         | 
         | How is this allowed/possible in npm? Don't they have mandatory
         | namespaces?
        
           | jay-barronville wrote:
           | > > npm install ai
           | 
           | > How is this allowed/possible in npm? Don't they have
           | mandatory namespaces?
           | 
            | No, scopes (_i.e._, namespaces) aren't mandatory for public
            | packages on npm. If a name is available, it's yours.
            | Sometimes folks have published packages that are empty or
            | unused, so you can reach out and ask them to pass the name
            | on to you. At least a few times, someone has reached out to
            | me about a package I published years ago and did nothing
            | with, and I passed it on to them.
        
             | danillonunes wrote:
              | Also, sometimes you can go directly to npm and ask for an
              | existing name. They may give it to you, and the previous
              | owner may get pissed and remove his other packages that
              | are a dependency of basically everything, breaking the
              | build scripts of the whole internet.
        
         | Garbage wrote:
          | I found that [LangChain](https://www.langchain.com/langchain)
          | has pretty good abstractions and better support for multiple
          | LLMs. They also have a good ecosystem of supporting products:
          | LangGraph and LangSmith. The currently supported languages
          | are Python and JavaScript.
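          | 
          | A rough sketch of what that looks like on the JS side
          | (assuming recent versions of the @langchain/openai and
          | @langchain/google-genai packages):
          | 
          |     import { ChatOpenAI } from "@langchain/openai";
          |     import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
          | 
          |     // Both models share the same chat-model interface.
          |     const primary = new ChatGoogleGenerativeAI({ model: "gemini-1.5-flash" });
          |     const fallback = new ChatOpenAI({ model: "gpt-4o-mini" });
          | 
          |     async function ask(prompt: string) {
          |       try {
          |         return (await primary.invoke(prompt)).content;
          |       } catch {
          |         return (await fallback.invoke(prompt)).content;
          |       }
          |     }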
        
           | senko wrote:
           | My problem with LangChain (aside from dubious API choices,
           | some a legacy of when it first started) is that now it's a
           | marketing tool for LangGraph Platform and LangSmith.
           | 
            | Their docs (incl. getting-started tutorials) are content
            | marketing for the platform services, needlessly pushing new
            | users in a more complex and possibly unnecessary direction
            | in the name of user acquisition.
           | 
           | (I have the same beef with NextJs/Vercel and MongoDB).
           | 
            | Some time ago I built a rather thin wrapper for LLMs
            | (multi-provider incl. local, templates, tools, RAG, etc.)
            | for myself. Sensible API, small so it's easy to maintain,
            | and as a bonus, no platform marketing shoved in my face.
           | 
           | I keep an eye on what the LangChain ecosystem's doing, and so
           | far the benefits ain't really there (for me, YMMV).
        
             | mark_l_watson wrote:
              | I agree that the LangChain docs and examples shouldn't
              | rely on their commercial platform products. The LangGraph
              | and LangSmith documentation should then layer on top of
              | the LangChain docs.
        
           | svachalek wrote:
           | Langchain is nice but it's psychotic in TypeScript.
           | Everything is "any" wrapped in layers of complex
           | parameterized types.
        
         | swyx wrote:
         | agree that TFA's advice is not really useful when open source
         | libraries like these exist
        
         | xmorse wrote:
         | There is also ai-fallback [0] to automatically switch to a
         | fallback provider in case of downtime
         | 
         | [0]: https://github.com/remorses/ai-fallback
        
         | freedomben wrote:
         | Just in case it's helpful to anyone, I recently spoke to a very
         | respected consultant about this, and he also recommended
         | Vercel's AI SDK. We haven't tried it yet but plan to.
        
       | ilrwbwrkhv wrote:
       | Typescript looks so ugly visually. It gives me PHP vibes. I think
       | it's the large words at the first column of the eye line:
       | 
       | export const
       | 
       | function
       | 
       | type
       | 
       | return
       | 
       | etc
       | 
       | This makes scanning through the code really hard because your eye
       | has to jump horizontally.
        
         | sprobertson wrote:
         | which of those words are large?
        
         | AcquiescentWolf wrote:
         | Ah yes, such large words like const, function, or return, that
         | only exist in TypeScript and PHP.
        
         | triyambakam wrote:
         | So, like, which programming language do you think is not ugly?
         | J? K?
        
         | dcre wrote:
          | Silly comment, but I concede this bit is very ugly -- they
          | should have extracted the inner type to an alias and used
          | that twice.
          | 
          |     options: [
          |       Omit<ChatCompletionParseParams, 'model'> & { model: Model },
          |       Omit<ChatCompletionParseParams, 'model'> & { model: Model },
          |     ],
          | 
          | like this:
          | 
          |     type Options = Omit<ChatCompletionParseParams, 'model'> & { model: Model }
          |     options: [Options, Options]
        
       | calebkaiser wrote:
        | I would recommend looking at OpenRouter if anyone is interested
        | in implementing fallbacks across model providers. I've been
        | using it in several projects, and the ability to swap models
        | without changing any implementation code or managing multiple
        | API keys has been incredibly nice:
       | 
       | https://openrouter.ai/docs/quickstart
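        | 
        | A minimal sketch of the fallback part (assuming OpenRouter's
        | documented `models` routing parameter; the model slugs are
        | illustrative):
        | 
        |     const res = await fetch("https://openrouter.ai/api/v1/chat/completions", {
        |       method: "POST",
        |       headers: {
        |         Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
        |         "Content-Type": "application/json",
        |       },
        |       body: JSON.stringify({
        |         // Tried in order; OpenRouter falls back if the first model fails.
        |         models: ["google/gemini-2.0-flash-001", "openai/gpt-4o-mini"],
        |         messages: [{ role: "user", content: "Hello" }],
        |       }),
        |     });
        |     const data = await res.json();
        |     console.log(data.choices[0].message.content);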
        
         | nxrabl wrote:
         | Have you measured how much latency it adds to each query?
         | Naively I'd expect adding in an extra network hop to be a
         | pretty big hit.
        
           | oaththrowaway wrote:
            | It's really pretty reasonable. I don't notice it at all.
            | Nothing as bad as trying to relay through 1min.ai or
            | something.
        
           | calebkaiser wrote:
           | Anecdotally, there's been no obvious performance hit, but
           | this is something I should test more thoroughly. I'm planning
           | on running a benchmark across a couple of proxies this week--
           | I'll post the results to HN, if anyone is curious
        
       | nextworddev wrote:
        | Just use LiteLLM.
        
       | thesandlord wrote:
        | I've been using [BAML](https://github.com/boundaryml/baml) to
        | do this, and it works really well. It lets you define multiple
        | fallback and retry policies and returns strongly typed outputs
        | from LLMs.
        
       | visarga wrote:
        | I just had a very bad experience with JSON mode on the
        | gemini-1.5-flash and 2.0-flash models using their own
        | 'google-generativeai' library. Either it can't follow the JSON
        | format correctly, or it renders string fields with no end until
        | max_tokens. Pretty bad for Gemini, when open models like Qwen
        | do a better job at a basic information-extraction-to-JSON task.
        
         | MrBuddyCasino wrote:
         | Did you provide a JSON schema? I've had good experience with
         | that.
        
         | zbrw wrote:
          | Things to note: 1) supply a JSON schema in
          | `config.response_schema`, and 2) set
          | `config.response_mime_type` to `application/json`.
          | 
          | That works for me reliably. I've had some issues with running
          | into max_tokens constraints, but that was usually on me
          | because I had let it process a large list in one inference
          | call, which resulted in very large outputs.
         | 
         | We're using gemini JSON mode in production applications with
         | both `google-generativeai` and `langchain` without issues.
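          | 
          | The equivalent knobs in the Node SDK look roughly like this
          | (a sketch, assuming the @google/generative-ai package):
          | 
          |     import { GoogleGenerativeAI, SchemaType } from "@google/generative-ai";
          | 
          |     const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
          |     const model = genAI.getGenerativeModel({
          |       model: "gemini-1.5-flash",
          |       generationConfig: {
          |         responseMimeType: "application/json",
          |         responseSchema: {
          |           type: SchemaType.OBJECT,
          |           properties: { title: { type: SchemaType.STRING } },
          |           required: ["title"],
          |         },
          |       },
          |     });
          | 
          |     const result = await model.generateContent("Extract the title: ...");
          |     console.log(JSON.parse(result.response.text()));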
        
       | _pdp_ wrote:
        | Prompts won't just work when transplanted from one model into
        | another.
        
       | DeborahEmeni_ wrote:
       | I've done something similar using OpenRouter and fallback chains
       | across providers. It's super helpful when you're hitting rate
       | limits or need different models for different payload sizes. I
       | would love to see more people share latency data, though,
       | especially when chaining Gemini + OpenAI like this.
        
       ___________________________________________________________________
       (page generated 2025-04-07 23:01 UTC)