[HN Gopher] Use the Gemini API with OpenAI Fallback in TypeScript
___________________________________________________________________
Use the Gemini API with OpenAI Fallback in TypeScript
Author : l5870uoo9y
Score : 68 points
Date : 2025-04-04 09:41 UTC (3 days ago)
(HTM) web link (sometechblog.com)
(TXT) w3m dump (sometechblog.com)
| bearjaws wrote:
| The Vercel AI SDK abstracts over all LLMs, including locally
| running ones. It even handles file attachments well, which is
| something people are using more and more.
|
| https://sdk.vercel.ai/docs/introduction
|
| It uses zod for types and validation; I've loved using it to
| make my apps swap between models easily.
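|
| A minimal sketch of that kind of swap (assuming the
| @ai-sdk/openai and @ai-sdk/google provider packages and API
| keys in the environment):
|
|     import { generateObject } from 'ai';
|     import { openai } from '@ai-sdk/openai';
|     import { google } from '@ai-sdk/google';
|     import { z } from 'zod';
|
|     // Swapping providers is just a matter of passing a
|     // different model object; the rest of the call is identical.
|     const model = process.env.USE_GEMINI
|       ? google('gemini-1.5-flash')
|       : openai('gpt-4o-mini');
|
|     const { object } = await generateObject({
|       model,
|       schema: z.object({
|         title: z.string(),
|         tags: z.array(z.string()),
|       }),
|       prompt: 'Suggest a title and tags for a post about LLM fallbacks.',
|     });
|     console.log(object);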
| urbandw311er wrote:
| That's a good spot. Is it open source or is it paid software?
| I've been using Braintrust Proxy for this until now.
| bearjaws wrote:
| https://github.com/vercel/ai
|
| It's Apache 2.0 licensed.
| refulgentis wrote:
| Locally running, like, llama.cpp? Or Python?
|
| Either way I guess.
|
| I would have thought this was impossible; I contribute to
| llama.cpp and there's an awful lot of per-model ugliness to
| make things work, even just in terms of "get it the tool calls
| in the form it expects."
|
| _cries at the Phi-4 PR in the other window that I'm still
| working on, and discovering new things, 4 weeks later_
| bearjaws wrote:
| I am not sure about llama.cpp but it works with Ollama.
|
| That's not to say you won't need to tweak things when you cut
| down to smaller models; there are always trade-offs when
| swapping models.
| dragonwriter wrote:
| > Locally running, like, llama.cpp? Or Python?
|
| I'd guess it supports a small set of popular HTTP APIs
| (particularly, it's very common for self-hosting LLM toolkits
| --not the per-model reference implementations or low-level
| console frontends--to present an OpenAI-compatible API), so
| you could support a very wide range of local models, through
| a variety of toolkits, just by supporting the OpenAI API with
| configurable endpoint addresses.
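|
| In practice that looks like pointing the plain OpenAI client
| at a local server; e.g. a sketch against Ollama's default
| OpenAI-compatible endpoint (any compatible server --
| llama.cpp's server, vLLM, LM Studio -- works the same way):
|
|     import OpenAI from 'openai';
|
|     const client = new OpenAI({
|       // Ollama's default OpenAI-compatible endpoint
|       baseURL: 'http://localhost:11434/v1',
|       // ignored by local servers, but the client requires a value
|       apiKey: 'ollama',
|     });
|
|     const completion = await client.chat.completions.create({
|       model: 'llama3.1', // whatever model is pulled locally
|       messages: [{ role: 'user', content: 'Hello from a local model' }],
|     });
|     console.log(completion.choices[0].message.content);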
| BoorishBears wrote:
| At some point someone tried a toy direct integration, but the
| only actually supported way is via a Python library that
| wraps llama.cpp in an OpenAI compatible API endpoint.
| hu3 wrote:
| > npm install ai
|
| How is this allowed/possible in npm? Don't they have mandatory
| namespaces?
| jay-barronville wrote:
| > > npm install ai
|
| > How is this allowed/possible in npm? Don't they have
| mandatory namespaces?
|
| No, scopes (_i.e._, namespaces) aren't mandatory for public
| packages on npm. If a name is available, it's yours.
| Sometimes folks have published packages that are empty or
| unused, and you can reach out and ask them to pass the name
| on to you. At least a few times, someone has reached out to
| me asking for a package I'd published years ago and done
| nothing with, and I passed it on.
| danillonunes wrote:
| Also, sometimes you can go directly to npm and ask for an
| existing name, they give it to you, and the previous owner
| gets pissed and removes his other packages that are a
| dependency of basically everything, breaking the build
| scripts of the whole internet.
| Garbage wrote:
| I found that [LangChain](https://www.langchain.com/langchain)
| has pretty good abstractions and better support for multiple
| LLMs. They also have a good ecosystem of supporting products -
| LangGraph and LangSmith. Currently supported languages are
| Python and Javascript.
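|
| For example, with the JS packages (a sketch, assuming
| @langchain/openai and @langchain/google-genai are installed
| and API keys are set):
|
|     import { ChatOpenAI } from '@langchain/openai';
|     import { ChatGoogleGenerativeAI } from '@langchain/google-genai';
|
|     const gemini = new ChatGoogleGenerativeAI({ model: 'gemini-1.5-flash' });
|     const gpt = new ChatOpenAI({ model: 'gpt-4o-mini' });
|
|     // Route to Gemini first, fall back to OpenAI if the call fails.
|     const llm = gemini.withFallbacks({ fallbacks: [gpt] });
|
|     const response = await llm.invoke('Summarize RFC 2119 in one sentence.');
|     console.log(response.content);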
| senko wrote:
| My problem with LangChain (aside from dubious API choices,
| some a legacy of when it first started) is that now it's a
| marketing tool for LangGraph Platform and LangSmith.
|
| Their docs (incl. getting started tutorials) are content
| marketing for the platform services, needlessly pushing new
| users in a more complex and possibly unnecessary direction in
| the name of user acquisition.
|
| (I have the same beef with NextJs/Vercel and MongoDB).
|
| Some time ago I built a rather thin wrapper for LLMs (multi
| provider incl local, templates, tools, rag, etc...) for
| myself. Sensible API, small so it's easy to maintain, and as
| a bonus, no platform marketing shoved in my face.
|
| I keep an eye on what the LangChain ecosystem's doing, and so
| far the benefits ain't really there (for me, YMMV).
| mark_l_watson wrote:
| I agree that the LangChain docs and examples shouldn't rely
| on their commercial platform products. The LangGraph and
| LangSmith documentation should then layer on top of the
| LangChain docs.
| svachalek wrote:
| Langchain is nice but it's psychotic in TypeScript.
| Everything is "any" wrapped in layers of complex
| parameterized types.
| swyx wrote:
| agree that TFA's advice is not really useful when open source
| libraries like these exist
| xmorse wrote:
| There is also ai-fallback [0] to automatically switch to a
| fallback provider in case of downtime
|
| [0]: https://github.com/remorses/ai-fallback
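|
| If you'd rather not add a dependency, the same idea can be
| hand-rolled in a few lines on top of the AI SDK (a sketch,
| not ai-fallback's actual API):
|
|     import { generateText } from 'ai';
|     import { google } from '@ai-sdk/google';
|     import { openai } from '@ai-sdk/openai';
|
|     // Models to try, in priority order.
|     const models = [google('gemini-1.5-flash'), openai('gpt-4o-mini')];
|
|     // Try each model in turn; return the first successful result.
|     async function generateWithFallback(prompt: string) {
|       let lastError: unknown;
|       for (const model of models) {
|         try {
|           return await generateText({ model, prompt });
|         } catch (err) {
|           // e.g. rate limit or provider outage; try the next model
|           lastError = err;
|         }
|       }
|       throw lastError;
|     }
|
|     const { text } = await generateWithFallback(
|       'Explain exponential backoff in two sentences.',
|     );
|     console.log(text);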
| freedomben wrote:
| Just in case it's helpful to anyone, I recently spoke to a very
| respected consultant about this, and he also recommended
| Vercel's AI SDK. We haven't tried it yet but plan to.
| ilrwbwrkhv wrote:
| Typescript looks so ugly visually. It gives me PHP vibes. I think
| it's the large words at the first column of the eye line:
|
| export const
|
| function
|
| type
|
| return
|
| etc
|
| This makes scanning through the code really hard because your eye
| has to jump horizontally.
| sprobertson wrote:
| which of those words are large?
| AcquiescentWolf wrote:
| Ah yes, such large words like const, function, or return, that
| only exist in TypeScript and PHP.
| triyambakam wrote:
| So, like, which programming language do you think is not ugly?
| J? K?
| dcre wrote:
| Silly comment, but I concede this bit is very ugly -- they
| should have extracted the inner type to an alias and used that
| twice:
|
|     options: [
|       Omit<ChatCompletionParseParams, 'model'> & { model: Model },
|       Omit<ChatCompletionParseParams, 'model'> & { model: Model },
|     ],
|
| like this:
|
|     type Options =
|       Omit<ChatCompletionParseParams, 'model'> & { model: Model }
|
|     options: [Options, Options]
| calebkaiser wrote:
| I would recommend looking at OpenRouter if anyone is
| interested in implementing fallbacks across model providers.
| I've been using it in several projects, and the ability to
| swap models without changing any implementation code or
| managing multiple API keys has been incredibly nice:
|
| https://openrouter.ai/docs/quickstart
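|
| Since OpenRouter exposes an OpenAI-compatible API, switching
| to it is mostly a base-URL change; a sketch (the model slug is
| just an example -- check their model list for current names):
|
|     import OpenAI from 'openai';
|
|     const client = new OpenAI({
|       baseURL: 'https://openrouter.ai/api/v1',
|       apiKey: process.env.OPENROUTER_API_KEY,
|     });
|
|     const completion = await client.chat.completions.create({
|       model: 'google/gemini-flash-1.5', // example slug
|       messages: [{ role: 'user', content: 'Hello via OpenRouter' }],
|     });
|     console.log(completion.choices[0].message.content);
|
| Fallback routing across models is configured with
| OpenRouter-specific request fields on top of this; their docs
| cover the exact shape.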
| nxrabl wrote:
| Have you measured how much latency it adds to each query?
| Naively I'd expect adding in an extra network hop to be a
| pretty big hit.
| oaththrowaway wrote:
| It's really pretty reasonable. I don't notice it at all.
| Nothing as bad as trying to relay 1min.ai or something
| calebkaiser wrote:
| Anecdotally, there's been no obvious performance hit, but
| this is something I should test more thoroughly. I'm planning
| on running a benchmark across a couple of proxies this week--
| I'll post the results to HN, if anyone is curious
| nextworddev wrote:
| Just use Litellm
| thesandlord wrote:
| I've been using [BAML](https://github.com/boundaryml/baml) to
| do this, and it works really well. It lets you define multiple
| fallback and retry policies, and it returns strongly typed
| outputs from LLMs.
| visarga wrote:
| I just had a very bad experience with JSON mode on the
| gemini-1.5-flash and 2.0-flash models using their own library,
| 'google-generativeai'. They either can't follow the JSON
| format correctly, or emit string fields that never terminate
| until max_tokens. Pretty bad for Gemini, when open models like
| Qwen do a better job at a basic extract-information-to-JSON
| task.
| MrBuddyCasino wrote:
| Did you provide a JSON schema? I've had good experience with
| that.
| zbrw wrote:
| Things to note:
|
| 1) supply a JSON schema in `config.response_schema`
| 2) set `config.response_mime_type` to `application/json`
|
| That works for me reliably. I've had some issues with running
| into max_token constraints, but that was usually on me because
| I had let it process a large list in one inference call, which
| would have resulted in very large outputs.
|
| We're using Gemini JSON mode in production applications with
| both `google-generativeai` and `langchain` without issues.
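|
| In the TypeScript SDK (@google/generative-ai) the equivalent
| knobs are responseMimeType and responseSchema; roughly (a
| sketch, untested):
|
|     import {
|       GoogleGenerativeAI,
|       SchemaType,
|     } from '@google/generative-ai';
|
|     const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
|
|     const model = genAI.getGenerativeModel({
|       model: 'gemini-1.5-flash',
|       generationConfig: {
|         responseMimeType: 'application/json',
|         responseSchema: {
|           type: SchemaType.OBJECT,
|           properties: {
|             name: { type: SchemaType.STRING },
|             email: { type: SchemaType.STRING },
|           },
|           required: ['name', 'email'],
|         },
|       },
|     });
|
|     const result = await model.generateContent(
|       'Extract the contact: "Reach Ada at ada@example.com"',
|     );
|     console.log(JSON.parse(result.response.text()));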
| _pdp_ wrote:
| A prompt won't just work when transplanted from one model into
| another.
| DeborahEmeni_ wrote:
| I've done something similar using OpenRouter and fallback chains
| across providers. It's super helpful when you're hitting rate
| limits or need different models for different payload sizes. I
| would love to see more people share latency data, though,
| especially when chaining Gemini + OpenAI like this.
___________________________________________________________________
(page generated 2025-04-07 23:01 UTC)