[HN Gopher] OpenAI compatibility
       ___________________________________________________________________
        
       OpenAI compatibility
        
       Author : Casteil
       Score  : 164 points
       Date   : 2024-02-08 20:36 UTC (2 hours ago)
        
 (HTM) web link (ollama.ai)
 (TXT) w3m dump (ollama.ai)
        
       | theogravity wrote:
        | Isn't LangChain supposed to provide abstractions so that 3rd
        | parties don't need to conform to OpenAI's API contract?
       | 
       | I know not everyone uses LangChain, but I thought that was one of
       | the primary use-cases for it.
        
         | minimaxir wrote:
         | Which just then creates lock-in for LangChain's abstractions.
        
           | ludwik wrote:
            | Which are pretty awful btw - every project at my job that
            | started with LangChain openly regrets it. The abstractions,
            | instead of making hard things easy, tend to make easy
            | things hard (and hard to debug and maintain).
        
             | phantompeace wrote:
             | What are some better options?
        
               | minimaxir wrote:
               | Not using an abstraction at all and avoiding the
               | technical debt it causes.
        
               | hospitalJail wrote:
               | Don't use langchain, just make the calls?
               | 
                | It's what I ended up doing.
        
               | dragonwriter wrote:
                | Have a fairly thin layer that wraps the underlying LLM
               | behind a common API (e.g., Ollama as being discussed
               | here, Oobabooga, etc.) and leaves the application-level
               | stuff for the application rather than a framework like
               | LangChain.
               | 
               | (Better for certain use cases, that is, I'm not saying
               | LangChain doesn't have uses.)
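The thin-wrapper idea above can be sketched in a few lines. Everything here (the class name, endpoint URL, and model name) is a made-up illustration; the only thing taken from the thread is the OpenAI-style chat payload shape:

```python
import json
import urllib.request

# A minimal sketch of the "thin wrapper" idea: one small class that hides
# only the transport details of an OpenAI-compatible endpoint, leaving all
# application-level logic to the application.
class ChatBackend:
    def __init__(self, base_url: str, model: str):
        self.base_url = base_url.rstrip("/")
        self.model = model

    def build_request(self, prompt: str) -> dict:
        # OpenAI-style chat payload: a model name plus a messages list.
        return {
            "model": self.model,
            "messages": [{"role": "user", "content": prompt}],
        }

    def chat(self, prompt: str) -> str:
        # POST to the /chat/completions route of whatever backend this is.
        req = urllib.request.Request(
            f"{self.base_url}/chat/completions",
            data=json.dumps(self.build_request(prompt)).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            body = json.load(resp)
        return body["choices"][0]["message"]["content"]

# Swapping backends is then a one-line change of URL/model:
local = ChatBackend("http://localhost:11434/v1", "llama2")
payload = local.build_request("Why is the sky blue?")
```

Pointing the same class at a hosted endpoint instead of localhost is the entire migration story, which is the appeal of keeping the layer thin.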
        
       | Implicated wrote:
       | Love it! Ollama has been such a wonderful project (at least, for
       | me).
        
       | patelajay285 wrote:
       | We've been working on a project that provides this sort of easy
       | swapping between open source (via HF, VLLM) & commercial models
       | (OpenAI, Google, Anthropic, Together) in Python:
       | https://github.com/datadreamer-dev/DataDreamer
       | 
       | It's a little bit easier to use if you want to do this without an
       | HTTP API, directly in Python.
        
       | thedangler wrote:
       | Is Ollama model I can use locally to use for my own project and
       | keep my data secure?
        
         | MOARDONGZPLZ wrote:
         | I would not explicitly count on that. I'm a big fan of Ollama
         | and use it every day but they do have some dark patterns that
         | make me question a usecase where data security is a
         | requirement. So I don't use it where that is something that's
         | important.
        
           | mbernstein wrote:
           | Examples?
        
           | slimsag wrote:
           | like what? If you're gonna accuse a project of shady stuff,
           | at least give examples :)
        
             | MOARDONGZPLZ wrote:
             | The same examples given every time ollama is posted. Off
             | the top of my head the installer silently adds login items
             | with no way to opt out, spawns persistent processes in the
             | background in addition to the application with unclear
             | purposes, no info on install about the install itself,
             | doesn't let you back out of the installer when it requests
             | admin access. Basically lots of dark patterns in the non-
             | standard installer.
             | 
              | Reminds me of how Zoom got its start with the "growth
              | hacking" of the installation. Not enough to keep me from
              | using it, but enough to keep me from using it for
              | anything serious or secure.
        
               | v3ss0n wrote:
               | Show me the code
        
               | MOARDONGZPLZ wrote:
                | Install it on macOS. Observe for yourself. This is a
               | repeated problem mentioned in every thread. If you need
               | help on the part about checking to see how many processes
               | are running, let me know and I can assist. The rest are
               | things you will observe, step by step, during the install
               | process.
        
               | Patrick_Devine wrote:
               | These are some fair points. There definitely wasn't an
               | intention of "growth hacking", but just trying to get a
               | lot of things done with only a few people in a short
               | period of time. Requiring admin access really sucks
               | though and is something we've wanted to get rid of for a
               | while.
        
           | v3ss0n wrote:
           | Opensource project so you can find evidence of foul play .
           | Prove it or it is bs
        
         | jasonjmcghee wrote:
          | Ollama is an easy way to run local models on Mac/Linux. See
          | https://ollama.ai - they have a web UI and a terminal/server
          | approach.
        
       | swyx wrote:
       | I know a few people privately unhappy that openai api
       | compatibility is becoming a community standard. Apart from some
       | awkwardness around data.choices.text.response and such
       | unnecessary defensive nesting in the schema, I don't really have
       | complaints.
       | 
       | wonder what pain points people have around the API becoming a
       | standard, and if anyone has taken a crack at any alternative
       | standards that people should consider.
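For reference, the nesting being discussed is `choices[0].message.content` in the chat completion response. A minimal sketch of that shape, with all values made up for illustration:

```python
# An OpenAI-style chat completion response, roughly. Every value here is
# invented; only the field names and nesting reflect the schema under
# discussion.
response = {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "The sky is blue because..."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21},
}

# Three levels of digging just to get the text out:
text = response["choices"][0]["message"]["content"]
```

The `choices` list exists to support `n > 1` completions per request, which is where most of the "defensive nesting" comes from.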
        
         | minimaxir wrote:
          | That's why it's good as an _option_ to minimize friction and
          | reduce lock-in to OpenAI's moat.
        
         | Patrick_Devine wrote:
         | TBH, we debated about this a lot before adding it. It's weird
         | being beholden to someone else's API which can dictate what
         | features we should (or shouldn't) be adding to our own project.
         | If we add something cool/new/different to Ollama will people
         | even be able to use it since there isn't an equivalent thing in
         | the OpenAI API?
        
           | minimaxir wrote:
           | That's more of a marketing problem than a technical problem.
           | If there is indeed a novel use case with a good demo example
           | that's not present in OpenAI's API, then people will use it.
           | And if it's _really_ novel, OpenAI will copy it into their
           | API and thus the problem is no longer an issue.
           | 
           | The power of open source!
        
             | Patrick_Devine wrote:
             | You're right that it's a marketing problem, but it's also a
             | technical problem. If tooling/projects are built around the
             | compat layer it makes it really difficult to consume those
             | features without having to rewrite a lot of stuff. It also
             | places a cognitive burden on developers to know which API
             | to use. That might not sound like a lot, but one of the
             | guiding principles around the project (and a big part of
             | its success) is to keep the user experience as simple as
             | possible.
        
           | satellite2 wrote:
            | At some point (probably in the relatively near future), will
            | there be an AI Consortium (AIC) to decide what enters the
            | common API?
        
         | simonw wrote:
         | I want it to be documented.
         | 
         | I'm fine with it emerging as a community standard if there's a
         | REALLY robust specification for what the community considers to
         | be "OpenAI API compatible".
         | 
         | Crucially, that standard needs to stay stable even if OpenAI
         | have released a brand new feature this morning.
         | 
         | So I want the following:
         | 
         | - A very solid API specification, including error conditions
         | 
         | - A test suite that can be used to check that new
         | implementations conform to that specification
         | 
         | - A name. I want to know what it means when software claims to
         | be "compatible with OpenAI-API-Spec v3" (for example)
         | 
         | Right now telling me something is "OpenAI API compatible"
         | really isn't enough information. Which bits of that API? Which
         | particular date-in-time was it created to match?
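A sketch of what one small piece of the test suite described above might look like. The function and its rules are hypothetical; the required fields are taken from the chat-completion response shape:

```python
# Hypothetical conformance check for one endpoint of an "OpenAI API
# compatible" server: validate that a chat-completion response carries the
# fields clients universally depend on.
def check_chat_completion(resp: dict) -> list[str]:
    """Return a list of problems; an empty list means the response conforms."""
    problems = []
    for key in ("id", "object", "choices"):
        if key not in resp:
            problems.append(f"missing top-level key: {key}")
    for i, choice in enumerate(resp.get("choices", [])):
        msg = choice.get("message", {})
        if "role" not in msg or "content" not in msg:
            problems.append(f"choices[{i}].message missing role/content")
        if "finish_reason" not in choice:
            problems.append(f"choices[{i}] missing finish_reason")
    return problems

ok = {"id": "x", "object": "chat.completion",
      "choices": [{"message": {"role": "assistant", "content": "hi"},
                   "finish_reason": "stop"}]}
bad = {"choices": [{"message": {"content": "hi"}}]}
```

A real suite would also need to pin streaming behavior and error bodies, which is exactly why a named, versioned spec matters more than a point-in-time imitation.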
        
       | bulbosaur123 wrote:
        | Has anyone actually tested it with the GPT-4 API to see how well
        | it performs?
        
         | minimaxir wrote:
         | That's not what this announcement is: it's an I/O schema for
         | OSS local LLMs.
        
       | shay_ker wrote:
        | Is Ollama effectively a dockerized HTTP server that calls
        | llama.cpp directly? With the exception of this newly added OpenAI
        | API ;)
        
       | behnamoh wrote:
        | ollama seems to be taking a page from the langchain book: develop
        | something open source and get it popular enough to attract VC
        | money.
       | 
       | I never liked ollama, maybe because ollama builds on llama.cpp (a
       | project I truly respect) but adds so much marketing bs.
       | 
       | For example, the @ollama account on twitter keeps shitposting on
       | every possible thread to advertise ollama. The other day someone
       | posted something about their Mac setup and @ollama said: "You can
       | run ollama on that Mac."
       | 
        | I don't like it when 500+ people are working tirelessly on
        | llama.cpp and then guys like langchain, ollama, etc. reap the
        | benefits.
        
         | slimsag wrote:
         | Make something better, then. (I'm not being dismissive, I
         | really genuinely mean it - please do)
         | 
         | I don't know who is behind Ollama and don't really care about
         | them. I can agree with your disgust for VC 'open source'
         | projects. But there's a reason they become popular and get
         | investment: because they are valuable to people, and people use
         | them.
         | 
         | If Ollama was just a wrapper over llama.cpp, then everyone
         | would just use llama.cpp.
         | 
         | It's not just marketing, either. Compare the README of
         | llama.cpp to the Ollama homepage, notice the stark contrast of
         | how difficult getting llama.cpp connected to some dumb JS app
         | is compared to Ollama. That's why it becomes valuable.
         | 
          | The same thing happened with Docker: we're just now barely
          | getting a viable alternative (Podman Desktop) after Docker as
          | a company imploded, and even then it still suffers from major
          | instability on e.g. modern Macs.
         | 
         | The sooner open source devs in general learn to make their
         | projects usable by an average developer, the sooner it will be
         | competitive with these VC-funded 'open source' projects.
        
           | behnamoh wrote:
            | llama.cpp already has an OpenAI-compatible API.
            | 
            | It takes literally one line to install it (git clone and then
            | make).
            | 
            | It takes one line to run the server, as shown in their
            | examples/server README:
            | 
            |     ./server -m <model> [additional arguments, e.g. --mlock]
        
           | homarp wrote:
           | >notice the stark contrast of how difficult getting llama.cpp
           | connected to some dumb JS app is compared to Ollama.
           | 
           | Sorry, I'm new to ollama 'ecosystem'.
           | 
            | From the llama.cpp readme, I ctrl-F-ed "Node.js:
            | withcatai/node-llama-cpp" and from there, I got to
            | https://withcatai.github.io/node-llama-cpp/guide/
           | 
           | Can you explain how ollama does it 'easier' ?
        
         | FanaHOVA wrote:
         | ggml is also VC backed, so that has nothing to do with it.
        
       | samstave wrote:
       | I believe that for Human History, ANY FN THING posted re: LLMs,
       | AI, etc MUST submit an ELI5 Submission Statement.... as there are
        | people growing up and need the NOT BOFH recipe for BFYTW.
        | (Because Fuck You; That's why)
        
       | tosh wrote:
       | I wonder why ollama didn't namespace the path (e.g. under
       | "/openai") but in any case this is great for interoperability.
        
       | syntaxing wrote:
        | Wow, perfect timing. I personally love it. There are so many
        | projects out there that use OpenAI's API whether you like it or
        | not. I wanted to try the unit test writer notebook that OpenAI
        | has, but with Ollama. It was such a pain in the ass to fix that I
        | just didn't bother since it was just for fun. Now it should be a
        | two-line code change.
        
       | slimsag wrote:
        | Useful! At work we are building a better version of Copilot, and
        | support bringing your own LLM. Recently I've been adding an
        | 'OpenAI compatible' backend: provide any OpenAI-compatible API
        | endpoint and tell us which model to treat it as, and we can
        | format prompts, stop sequences, respect max tokens, etc.
        | according to the semantics of that model.
       | 
       | I've been needing something exactly like this to test against in
       | local dev environments :) Ollama having this will make my life /
       | testing against the myriad of LLMs we need to support way, way
       | easier.
       | 
       | Seems everyone is centralizing behind OpenAI API compatibility,
       | e.g. there is OpenLLM and a few others which implement the same
       | API as well.
        
       | ilaksh wrote:
       | I think it's a little misleading to say it's compatible with
       | OpenAI because I expect function or tool calling when you say
       | that.
       | 
       | It's nice that you have the role and content thing but that was
       | always fairly trivial to implement.
       | 
       | When it gets to agents you do need to execute actions. In the
       | agent hosting system I started, I included a scripting engine,
       | which makes me think that maybe I need to set up security and
       | permissions for the agent system and just let it run code. Which
       | is what I started.
       | 
        | So I guess I am not sure I really need the function/tool calling.
        | But if a bunch of people actually are standardizing on tool
        | calls, then maybe I need it in my framework just because it will
        | be expected, even if I have arbitrary script execution.
        
         | minimaxir wrote:
         | The documentation is upfront about which features are excluded:
         | https://github.com/ollama/ollama/blob/main/docs/openai.md
         | 
          | Function calling/tool choice is done at the application level,
          | and currently there's no standard format; the popular ones
          | are essentially inefficient bespoke system prompts:
          | https://github.com/langchain-ai/langchain/blob/master/libs/l...
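The "bespoke system prompt" approach to tool calling can be sketched like this. The prompt wording and tool definitions below are made up for illustration and are not any library's actual format:

```python
import json

# Prompt-based tool calling: describe the tools in the system prompt, ask
# the model to answer with JSON, then parse and dispatch yourself. The
# TOOLS registry and prompt text here are hypothetical.
TOOLS = {
    "get_weather": {"description": "Get weather for a city", "args": ["city"]},
}

def build_system_prompt(tools: dict) -> str:
    lines = ['You can call these tools. Reply ONLY with JSON like '
             '{"tool": "<name>", "args": {...}}.']
    for name, spec in tools.items():
        lines.append(f"- {name}({', '.join(spec['args'])}): {spec['description']}")
    return "\n".join(lines)

def parse_tool_call(reply: str):
    # The model's reply is just text; we have to hope it is valid JSON.
    call = json.loads(reply)
    if call.get("tool") not in TOOLS:
        raise ValueError(f"unknown tool: {call.get('tool')}")
    return call["tool"], call["args"]

# A real reply would come back over the chat API; this one is hard-coded.
tool, args = parse_tool_call('{"tool": "get_weather", "args": {"city": "Oslo"}}')
```

The fragility is visible in `parse_tool_call`: nothing but the prompt guarantees the model emits parseable JSON, which is why these schemes burn tokens on verbose instructions.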
        
         | osigurdson wrote:
         | It makes obvious sense to anyone with experience with OpenAI
         | APIs.
        
         | ianbicking wrote:
         | I was drawn to Gemini Pro because it had function/tool
         | calling... but it works terribly. (I haven't tried Gemini Ultra
         | yet; unclear if it's available by API?)
         | 
         | Anyway, probably best that they didn't release support that
         | doesn't work.
        
       | lolpanda wrote:
        | The compatibility layer can also be built into libraries. For
        | example, LangChain has llm(), which can work with multiple LLM
        | backends. Which do you prefer?
        
         | mise_en_place wrote:
         | Before OpenAI released their app I was using langchain in a
         | system that I built. It was a very simple SMS interface to
         | LLMs. I preferred working with langchain's abstractions over
         | directly interfacing with the GPT4 API.
        
         | Szpadel wrote:
          | but this means you need each library to support each llm, and I
          | think this is the same issue as with object storage, where
          | basically everyone supports an S3-compatible API
          | 
          | it's great to have some standard API even if it isn't
          | perfect, but having a second API that lets you use the full
          | potential (like B2 for backblaze) is also fine
          | 
          | so there isn't one model that fits all, and if your model has
          | different capabilities, then imo you should provide both
          | options
        
           | SOLAR_FIELDS wrote:
           | This is hopefully much better than the s3 situation due to
           | its simplicity. Many offerings that say "s3 compatible api"
           | often mean "we support like 30% of api endpoints". Granted
           | often the most common stuff is supported and some stuff in
           | the s3 api really only makes sense in AWS, but a good hunk of
           | the s3 api is just hard or annoying to implement and a lot of
           | vendors just don't bother. Which ends up being rather
           | annoying because you'll pick some vendor and try to use an s3
           | client with it only to find out you can't because of the 10%
           | of the calls your client needs to make that are unsupported.
        
       | osigurdson wrote:
       | Smart. When they do come, will the embedding vectors be OpenAI
       | compatible? I assume this is quite hard to do.
        
         | dragonwriter wrote:
          | Probably not; embedding vectors aren't compatible across
          | different embedding models, and other tools presenting OAI-
          | compatible APIs don't use OAI-compatible embedding models
          | (e.g., oobabooga lets you configure different embedding
          | models, but none of them produce vectors compatible with the
          | OAI ones.)
        
         | minimaxir wrote:
         | Embeddings as an I/O schema are just text-in, a list of numbers
         | out. There are very few embedding models which require enough
         | preprocessing to warrant an abstraction. (A soft example is the
         | new nomic-embed-text-v1, which requires adding prefix
          | annotations: https://huggingface.co/nomic-ai/nomic-embed-text-v1 )
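The text-in, numbers-out schema (and why vectors from different models can't be mixed) in a toy sketch; the vectors and dimensions below are made up:

```python
import math

# Embeddings as an I/O schema: text in, list of floats out. Vectors are
# only comparable within one model; cosine similarity below assumes both
# came from the same (made-up) 3-dimensional model.
def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Pretend these came back from the same embedding model:
v1 = [0.1, 0.9, 0.2]
v2 = [0.1, 0.8, 0.3]
sim = cosine_similarity(v1, v2)

# A vector from a *different* model typically has a different dimension,
# so comparing it to v1 isn't just inaccurate - it's not even defined:
v_other_model = [0.5, 0.5]  # e.g. a toy 2-dim model
```

This is why swapping embedding models behind an OAI-compatible API forces a re-embed of any stored vectors, even though the request/response shape stays identical.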
        
       | ptrhvns wrote:
        | FYI: the Linux installation script for Ollama works in the
        | "standard" style for tooling these days:
        | 
        |     curl https://ollama.ai/install.sh | sh
        | 
        | However, the last time I checked, that script asks for root-level
        | privileges via sudo. So, if you want the tool, you may want to
        | download the script and have a look at it, or modify it depending
        | on your needs.
        
         | riffic wrote:
         | we have package managers in this day and age, lol.
        
           | jampekka wrote:
           | Sadly most of them kinda suck, especially for packagers.
        
           | jazzyjackson wrote:
           | do package managers make promises that they only distribute
           | code that's been audited to not pwn you? I'm not sure I see
           | the difference if I decided I'm going to run someone's
           | software whether I install it with sudo apt install vs sudo
           | curl | bash
        
             | n_plus_1_acc wrote:
              | You are already trusting the maintainers of your distro by
              | running software they compiled, if you installed _anything_
              | via the package manager. So it's about the number of
              | people.
        
               | jazzyjackson wrote:
               | ok, so, I think i am trusting fewer people if I just run
               | the bash script provided by the people whose software i
               | want to run
        
       | lxe wrote:
       | Does ollama support loaders other than llamacpp? I'm using
       | oobabooga with exllama2 to run exl2 quants on a dual NVIDIA gpu,
       | and nothing else seems to beat performance of it.
        
       | ultrasaurus wrote:
       | The improvements in ease of use for locally hosting LLMs over the
       | last few months have been amazing. I was ranting about how easy
       | https://github.com/Mozilla-Ocho/llamafile is just a few hours ago
       | [1]. Now I'm torn as to which one to use :)
       | 
        | 1: Quite literally hours ago:
        | https://euri.ca/blog/2024-llm-self-hosting-is-easy-now/
        
       | init0 wrote:
        | Trying out the openai client; am I missing something?
        | 
        |     import OpenAI from 'openai'
        | 
        |     const openai = new OpenAI({
        |       baseURL: 'http://localhost:11434/v1',
        |       apiKey: 'ollama', // required but unused
        |     })
        | 
        |     const chatCompletion = await openai.chat.completions.create({
        |       model: 'llama2',
        |       messages: [{ role: 'user', content: 'Why is the sky blue?' }],
        |     })
        | 
        |     console.log(chatCompletion.choices[0].message.content)
        | 
        | I am getting the below error:
        | 
        |     return new NotFoundError(status, error, message, headers);
        |            ^
        |     NotFoundError: 404 404 page not found
        
         | xena wrote:
         | Remove the v1
        
       ___________________________________________________________________
       (page generated 2024-02-08 23:00 UTC)