https://github.com/pulzeai-oss/knn-router/tree/main/deploy/pulze-intent-v0.1

# pulze-intent-v0.1 (model, dataset)

Intent-tuned LLM router that selects the best LLM for a user query.

## Usage

### Local

Fetch artifacts from Huggingface:

```sh
huggingface-cli download pulze/intent-v0.1 --local-dir .dist --local-dir-use-symlinks=False
```

Start the services:

```sh
docker compose up -d --build
```
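Before querying, you can optionally confirm that the services came up; `docker compose ps` is standard Compose tooling, nothing repo-specific:

```sh
# Optional sanity check: list the status of the services defined in docker-compose.yml.
docker compose ps
```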
Output: { "hits": [ { "id": "0c571369-e985-41e1-b14b-3620c4bb40b5", "category": "writing_cooking_recipe", "similarity": 0.8069034 }, { "id": "9f44d3c0-95f5-43cc-a881-6e23adf9c68b", "category": "writing_cooking_recipe", "similarity": 0.778615 }, { "id": "21292586-73f3-4bf7-9ada-ae6917d4cd74", "category": "writing_cooking_recipe", "similarity": 0.77417636 }, { "id": "edd39535-f9b7-4188-a56e-055846d0ba23", "category": "writing_cooking_recipe", "similarity": 0.772714 }, { "id": "3cc563e1-1816-4ff1-8d2e-60fb9392c6de", "category": "writing_cooking_recipe", "similarity": 0.76833653 }, { "id": "15c9e10f-d217-418a-b367-a16e3cd3a541", "category": "writing_cooking_recipe", "similarity": 0.76015425 }, { "id": "ad33a141-269f-4a88-b99c-456cf67d9221", "category": "writing_cooking_recipe", "similarity": 0.75983727 }, { "id": "d6ee2a78-3b7a-44d6-9778-adc1e1f9a3db", "category": "writing_cooking_recipe", "similarity": 0.75918543 }, { "id": "afa1f32e-e69e-4d75-9a73-7a11b0259a24", "category": "writing_cooking_recipe", "similarity": 0.7565732 }, { "id": "5a569973-3934-49f6-8901-11f6b490a6cd", "category": "writing_cooking_recipe", "similarity": 0.7564193 } ], "scores": [ { "target": "gpt-3.5-turbo-0125", "score": 0.83 }, { "target": "command-r-plus", "score": 0.93 }, { "target": "llama-3-70b-instruct", "score": 0.95 }, { "target": "gpt-4-turbo-2024-04-09", "score": 0.96 }, { "target": "dbrx-instruct", "score": 0.91 }, { "target": "mixtral-8x7b-instruct", "score": 0.91 }, { "target": "mistral-small", "score": 0.9 }, { "target": "mistral-large", "score": 0.91 }, { "target": "mistral-medium", "score": 0.89 }, { "target": "claude-3-opus-20240229", "score": 0.91 }, { "target": "claude-3-sonnet-20240229", "score": 0.9 }, { "target": "command-r", "score": 0.88 }, { "target": "claude-3-haiku-20240307", "score": 0.89 } ] } Kubernetes See this example. Models * claude-3-haiku-20240307 * claude-3-opus-20240229 * claude-3-sonnet-20240229 * command-r * command-r-plus * dbrx-instruct * gpt-3.5-turbo-0125 * gpt-4-turbo-2024-04-09 * llama-3-70b-instruct * mistral-large * mistral-medium * mistral-small * mixtral-8x7b-instruct Data Prompts and Intent Categories Prompt and intent categories are derived from the GAIR-NLP/Auto-J scenario classification dataset. Citation: @article{li2023generative, title={Generative Judge for Evaluating Alignment}, author={Li, Junlong and Sun, Shichao and Yuan, Weizhe and Fan, Run-Ze and Zhao, Hai and Liu, Pengfei}, journal={arXiv preprint arXiv:2310.05470}, year={2023} } Response Evaluation Candidate model responses were evaluated pairwise using openai/ gpt-4-turbo-2024-04-09, with the following prompt: You are an expert, impartial judge tasked with evaluating the quality of responses generated by two AI assistants. Think step by step, and evaluate the responses, and to the instruction, . Follow these guidelines: - Avoid any position bias and ensure that the order in which the responses were presented does not influence your judgement - Do not allow the length of the responses to influence your judgement - a concise response can be as effective as a longer one - Consider factors such as adherence to the given instruction, helpfulness, relevance, accuracy, depth, creativity, and level of detail - Be as objective as possible Make your decision on which of the two responses is better for the given instruction from the following choices: If is better, use "1". If is better, use "2". If both answers are equally good, use "0". If both answers are equally bad, use "0". 
### Kubernetes

See the example manifest in this directory (`k8s.yaml`).

## Models

* claude-3-haiku-20240307
* claude-3-opus-20240229
* claude-3-sonnet-20240229
* command-r
* command-r-plus
* dbrx-instruct
* gpt-3.5-turbo-0125
* gpt-4-turbo-2024-04-09
* llama-3-70b-instruct
* mistral-large
* mistral-medium
* mistral-small
* mixtral-8x7b-instruct

## Data

### Prompts and Intent Categories

Prompts and intent categories are derived from the GAIR-NLP/Auto-J scenario classification dataset. Citation:

```bibtex
@article{li2023generative,
  title={Generative Judge for Evaluating Alignment},
  author={Li, Junlong and Sun, Shichao and Yuan, Weizhe and Fan, Run-Ze and Zhao, Hai and Liu, Pengfei},
  journal={arXiv preprint arXiv:2310.05470},
  year={2023}
}
```

### Response Evaluation

Candidate model responses were evaluated pairwise using openai/gpt-4-turbo-2024-04-09, with the following prompt:

```
You are an expert, impartial judge tasked with evaluating the quality of responses generated by two AI assistants.

Think step by step, and evaluate the responses, {RESPONSE1} and {RESPONSE2}, to the instruction, {INSTRUCTION}. Follow these guidelines:
- Avoid any position bias and ensure that the order in which the responses were presented does not influence your judgement
- Do not allow the length of the responses to influence your judgement - a concise response can be as effective as a longer one
- Consider factors such as adherence to the given instruction, helpfulness, relevance, accuracy, depth, creativity, and level of detail
- Be as objective as possible

Make your decision on which of the two responses is better for the given instruction from the following choices:
If {RESPONSE1} is better, use "1".
If {RESPONSE2} is better, use "2".
If both answers are equally good, use "0".
If both answers are equally bad, use "0".

{INSTRUCTION}

{RESPONSE1}

{RESPONSE2}
```

Each pair of models plays two matches, with the positions of the respective responses swapped in the evaluation prompt. A model is considered the winner only if it wins both matches. For each prompt, we then compute Bradley-Terry scores for the respective models, using the same method as the LMSYS Chatbot Arena Leaderboard. Finally, we normalize all scores to a scale from 0 to 1 for interoperability with other weighted ranking systems.

## Model

The embedding model was generated by first fine-tuning BAAI/bge-base-en-v1.5 on the intent categories from the dataset above, using contrastive learning with cosine-similarity loss, and subsequently merging the resulting model with the base model at a 3:2 ratio.
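Read as a linear weight-space merge (an assumption; the README does not specify the merge method), the 3:2 ratio corresponds to:

$$
\theta_{\text{merged}} = \tfrac{3}{5}\,\theta_{\text{fine-tuned}} + \tfrac{2}{5}\,\theta_{\text{base}}
$$

That is, 60% fine-tuned weights and 40% base weights; blending back toward the base model is a common way to retain the intent-tuned embedding geometry while limiting drift from the base model's general-purpose behavior.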