https://github.com/pulzeai-oss/knn-router/tree/main/deploy/pulze-intent-v0.1

Skip to content

Navigation Menu

Toggle navigation
 
Sign in

  * Product
      +  
        Actions
        Automate any workflow
      +  
        Packages
        Host and manage packages
      +  
        Security
        Find and fix vulnerabilities
      +  
        Codespaces
        Instant dev environments
      +  
        Copilot
        Write better code with AI
      +  
        Code review
        Manage code changes
      +  
        Issues
        Plan and track work
      +  
        Discussions
        Collaborate outside of code
    Explore
      + All features
      + Documentation
      + GitHub Skills
      + Blog
  * Solutions
    For
      + Enterprise
      + Teams
      + Startups
      + Education
    By Solution
      + CI/CD & Automation
      + DevOps
      + DevSecOps
    Resources
      + Learning Pathways
      + White papers, Ebooks, Webinars
      + Customer Stories
      + Partners
  * Open Source
      +  
        GitHub Sponsors
        Fund open source developers
      +  
        The ReadME Project
        GitHub community articles
    Repositories
      + Topics
      + Trending
      + Collections
  * Pricing

Search or jump to...

Search code, repositories, users, issues, pull requests...

Search
[                    ]
Clear

Search syntax tips

Provide feedback

We read every piece of feedback, and take your input very seriously.

[                    ] [ ] Include my email address so I can be
contacted
Cancel Submit feedback

Saved searches

Use saved searches to filter your results more quickly

Name [                    ] 
Query [                    ]

To see all available qualifiers, see our documentation.

Cancel Create saved search
Sign in
Sign up
You signed in with another tab or window. Reload to refresh your
session. You signed out in another tab or window. Reload to refresh
your session. You switched accounts on another tab or window. Reload
to refresh your session. Dismiss alert
{{ message }}
pulzeai-oss / knn-router Public

  * Notifications
  * Fork 1
  * Star 13
  * 

  * Code
  * Issues 0
  * Pull requests 0
  * Discussions
  * Actions
  * Projects 0
  * Security
  * Insights

Additional navigation options

  * Code
  * Issues
  * Pull requests
  * Discussions
  * Actions
  * Projects
  * Security
  * Insights

Files

 main

Breadcrumbs

 1. knn-router
 2. /deploy

/

pulze-intent-v0.1

/

Directory actions

[                    ]

More options

Directory actions

More options

Latest commit

 

History

History
 
 main

Breadcrumbs

 1. knn-router
 2. /deploy

/

pulze-intent-v0.1

/
Top

Folders and files

       Name                Name          Last commit     Last commit
                                           message          date
parent directory

 
..
.gitignore          .gitignore                           

README.md           README.md                            

docker-compose.yml  docker-compose.yml                   

k8s.yaml            k8s.yaml                             

View all files

README.md

pulze-intent-v0.1 (model, dataset)

 

Intent-tuned LLM router that selects the best LLM for a user query.

Usage

 

Local

 

Fetch artifacts from Huggingface:

huggingface-cli download pulze/intent-v0.1 --local-dir .dist --local-dir-use-symlinks=False

Start the services:

docker compose up -d --build

curl -s 127.0.0.1:8888/ \
    -X POST \
    -d '{"query":"give me instructions for making ramen at home"}' \
    -H 'Content-Type: application/json' | jq .

Output:

{
  "hits": [
    {
      "id": "0c571369-e985-41e1-b14b-3620c4bb40b5",
      "category": "writing_cooking_recipe",
      "similarity": 0.8069034
    },
    {
      "id": "9f44d3c0-95f5-43cc-a881-6e23adf9c68b",
      "category": "writing_cooking_recipe",
      "similarity": 0.778615
    },
    {
      "id": "21292586-73f3-4bf7-9ada-ae6917d4cd74",
      "category": "writing_cooking_recipe",
      "similarity": 0.77417636
    },
    {
      "id": "edd39535-f9b7-4188-a56e-055846d0ba23",
      "category": "writing_cooking_recipe",
      "similarity": 0.772714
    },
    {
      "id": "3cc563e1-1816-4ff1-8d2e-60fb9392c6de",
      "category": "writing_cooking_recipe",
      "similarity": 0.76833653
    },
    {
      "id": "15c9e10f-d217-418a-b367-a16e3cd3a541",
      "category": "writing_cooking_recipe",
      "similarity": 0.76015425
    },
    {
      "id": "ad33a141-269f-4a88-b99c-456cf67d9221",
      "category": "writing_cooking_recipe",
      "similarity": 0.75983727
    },
    {
      "id": "d6ee2a78-3b7a-44d6-9778-adc1e1f9a3db",
      "category": "writing_cooking_recipe",
      "similarity": 0.75918543
    },
    {
      "id": "afa1f32e-e69e-4d75-9a73-7a11b0259a24",
      "category": "writing_cooking_recipe",
      "similarity": 0.7565732
    },
    {
      "id": "5a569973-3934-49f6-8901-11f6b490a6cd",
      "category": "writing_cooking_recipe",
      "similarity": 0.7564193
    }
  ],
  "scores": [
    {
      "target": "gpt-3.5-turbo-0125",
      "score": 0.83
    },
    {
      "target": "command-r-plus",
      "score": 0.93
    },
    {
      "target": "llama-3-70b-instruct",
      "score": 0.95
    },
    {
      "target": "gpt-4-turbo-2024-04-09",
      "score": 0.96
    },
    {
      "target": "dbrx-instruct",
      "score": 0.91
    },
    {
      "target": "mixtral-8x7b-instruct",
      "score": 0.91
    },
    {
      "target": "mistral-small",
      "score": 0.9
    },
    {
      "target": "mistral-large",
      "score": 0.91
    },
    {
      "target": "mistral-medium",
      "score": 0.89
    },
    {
      "target": "claude-3-opus-20240229",
      "score": 0.91
    },
    {
      "target": "claude-3-sonnet-20240229",
      "score": 0.9
    },
    {
      "target": "command-r",
      "score": 0.88
    },
    {
      "target": "claude-3-haiku-20240307",
      "score": 0.89
    }
  ]
}

Kubernetes

 

See this example.

Models

 

  * claude-3-haiku-20240307
  * claude-3-opus-20240229
  * claude-3-sonnet-20240229
  * command-r
  * command-r-plus
  * dbrx-instruct
  * gpt-3.5-turbo-0125
  * gpt-4-turbo-2024-04-09
  * llama-3-70b-instruct
  * mistral-large
  * mistral-medium
  * mistral-small
  * mixtral-8x7b-instruct

Data

 

Prompts and Intent Categories

 

Prompt and intent categories are derived from the GAIR-NLP/Auto-J
scenario classification dataset.

Citation:

@article{li2023generative,
  title={Generative Judge for Evaluating Alignment},
  author={Li, Junlong and Sun, Shichao and Yuan, Weizhe and Fan, Run-Ze and Zhao, Hai and Liu, Pengfei},
  journal={arXiv preprint arXiv:2310.05470},
  year={2023}
}

Response Evaluation

 

Candidate model responses were evaluated pairwise using openai/
gpt-4-turbo-2024-04-09, with the following prompt:

You are an expert, impartial judge tasked with evaluating the quality of responses generated by two AI assistants.

Think step by step, and evaluate the responses, <response1> and <response2> to the instruction, <instruction>. Follow these guidelines:
- Avoid any position bias and ensure that the order in which the responses were presented does not influence your judgement
- Do not allow the length of the responses to influence your judgement - a concise response can be as effective as a longer one
- Consider factors such as adherence to the given instruction, helpfulness, relevance, accuracy, depth, creativity, and level of detail
- Be as objective as possible

Make your decision on which of the two responses is better for the given instruction from the following choices:
If <response1> is better, use "1".
If <response2> is better, use "2".
If both answers are equally good, use "0".
If both answers are equally bad, use "0".

<instruction>
{INSTRUCTION}
</instruction>

<response1>
{RESPONSE1}
</response1>

<response2>
{RESPONSE2}
</response2>

Each pair of models is subject to 2 matches, with the positions of
the respective responses swapped in the evaluation prompt. A model is
considered a winner only if it wins both matches.

For each prompt, we then compute Bradley-Terry scores for the
respective models using the same method as that used in the LMSYS
Chatbot Arena Leaderboard. Finally, we normalize all scores to a
scale from 0 to 1 for interoperability with other weighted ranking
systems.

Model

 

The embedding model was generated by first fine-tuning BAAI/
bge-base-en-v1.5 with the intent categories from the dataset above,
using contrastive learning with cosine similarity loss, and
subsequently merging the resultant model with the base model at a 3:2
ratio.

Footer

 (c) 2024 GitHub, Inc.

Footer navigation

  * Terms
  * Privacy
  * Security
  * Status
  * Docs
  * Contact
  * Manage cookies
  * Do not share my personal information

You can't perform that action at this time.