https://github.com/psychic-api/rag-stack

psychic-api / rag-stack Public

 Deploy a private ChatGPT alternative hosted within your VPC. 
Connect it to your organization's knowledge base and use it as a
corporate oracle. Supports open-source LLMs like Llama 2, Falcon, and
GPT4All.

License

MIT license
53 stars 4 forks
Latest commit

f3773f1 (Jul 20, 2023), Ayan Bandyopadhyay: Merge branch 'main' of
https://github.com/psychic-api/rag-stack

Git stats

  * 39 commits

Files

Name                Latest commit message                                         Commit time
containerize-llms   Update docker configs for llama build                         July 19, 2023 08:02
gke-cluster         Update the gcp deploy script to deploy falcon to kubernetes   July 17, 2023 17:35
ragstack-ui         Add pdf parsing and connect UI to server                      July 20, 2023 09:58
server              Add pdf parsing and connect UI to server                      July 20, 2023 09:58
.gitignore          Update Gitignore                                              July 16, 2023 23:00
CONTRIBUTING.md     Add initial server scaffolding                                July 13, 2023 16:07
LICENSE             Initial commit                                                July 12, 2023 14:48
README.md           Update README.md                                              July 20, 2023 09:27
deploy-gcp.sh       Update deploy script with hf api token variable               July 19, 2023 00:59
main.tf             Update deploy script with hf api token variable               July 19, 2023 00:59
run-dev.sh          add sample text, update readme, update UI                     July 18, 2023 20:38
sample.txt          add sample text, update readme, update UI                     July 18, 2023 20:38

README.md

  RAGstack

Deploy a private ChatGPT alternative hosted within your VPC. Connect
it to your organization's knowledge base and use it as a corporate
oracle. Supports open-source LLMs like Llama 2, Falcon, and GPT4All.

Retrieval Augmented Generation (RAG) is a technique where the
capabilities of a large language model (LLM) are augmented by
retrieving information from other systems and inserting it into the
LLM's context window via a prompt. This gives the LLM information
beyond what was in its training data, which is necessary for almost
every enterprise use case. Examples include data from current web
pages, data from SaaS apps like Confluence or Salesforce, and data
from documents like sales contracts and PDFs.

For these use cases, RAG generally works better than fine-tuning the
model: it's cheaper and faster, and it's more reliable because the
source of the information is provided with each response.
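
As a rough illustration of "inserting retrieved information into the context window", the sketch below assembles a prompt from passages that some retrieval step has already returned. The function name `build_prompt`, the prompt wording, and the sample passages are assumptions made for this example; they are not taken from the RAGstack server code.

```python
def build_prompt(question: str, passages: list[str]) -> str:
    """Place numbered source passages ahead of the question.

    Numbering the passages lets the model refer back to them, which is
    what makes it possible to show a source with each response.
    """
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer the question using only the context below "
        "and cite the passage numbers you used.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical passages, standing in for results from a vector search.
retrieved = [
    "The sales contract with Acme renews on 1 March.",
    "Renewals require 30 days written notice.",
]
prompt = build_prompt("When does the Acme contract renew?", retrieved)
print(prompt)
```

The resulting string is what would be sent to the LLM; the model itself is unchanged, which is why this approach avoids the cost of fine-tuning.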

RAGstack deploys the following resources for retrieval-augmented
generation:

 Open-source LLM

  * GPT4All: When you run locally, RAGstack will download and deploy
    Nomic AI's gpt4all model, which runs on consumer CPUs.

  * Falcon-7b: On the cloud, RAGstack deploys Technology Innovation
    Institute's falcon-7b model onto a GPU-enabled GKE cluster.

  * Llama 2: On the cloud, RAGstack can also deploy the 7B parameter
    version of Meta's Llama 2 model onto a GPU-enabled GKE cluster.

 Vector database

  * Qdrant: Qdrant is an open-source, self-hostable vector database
    written in Rust for high performance.
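
To make the role of the vector database concrete, here is a minimal in-memory sketch of the operation it provides: store vectors with a payload, then return the nearest ones to a query by cosine similarity. This is a conceptual stand-in only. It does not use the Qdrant client or its API, and the class `TinyVectorIndex`, its methods, and the three-dimensional toy vectors are all hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyVectorIndex:
    """Brute-force nearest-neighbour search over stored vectors."""

    def __init__(self) -> None:
        self._points: list[tuple[list[float], dict]] = []

    def add(self, vector: list[float], payload: dict) -> None:
        self._points.append((list(vector), payload))

    def search(self, query: list[float], limit: int = 3) -> list[tuple[float, dict]]:
        scored = [(cosine(query, vec), payload) for vec, payload in self._points]
        scored.sort(key=lambda item: item[0], reverse=True)  # best match first
        return scored[:limit]


index = TinyVectorIndex()
index.add([1.0, 0.0, 0.0], {"text": "contract renewal terms"})
index.add([0.0, 1.0, 0.0], {"text": "office plant schedule"})
index.add([0.9, 0.1, 0.0], {"text": "contract pricing"})

hits = index.search([1.0, 0.0, 0.0], limit=2)
for score, payload in hits:
    print(round(score, 3), payload["text"])
```

A real vector database replaces the brute-force scan with an approximate nearest-neighbour index and adds persistence, which is what makes it practical for a large knowledge base.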

 Server + UI

A simple server and UI that handle PDF upload, so that you can chat
over your PDFs using Qdrant and the open-source LLM of your choice.
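
Before text extracted from a PDF can be searched, it is usually split into smaller pieces that are embedded and stored one by one. The sketch below shows one common way to do that, fixed-size character chunks with overlap. Whether the RAGstack server chunks text this way is an assumption; the function name `chunk_text` and its default sizes are illustrative only.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut
    at a boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = "abcdefghij"
print(chunk_text(sample, chunk_size=4, overlap=1))
```

Each chunk would then be embedded and written to the vector database together with a reference to the source PDF.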

 Run locally

To run locally, run ./run-dev.sh. This will download
ggml-gpt4all-j-v1.3-groovy.bin into server/llm/local/ and run the
server, LLM, and Qdrant vector database locally.

All services will be ready once you see the following message:

INFO:     Application startup complete.
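
If you want to script the wait instead of watching the terminal, a helper like the one below can scan the script's output for that message. It is a small sketch: the function name `is_ready` is hypothetical, and the sample lines other than the final startup message are made up for illustration.

```python
from typing import Iterable

READY_MARKER = "Application startup complete."


def is_ready(lines: Iterable[str], marker: str = READY_MARKER) -> bool:
    """Return True as soon as any line contains the readiness marker.

    `any` stops at the first match, so this also works on a live stream
    of output lines without reading past the marker.
    """
    return any(marker in line for line in lines)


# Illustrative output; only the last line is quoted from the README.
sample_output = [
    "starting services...",
    "downloading model...",
    "INFO:     Application startup complete.",
]
print(is_ready(sample_output))
```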

 Deploy to Google Cloud

To deploy the RAG stack using Falcon-7B running on GPUs to your own
Google Cloud project, go through the following steps:

 1. Run ./deploy-gcp.sh. This will prompt you for your GCP project
    ID, service account key file, and region.
 2. If you get an error on the Falcon-7B deployment step, run the
    following commands and then run ./deploy-gcp.sh again:

gcloud config set compute/zone YOUR-REGION-HERE
gcloud container clusters get-credentials gpu-cluster
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded.yaml

The deployment script was implemented using Terraform.

 3. You can run the frontend by creating a .env file in ragstack-ui
    and setting VITE_SERVER_URL to the URL of the ragstack-server
    instance in your Google Cloud Run.
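
The .env file from step 3 holds a single KEY=value line. The sketch below generates it; the helper names `render_env` and `write_env` are hypothetical, and `https://ragstack-server.example.invalid` is a placeholder to be replaced with the URL of your own ragstack-server deployment.

```python
from pathlib import Path


def render_env(server_url: str) -> str:
    """Return the contents of the .env file expected by ragstack-ui."""
    return f"VITE_SERVER_URL={server_url}\n"


def write_env(ui_dir: str, server_url: str) -> Path:
    """Write the .env file into the given ragstack-ui directory."""
    env_path = Path(ui_dir) / ".env"
    env_path.write_text(render_env(server_url), encoding="utf-8")
    return env_path


# Placeholder URL; substitute the address of your own server instance.
print(render_env("https://ragstack-server.example.invalid"), end="")
```

Calling `write_env("ragstack-ui", "<your server URL>")` from the repository root would create the file in place.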

 Roadmap

  *  GPT4all support
  *  Falcon-7b support
  *  Deployment on GCP
  *  Llama-2-40b support
  *  Deployment on AWS

 Credits

The code for containerizing Falcon 7B is from Het Trivedi's tutorial
repo. Check out his Medium article on how to dockerize Falcon!


Contributors 2

  * Jason Fan (@jasonwcfan)
  * Ayan Bandyopadhyay (@Ayan-Bandyopadhyay)

Languages

  * Python 75.2%
  * TypeScript 7.8%
  * HCL 6.5%
  * Shell 4.7%
  * Dockerfile 2.4%
  * CSS 1.8%
  * Other 1.6%
