https://github.com/psychic-api/rag-stack

psychic-api / rag-stack (public repository, 53 stars, 4 forks)

Deploy a private ChatGPT alternative hosted within your VPC. Connect it to your organization's knowledge base and use it as a corporate oracle. Supports open-source LLMs like Llama 2, Falcon, and GPT4All.
License: MIT license
Branches and tags: 1 branch (main), 0 tags
Latest commit: f3773f1 by Ayan Bandyopadhyay, Jul 20, 2023 ("Merge branch 'main' of https://github.com/psychic-api/rag-stack"), 39 commits in total
Files

| Name | Latest commit message | Commit time |
| containerize-llms | Update docker configs for llama build | July 19, 2023 08:02 |
| gke-cluster | Update the gcp deploy script to deploy falcon to kubernetes | July 17, 2023 17:35 |
| ragstack-ui | Add pdf parsing and connect UI to server | July 20, 2023 09:58 |
| server | Add pdf parsing and connect UI to server | July 20, 2023 09:58 |
| .gitignore | Update Gitignore | July 16, 2023 23:00 |
| CONTRIBUTING.md | Add initial server scaffolding | July 13, 2023 16:07 |
| LICENSE | Initial commit | July 12, 2023 14:48 |
| README.md | Update README.md | July 20, 2023 09:27 |
| deploy-gcp.sh | Update deploy script with hf api token variable | July 19, 2023 00:59 |
| main.tf | Update deploy script with hf api token variable | July 19, 2023 00:59 |
| run-dev.sh | add sample text, update readme, update UI | July 18, 2023 20:38 |
| sample.txt | add sample text, update readme, update UI | July 18, 2023 20:38 |

RAGstack

Retrieval Augmented Generation (RAG) is a technique where the capabilities of a large language model (LLM) are augmented by retrieving information from other systems and inserting it into the LLM's context window via a prompt. This gives LLMs information beyond what was provided in their training data, which is necessary for almost every enterprise use case. Examples include data from current web pages, data from SaaS apps like Confluence or Salesforce, and data from documents like sales contracts and PDFs.

RAG works better than fine-tuning the model because it is cheaper and faster, and it is more reliable since the source of information is provided with each response.
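The retrieve-then-prompt flow described above can be sketched in a few lines of Python. This is a minimal illustration and not RAGstack's implementation: the word-count similarity is a toy stand-in for the embedding search that a vector database performs, and the names `tokenize`, `cosine`, `retrieve`, and `build_prompt`, the prompt wording, and the sample documents are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words; a toy stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k documents most similar to the question."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: cosine(q, tokenize(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, passages: list[str]) -> str:
    """Insert the retrieved passages into the prompt ahead of the question."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the context below and cite the passage number.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical knowledge base and question, for illustration only.
documents = [
    "The sales contract with Acme renews on 1 March.",
    "Qdrant stores vectors and returns nearest neighbours.",
    "Office plants are watered on Fridays.",
]
question = "When does the Acme sales contract renew?"
passages = retrieve(question, documents, top_k=1)
prompt = build_prompt(question, passages)
print(prompt)  # this string is what would be sent to the LLM
```

Numbering the passages in the prompt is one simple way to let the model point back to its source, which is the property the paragraph above credits for RAG's reliability.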
RAGstack deploys the following resources for retrieval-augmented generation:

Open-source LLM

* GPT4All: When you run locally, RAGstack downloads and deploys Nomic AI's gpt4all model, which runs on consumer CPUs.
* Falcon-7b: On the cloud, RAGstack deploys Technology Innovation Institute's falcon-7b model onto a GPU-enabled GKE cluster.
* Llama 2: On the cloud, RAGstack can also deploy the 7B parameter version of Meta's Llama 2 model onto a GPU-enabled GKE cluster.

Vector database

* Qdrant: Qdrant is an open-source, self-hostable vector database written in Rust for high performance.

Server + UI

A simple server and UI that handle PDF upload, so that you can chat over your PDFs using Qdrant and the open-source LLM of your choice.

Run locally

To run locally, run ./run-dev.sh. This downloads ggml-gpt4all-j-v1.3-groovy.bin into server/llm/local/ and runs the server, LLM, and Qdrant vector database locally. All services are ready once you see the following message:

    INFO: Application startup complete.

Deploy to Google Cloud

To deploy the RAG stack using Falcon-7B running on GPUs to your own Google Cloud instance, go through the following steps:

1. Run ./deploy-gcp.sh. This will prompt you for your GCP project ID, service account key file, and region.
2. If you get an error on the Falcon-7B deployment step, run the following commands and then run ./deploy-gcp.sh again:

       gcloud config set compute/zone YOUR-REGION-HERE
       gcloud container clusters get-credentials gpu-cluster
       kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded.yaml

   The deployment script was implemented using Terraform.
3. You can run the frontend by creating a .env file in ragstack-ui and setting VITE_SERVER_URL to the URL of the ragstack-server instance in Google Cloud Run.
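The .env file from step 3 can also be generated with a small script. The following is a minimal sketch, not part of the repository: the helper name `write_ui_env` and the example URL are hypothetical, and the only details taken from the steps above are the variable name VITE_SERVER_URL and the ragstack-ui directory. The demo writes into a temporary directory so that it can be run anywhere without touching a real checkout.

```python
import tempfile
from pathlib import Path
from urllib.parse import urlparse


def write_ui_env(ui_dir: Path, server_url: str) -> Path:
    """Write ui_dir/.env with VITE_SERVER_URL set to server_url.

    Hypothetical helper. It overwrites any existing .env file in ui_dir.
    """
    parsed = urlparse(server_url)
    # Reject values that are clearly not an http(s) URL before writing anything.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an http(s) URL: {server_url!r}")
    env_path = ui_dir / ".env"
    env_path.write_text(f"VITE_SERVER_URL={server_url}\n", encoding="utf-8")
    return env_path


# Demo in a temporary directory; the URL below is a placeholder, not a real endpoint.
with tempfile.TemporaryDirectory() as tmp:
    env_file = write_ui_env(Path(tmp), "https://ragstack-server.example.com")
    print(env_file.read_text(encoding="utf-8"), end="")
```

In a real checkout you would call `write_ui_env(Path("ragstack-ui"), "<your server URL>")` from the repository root, using the URL of your own ragstack-server instance.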
Roadmap

* GPT4all support
* Falcon-7b support
* Deployment on GCP
* Llama-2-40b support
* Deployment on AWS

Credits

The code for containerizing Falcon 7B is from Het Trivedi's tutorial repo. Check out his Medium article on how to dockerize Falcon!

Contributors

* Jason Fan (@jasonwcfan)
* Ayan Bandyopadhyay (@Ayan-Bandyopadhyay)

Languages

* Python 75.2%
* TypeScript 7.8%
* HCL 6.5%
* Shell 4.7%
* Dockerfile 2.4%
* CSS 1.8%
* Other 1.6%