https://github.com/mpaepper/content-chatbot

mpaepper / content-chatbot
Build a chatbot or Q&A bot of your website's content
Latest commit: 81ec99d "Added images" by mpaepper, Mar 21, 2023 (8 commits)

Files:
* imgs: Added images (March 21, 2023 21:27)
* README.md: Added images (March 21, 2023 21:27)
* ask_question.py: Added script to ask questions with sources returned (March 21, 2023 20:50)
* create_embeddings.py: Added argparse to customize embeddings (March 21, 2023 21:00)
* requirements.txt: Added requirements.txt (March 21, 2023 21:04)
* start_chat_app.py: Added script to use as chatbot (March 21, 2023 20:50)

README.md

Your website content -> chatbot / Q&A agent

Turn your website content into a question answering bot which can cite your document sources. Alternatively, use it in an interactive chatbot style fashion. All this can be achieved with a tool called langchain which in turn uses the OpenAI API. This simple repository showcases how to apply it on your own website content.
To do so, there are three scripts:

* create_embeddings.py: this is the main script, which loops over your website's sitemap.xml to create embeddings (vectors representing the semantics of your data) of your content.
* ask_question.py: after you have the embeddings (a file called faiss_store.pkl was created), this script can be used to directly ask a question. It will answer the question and return the URLs of your website which were used as the source.
* start_chat_app.py: starts a simple chat interface where you can ask a question and then follow up on the answer. If the bot is uncertain, it will indicate so. Note that you can tune the query in this script to be more relevant for your content. In my case, I set it to be specific to machine learning and technical topics.

To install the dependencies, simply run pip install -r requirements.txt.

Create your embeddings

[Image: overview of the embedding process: each blog post is split into N documents and each document yields a vector representation.]

This is the most important step, and you will need to obtain an OpenAI API key to use it. Once you have the key, run export OPENAI_API_KEY='$api_key' in your terminal (with $api_key replaced by your key).

Then simply run:

    python create_embeddings.py --sitemap https://path/to/your/sitemap.xml --filter https://path/to/your/blog/posts

This will create your embeddings in a file called faiss_store.pkl. You need to point the script at your website's sitemap.xml, and you can use --filter to keep only the URLs that start with the given prefix. If you want to include all pages of your site, you can just set --filter https://.

For more details about this, please check this blog post.

Answering a question while getting the answer source documents

[Image: overview of the Q&A process: first we find the closest matches of our documents from the FAISS store and then we ask the question to the GPT3 API.]

With the embeddings set up, ask a question like this:

    python ask_question.py "How to detect objects in images?"
Answer:
Object detection in images can be done using algorithms such as R-CNN, Fast R-CNN, and data augmentation techniques such as shifting, rotations, elastic deformations, and gray value variations.

Sources:
https://www.paepper.com/blog/posts/deep-learning-on-medical-images-with-u-net/
https://www.paepper.com/blog/posts/end-to-end-object-detection-with-transformers/

Starting a chatbot on your content

With the embeddings set up, start a chatbot like this: python start_chat_app.py. Then, when it's running, ask your questions and follow-ups.
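
To make the two steps described above more concrete (splitting each page into documents that each get a vector, then finding the closest matches for a question and passing them to the language model), here is a minimal, dependency-free sketch. It is not the code from this repository: the bag-of-words `embed` function stands in for OpenAI embeddings, the sorted list stands in for the FAISS store, and `PAGES`, `split_into_chunks`, `build_store`, `closest_chunks`, and `build_prompt` are hypothetical names chosen for illustration. No API call is made; the sketch only builds the prompt that would be sent.

```python
import math
import re
from collections import Counter

# Hypothetical sample pages; the real scripts read pages from a sitemap.
PAGES = {
    "https://example.com/blog/object-detection/": (
        "Object detection finds objects in images. "
        "Transformers can predict bounding boxes end to end."
    ),
    "https://example.com/blog/gardening/": (
        "Tomatoes need sun and water. "
        "Prune the plants weekly for a better harvest."
    ),
}


def split_into_chunks(text, max_words=8):
    """Split one page into chunks ("documents") of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding: word counts. A stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_store(pages):
    """Create one (vector, chunk, source_url) entry per chunk of every page."""
    store = []
    for url, text in pages.items():
        for chunk in split_into_chunks(text):
            store.append((embed(chunk), chunk, url))
    return store


def closest_chunks(store, question, k=2):
    """Return the k chunks most similar to the question, with their source URLs."""
    query = embed(question)
    ranked = sorted(store, key=lambda entry: cosine(query, entry[0]), reverse=True)
    return [(chunk, url) for _, chunk, url in ranked[:k]]


def build_prompt(question, matches):
    """Assemble the text a language model would receive (no API call here)."""
    context = "\n".join(f"Content: {chunk}\nSource: {url}" for chunk, url in matches)
    return f"{context}\n\nQuestion: {question}\nAnswer with sources:"


store = build_store(PAGES)
question = "How to detect objects in images?"
matches = closest_chunks(store, question, k=1)
prompt = build_prompt(question, matches)
print(prompt)
```

Keeping the source URL next to every chunk is what lets the answer cite its sources: whichever chunks are retrieved, their URLs travel with them into the prompt.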