https://github.com/wolfia-app/gpt-code-search

wolfia-app / gpt-code-search (Public)

Search your codebase with natural language using AI. wolfia.com

License: Apache-2.0 · 21 stars · 4 forks

Branch: main (1 branch, 5 tags)

Latest commit: narenmanoharan, "Update data privacy section", a6a3fe3, Jun 27, 2023 (53 commits)

| Name | Latest commit message | Commit time |
| --- | --- | --- |
| .github | Reduce python requirement to 3.8.17 and above | June 26, 2023 10:49 |
| core | Handle list or dict from gpt | June 27, 2023 07:20 |
| public | Update gif | June 27, 2023 09:39 |
| .bandit | Add tests for truncate tests and run it on github workflow | June 25, 2023 22:18 |
| .gitattributes | Add git attrs for tracking gifs | June 26, 2023 22:32 |
| .gitignore | Add posthog for logging query events | June 25, 2023 13:57 |
| .pre-commit-config.yaml | Add tests for truncate tests and run it on github workflow | June 25, 2023 22:18 |
| CODE_OF_CONDUCT.md | Create CODE_OF_CONDUCT.md | June 23, 2023 12:53 |
| CONTRIBUTING.md | Handle list or dict from gpt | June 27, 2023 07:20 |
| LICENSE | Create LICENSE | June 23, 2023 12:18 |
| README.md | Update data privacy section | June 27, 2023 12:42 |
| poetry.lock | Reduce python requirement to 3.8.17 and above | June 26, 2023 10:49 |
| pyproject.toml | Bump version to v0.0.9 | June 27, 2023 07:21 |

README.md

# gpt-code-search

gpt-code-search is a tool enabling you to search your codebase with natural language. It utilizes OpenAI's function calling to retrieve, search and answer queries about your code, boosting productivity and code understanding.

Learn more about the motivation behind this project in our announcement blog post.

## Features

* GPT-4: Code search, retrieval, and answering all done with OpenAI's function calling.
* Privacy-first: Code snippets only leave your machine when you ask a question and the LLM requests the relevant code.
* Works instantly: No pre-processing, chunking, or indexing, get started right away.
* File-system backed: Works with any code on your machine.

## Getting Started

### Installation

    pip install gpt-code-search

### Usage

#### Ask a question about your codebase

To ask about the purpose of your codebase, use the query command:

    gpt-code-search query "What does this codebase do?"
    # or use the shorthand alias
    gcs query "What does this codebase do?"

[Image: gpt-code-search demo]

If you want to generate a test for a specific file, for example analytics.py, mention the file name to improve accuracy:

    gcs query "Can you generate a test for analytics.py?"

For a general usage question about a certain module, like analytics, use keywords to search across the codebase:

    gcs query "How do I use the analytics module?"

Remember, mentioning the file name or specific keywords improves the accuracy of the search.

#### Select a model to use

    gcs select-model

Defaults to gpt-3.5-turbo-16k. The selected model is stored in $HOME/.gpt-code-search/config.toml.

### Configuration

The tool will prompt you to configure the OPENAI_API_KEY if you haven't already.

## Problem

You want to leverage the power of GPT-4 to search your codebase, but you don't want to manually copy and paste code snippets into a prompt, nor send your code to another third-party service.

This tool solves these problems by letting GPT-4 determine the most relevant code snippets within your codebase. This removes the need to copy and paste or send your code to another third party. It also meets you where you already live: in your terminal, not in a new UI or window.

Examples of the types of questions you might want to ask:

* Help debugging errors and finding the relevant code and files
* Document large files or functionalities formatted as markdown
* Generate new code based on existing files and conventions
* Ask general questions about any part of the codebase

## How it works

This tool utilizes OpenAI's function calling to allow GPT to call functions in your codebase.
This enables us to automatically upload context directly from the file system on demand, without having to manually copy and paste code snippets. This also means that no code is sent to any third-party service other than OpenAI, which receives only the question you ask and the code snippets that the LLM requests.

[Image: architecture]

The functions currently available for the LLM to call are:

* search_codebase - searches the codebase using a TF-IDF vectorizer
* get_file_tree - provides the file tree of the codebase
* get_file_contents - provides the contents of a file

Combining these three functions, we can ask the LLM to search the codebase for a keyword, and then retrieve the contents of the file that contains the keyword.

And it's as simple as that!

## Privacy

This tool prioritizes privacy. No code is sent to us; code is sent only to the LLM, where it is used as context. We do collect anonymous usage data to improve the tool, but you can opt out of this.

## Limitations

This does have some limitations, namely:

* The LLM is unable to load context across multiple files at once. This means that if you ask a question that requires context from multiple files, you will need to ask multiple questions.
* Specify the file name and keywords in your question to improve accuracy. For example, if you want to ask a question about analytics.py, mention the file name in your question.
* Search and retrieval are limited by the context window, so the tool can only search 5 levels deep in the file system. Run the tool from the folder or package closest to the code you want to search.

These limitations lead to suboptimal results in a few cases, but we're working on improving this. We wanted to get this tool out there as soon as possible to get feedback and iterate on it!
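
As a rough illustration of the three functions listed under How it works, and of the 5-level depth limit, here is a minimal sketch using only the Python standard library. It is not the project's actual implementation: the parameter names (`root`, `max_depth`, `top_k`), the `dispatch` helper, and the rule for skipping hidden directories are assumptions made for this example, and a hand-rolled TF-IDF score stands in for whatever vectorizer the tool really uses.

```python
import json
import math
import os
import re
from collections import Counter
from pathlib import Path

MAX_DEPTH = 5  # assumption: mirrors the 5-level limit described above


def get_file_tree(root=".", max_depth=MAX_DEPTH):
    """Return relative file paths under root, at most max_depth levels deep."""
    root = Path(root)
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        depth = len(Path(dirpath).relative_to(root).parts)
        # Skip hidden directories, and stop descending at the depth limit.
        dirnames[:] = [] if depth + 1 >= max_depth else sorted(
            d for d in dirnames if not d.startswith("."))
        for name in sorted(filenames):
            paths.append((Path(dirpath) / name).relative_to(root).as_posix())
    return paths


def get_file_contents(path, root="."):
    """Return the text of one file, replacing undecodable bytes."""
    return (Path(root) / path).read_text(errors="replace")


def _tokens(text):
    # Identifier-like tokens, lowercased.
    return re.findall(r"[a-z_][a-z0-9_]*", text.lower())


def search_codebase(query, root=".", top_k=3):
    """Rank files by a simple TF-IDF score over the query terms."""
    docs = {}
    for path in get_file_tree(root):
        try:
            docs[path] = Counter(_tokens(get_file_contents(path, root)))
        except OSError:
            continue  # unreadable file: leave it out of the index
    scores = {}
    for term in set(_tokens(query)):
        df = sum(1 for counts in docs.values() if term in counts)
        if df == 0:
            continue
        idf = math.log((1 + len(docs)) / (1 + df)) + 1  # smoothed IDF
        for path, counts in docs.items():
            if term in counts:
                tf = counts[term] / sum(counts.values())
                scores[path] = scores.get(path, 0.0) + tf * idf
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical dispatch table: maps a function name requested by the model
# to the local Python function that serves it.
FUNCTIONS = {
    "search_codebase": search_codebase,
    "get_file_tree": get_file_tree,
    "get_file_contents": get_file_contents,
}


def dispatch(name, arguments_json):
    """Run the requested local function with JSON-encoded arguments."""
    return FUNCTIONS[name](**json.loads(arguments_json))
```

In a function-calling loop, the model's reply would carry a function name and a JSON string of arguments; something like `dispatch` would run the matching local function and send the result back as the next message. Only the output of the functions the model asks for leaves the machine, which is the privacy property described above.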

## Roadmap

* [ ] Use vector embeddings to improve search and retrieval
* [ ] Add support for generating code and saving it to a file
* [ ] Support for searching across multiple codebases
* [ ] Allow the model to create new functions that it can then execute
* [ ] Use guidance to improve prompts
* [ ] Add support for additional models (Claude, Bedrock, etc)

## Wolfia Codex

gpt-code-search is a simplified version of Wolfia Codex, a cloud tool that enables you to ask any question about open source and private code bases like Langchain, Vercel ai, or gpt-engineer.

If you're looking for a more powerful tool that addresses the limitations above with vector embeddings and a more capable search and retrieval system, or you want to avoid the setup, check out Wolfia Codex to search codebases, share your questions and answers, and more!

## Analytics

We collect anonymous crash and usage data to understand usage patterns and improve the tool.

You can opt out of analytics by running:

    gcs opt-out-of-analytics

You can check the data that is collected by looking at the analytics and config files. Here's an exhaustive list of the data we collect:

- exception - stacktraces of crashes
- uuid - a unique identifier for the user
- model - the model used for the query
- usage - the type of usage (query_count, query_at, query_execution_time)

Note: We do not collect any PII (such as IP addresses), queries, or code snippets.

## Contributing

We love contributions from the community! If you'd like to contribute, feel free to fork the repository and submit a pull request.

Please read our Code of Conduct and Contributing Guide for more detailed steps and information.

## Code of Conduct

We are committed to fostering a welcoming community. To ensure that everyone feels safe and welcome, we have a Code of Conduct that all contributors, maintainers, and users of this project are expected to adhere to.

## Support

If you're having trouble using gpt-code-search, feel free to open an issue on our GitHub. You can also reach out to us directly at support@wolfia.com. We're always happy to help!

## Feedback

Your feedback is very important to us! If you have ideas for how we can improve gpt-code-search, we'd love to hear from you. Please open an issue with your suggestions, or you can email support@wolfia.com.

## License

Apache 2.0 (c) Wolfia

About

* Topics: privacy, code, gpt-4, llm
* Watchers: 3
* Releases: 5 tags
* Contributors (2): narenmanoharan (Naren Manoharan), skovy (Spencer Miskoviak)
* Languages: Python 100.0%