kelindar/search

Go library for embedded vector search and semantic embeddings using llama.cpp. MIT license.

Semantic Search

This library was created to provide an easy and efficient solution for embedding and vector search, making it perfect for small to medium-scale projects that still need some serious semantic power. It's built around a simple idea: if your dataset is small enough, you can achieve accurate results with brute-force techniques, and with some smart optimizations like SIMD, you can keep things fast and lean. The library's strength lies in its simplicity and support for GGUF BERT models, letting you leverage sophisticated embeddings without getting bogged down by the complexities of traditional search systems.

It offers GPU acceleration, enabling quick computations on supported hardware. If your dataset has fewer than 100,000 entries, this library is a great fit for integrating semantic search into your Go applications with minimal hassle.

Key Features

* llama.cpp without cgo: The library is built to work with llama.cpp without using cgo. Instead, it relies on purego, which allows calling shared C libraries directly from Go code. This design significantly simplifies integration, deployment, and cross-compilation, making it easier to build Go applications that interface with native libraries.
* Support for BERT Models: The library supports BERT models via llama.cpp. A wide variety of BERT models can be used, as long as they are in GGUF format.
* Precompiled Binaries with Vulkan GPU Support: Available for Windows and Linux in the dist directory, compiled with Vulkan for GPU acceleration. You can also compile the library yourself, with or without GPU support.
* Search Index for Embeddings: The library can build a search index from computed embeddings, which can be saved to disk and loaded later. This is suitable for basic vector search in small-scale applications, but because it uses brute-force search it may become inefficient on large datasets.

Limitations

While simple vector search excels in small-scale applications, avoid using this library if you have any of the following requirements.

* Large Datasets: The current implementation is designed for small-scale applications, and datasets exceeding 100,000 entries may suffer from performance bottlenecks due to the brute-force search approach. For larger datasets, approximate nearest neighbor (ANN) algorithms and specialized data structures should be considered for efficiency.
* Complex Query Requirements: The library focuses on simple vector similarity search and does not support advanced query capabilities like multi-field filtering, fuzzy matching, or SQL-like operations that are common in more sophisticated search engines.
* High-Dimensional Complex Embeddings: Large language models (LLMs) generate embeddings that are both high-dimensional and computationally intensive. Handling these embeddings in real time can be taxing on the system unless sufficient GPU resources are available and optimized for low-latency inference.

How to Use the Library

The steps below show how to load a model, generate embeddings for text, create a search index, and perform a simple vector search.

1. Install library: Precompiled binaries for Windows and Linux are provided in the dist directory. If your target architecture or platform isn't covered by these binaries, you'll need to compile the library from source. Drop these binaries in /usr/lib or equivalent.

2. Load a model: The search.NewVectorizer function initializes a model using a GGUF file. This example loads the MiniLM-L6-v2.Q8_0.gguf model. The second parameter indicates the number of GPU layers to enable (0 for CPU only).

    m, err := search.NewVectorizer("../dist/MiniLM-L6-v2.Q8_0.gguf", 0)
    if err != nil {
        // handle error
    }
    defer m.Close()

3. Generate text embeddings: The EmbedText method generates a vector embedding for a given text input. It converts your text into a dense numerical vector using the model you loaded in the previous step.

    embedding, err := m.EmbedText("Your text here")

4. Create an index and add vectors: Create a new index using search.NewIndex. The type parameter [string] in this example specifies that each vector is associated with a string value. You can add multiple vectors with corresponding labels.

    index := search.NewIndex[string]()
    index.Add(embedding, "Your text here")

5. Search the index: Perform a search using the Search method, which takes an embedding vector and the number of results to retrieve. This example searches for the 10 most relevant results and prints them along with their relevance scores.

    results := index.Search(embedding, 10)
    for _, r := range results {
        fmt.Printf("Result: %s (Relevance: %.2f)\n", r.Value, r.Relevance)
    }

Compile library

First, clone the repository, then fetch its submodules and Git LFS files with the following commands. The --init and --recursive flags make sure the llama.cpp submodule, which bundles the ggml tensor library, is checked out along with anything nested inside it.

    git submodule update --init --recursive
    git lfs pull

Compile on Linux

Make sure you have a C/C++ compiler and CMake installed. For Ubuntu, you can install them with the following commands:

    sudo apt-get update
    sudo apt-get install build-essential cmake

Then you can compile the library with the following commands:

    mkdir build && cd build
    cmake -DBUILD_SHARED_LIBS=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_COMPILER=g++ -DCMAKE_C_COMPILER=gcc ..
    cmake --build . --config Release

This should generate libllama_go.so, which statically links everything necessary. You can also install the library by copying it into /usr/lib.

Compile on Windows

Make sure you have a C/C++ compiler and CMake installed. For Windows, a simple option is to use Build Tools for Visual Studio (make sure CLI tools are included) and CMake.

    mkdir build && cd build
    cmake -DCMAKE_BUILD_TYPE=Release ..
    cmake --build . --config Release

If you are using Visual Studio, solution files are generated. You can open the solution file with Visual Studio and build the project from there. The bin directory will then contain llamago.dll.

GPU and other options

To enable GPU support (e.g. Vulkan), you'll need to add an appropriate flag to the CMake command; refer to the llama.cpp build documentation for more details.
For example, to compile with Vulkan support on Windows, make sure the Vulkan SDK is installed and then run the following commands:

    mkdir build && cd build
    cmake -DCMAKE_BUILD_TYPE=Release -DGGML_VULKAN=ON ..
    cmake --build . --config Release