https://github.com/quickwit-oss/tantivy
quickwit-oss / tantivy

Tantivy is a full-text search engine library inspired by Apache Lucene and written in Rust.

License: MIT

Join the chat at https://discord.gg/MT27AG5EVE

Tantivy, the fastest full-text search engine library written in Rust

If you are looking for an alternative to Elasticsearch or Apache Solr, check out Quickwit, our distributed search engine built on top of Tantivy.

Tantivy is closer to Apache Lucene than to Elasticsearch or Apache Solr in the sense that it is not an off-the-shelf search engine server, but rather a crate that can be used to build such a search engine.

Tantivy is, in fact, strongly inspired by Lucene's design.

Benchmark

The following benchmark breaks down performance for different types of queries/collections.

Your mileage WILL vary depending on the nature of queries and their load.

[Figure: search benchmark results]

Details about the benchmark can be found at this repository.

Features

* Full-text search
* Configurable tokenizer (stemming available for 17 Latin languages) with third-party support for Chinese (tantivy-jieba and cang-jie), Japanese (lindera, Vaporetto, and tantivy-tokenizer-tiny-segmenter) and Korean (lindera + lindera-ko-dic-builder)
* Fast (check out the benchmark)
* Tiny startup time (<10ms), perfect for command-line tools
* BM25 scoring (the same as Lucene)
* Natural query language (e.g. (michael AND jackson) OR "king of pop")
* Phrase query search (e.g. "michael jackson")
* Incremental indexing
* Multithreaded indexing (indexing English Wikipedia takes < 3 minutes on my desktop)
* Mmap directory
* SIMD integer compression when the platform/CPU includes the SSE2 instruction set
* Single valued and multivalued u64, i64, and f64 fast fields (equivalent of doc values in Lucene)
* &[u8] fast fields
* Text, i64, u64, f64, dates, ip, bool, and hierarchical facet fields
* Compressed document store (LZ4, Zstd, None)
* Range queries
* Faceted search
* Configurable indexing (optional term frequency and position indexing)
* JSON Field
* Aggregation Collector: histogram, range buckets, average, and stats metrics
* LogMergePolicy with deletes
* Searcher Warmer API
* Cheesy logo with a horse

Non-features

Distributed search is out of the scope of Tantivy, but if you are looking for this feature, check out Quickwit.

Getting started

Tantivy works on stable Rust and supports Linux, macOS, and Windows.

* Tantivy's simple search example
* tantivy-cli and its tutorial - tantivy-cli is an actual command-line interface that makes it easy for you to create a search engine, index documents, and search via the CLI or a small server with a REST API. The tutorial walks you through getting a Wikipedia search engine up and running in a few minutes.
* Reference doc for the last released version

How can I support this project?

There are many ways to support this project.

* Use Tantivy and tell us about your experience on Discord or by email (paul.masurel@gmail.com)
* Report bugs
* Write a blog post
* Help with documentation by asking questions or submitting PRs
* Contribute code (you can join our Discord server)
* Talk about Tantivy around you

Contributing code

We use the GitHub Pull Request workflow: reference a GitHub ticket and/or include a comprehensive commit message when opening a PR. Feel free to update CHANGELOG.md with your contribution.

Tokenizer

When implementing a tokenizer for tantivy, depend on the tantivy-tokenizer-api crate.

Clone and build locally

Tantivy compiles on stable Rust. To check out and run tests, you can simply run:

    git clone https://github.com/quickwit-oss/tantivy.git
    cd tantivy
    cargo test

Companies Using Tantivy

Etsy, Nuclia, Humanfirst.ai, Element.io

FAQ

Can I use Tantivy in other languages?

* Python - tantivy-py
* Ruby - tantiny

You can also find other bindings on GitHub, but they may be less maintained.

What are some examples of Tantivy use?

* seshat: A matrix message database/indexer
* tantiny: Tiny full-text search for Ruby
* lnx: adaptable, typo tolerant search engine with a REST API
* and more!

On average, how much faster is Tantivy compared to Lucene?

* According to our search latency benchmark, Tantivy is approximately 2x faster than Lucene.

Does tantivy support incremental indexing?

* Yes.

How can I edit documents?

* Data in tantivy is immutable. To edit a document, the document needs to be deleted and reindexed.

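The delete-and-reindex answer above can be illustrated with a small toy model. The sketch below is not Tantivy's API: `ToyIndex`, `delete_by_id`, `update`, and `edit_demo` are hypothetical names, and the store is a plain in-memory vector that uses only the Rust standard library. It only shows the idea that an "edit" never rewrites stored data; it marks the old entry as deleted and appends a new one. In Tantivy itself, the usual approach is to delete by a term that uniquely identifies the document (the `IndexWriter` exposes `delete_term`), add the new version, and commit.

```rust
/// Toy stand-in for an append-only document store.
/// NOT Tantivy's API: every name in this block is hypothetical.
struct ToyIndex {
    docs: Vec<(String, String)>, // (id, body); entries are never modified in place
    deleted: Vec<bool>,          // tombstones, parallel to `docs`
}

impl ToyIndex {
    fn new() -> Self {
        ToyIndex { docs: Vec::new(), deleted: Vec::new() }
    }

    /// Append a new document; existing slots are left untouched.
    fn add(&mut self, id: &str, body: &str) {
        self.docs.push((id.to_string(), body.to_string()));
        self.deleted.push(false);
    }

    /// Mark every live document with this id as deleted; returns how many were marked.
    fn delete_by_id(&mut self, id: &str) -> usize {
        let mut count = 0;
        for (slot, (doc_id, _)) in self.docs.iter().enumerate() {
            if doc_id.as_str() == id && !self.deleted[slot] {
                self.deleted[slot] = true;
                count += 1;
            }
        }
        count
    }

    /// "Edit" a document: delete the old version, then index the new one.
    fn update(&mut self, id: &str, body: &str) {
        self.delete_by_id(id);
        self.add(id, body);
    }

    /// Bodies of all documents that have not been deleted, in insertion order.
    fn live_bodies(&self) -> Vec<String> {
        self.docs
            .iter()
            .zip(&self.deleted)
            .filter(|(_, dead)| !**dead)
            .map(|((_, body), _)| body.clone())
            .collect()
    }
}

/// Returns (number of stored slots, bodies of live documents) after one edit.
fn edit_demo() -> (usize, Vec<String>) {
    let mut index = ToyIndex::new();
    index.add("1", "michael jackson");
    index.add("2", "king of pop");
    index.update("1", "michael jackson, king of pop");
    (index.docs.len(), index.live_bodies())
}
```

Calling `edit_demo()` leaves three stored slots but only two live documents: the original version of document "1" is still physically present, just marked as deleted. This is also why the model needs a stable identifier per document; without one there is nothing to delete by.
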
When will my documents be searchable during indexing?

* Documents will be searchable after a commit is called on an IndexWriter. Existing IndexReaders will also need to be reloaded in order to reflect the changes. Finally, changes are only visible to newly acquired Searcher.
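
The visibility rules from the FAQ (commit on the writer, reload on the reader, then acquire a new searcher) can be modelled as snapshots. This is a minimal sketch under stated assumptions, using only the Rust standard library: `ToyWriter`, `ToyReader`, `ToySearcher`, and `visibility_demo` are hypothetical names and do not mirror Tantivy's real types or signatures. Each searcher holds the snapshot that was current when it was acquired, which is why an older searcher keeps seeing the old state.

```rust
use std::sync::Arc;

/// Toy writer: buffers documents until `commit` publishes a new snapshot.
/// NOT Tantivy's API: every name in this block is hypothetical.
struct ToyWriter {
    pending: Vec<String>,
    committed: Arc<Vec<String>>,
}

impl ToyWriter {
    fn new() -> Self {
        ToyWriter { pending: Vec::new(), committed: Arc::new(Vec::new()) }
    }

    /// Buffer a document; it is not visible to any reader yet.
    fn add(&mut self, doc: &str) {
        self.pending.push(doc.to_string());
    }

    /// Publish all buffered documents as a new immutable snapshot.
    fn commit(&mut self) {
        let mut next = (*self.committed).clone();
        next.extend(self.pending.drain(..));
        self.committed = Arc::new(next);
    }
}

/// Toy reader: holds the snapshot it saw at open time or at the last reload.
struct ToyReader {
    snapshot: Arc<Vec<String>>,
}

impl ToyReader {
    fn open(writer: &ToyWriter) -> Self {
        ToyReader { snapshot: Arc::clone(&writer.committed) }
    }

    /// Pick up the latest committed snapshot.
    fn reload(&mut self, writer: &ToyWriter) {
        self.snapshot = Arc::clone(&writer.committed);
    }

    /// Acquire a searcher pinned to the reader's current snapshot.
    fn searcher(&self) -> ToySearcher {
        ToySearcher { snapshot: Arc::clone(&self.snapshot) }
    }
}

/// Toy searcher: a fixed view that never changes after it is acquired.
struct ToySearcher {
    snapshot: Arc<Vec<String>>,
}

impl ToySearcher {
    fn num_docs(&self) -> usize {
        self.snapshot.len()
    }
}

/// Document counts seen: before commit, after commit but before reload,
/// by a searcher acquired before the reload, and by one acquired after it.
fn visibility_demo() -> [usize; 4] {
    let mut writer = ToyWriter::new();
    let mut reader = ToyReader::open(&writer);

    writer.add("michael jackson");
    let before_commit = reader.searcher().num_docs();

    writer.commit();
    let before_reload = reader.searcher().num_docs();

    let old_searcher = reader.searcher();
    reader.reload(&writer);
    let stale = old_searcher.num_docs();
    let fresh = reader.searcher().num_docs();

    [before_commit, before_reload, stale, fresh]
}
```

In this model `visibility_demo()` yields `[0, 0, 0, 1]`: the document becomes visible only to the searcher acquired after both the commit and the reload. The design choice it highlights is that searchers are cheap, fixed views, so long-running code should acquire a new one per query rather than hold one indefinitely.
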