https://github.com/phiresky/ripgrep-all
rga: ripgrep, but also search in PDFs, E-Books, Office documents, zip, tar.gz, etc.

rga is a line-oriented search tool that allows you to look for a regex in a multitude of file types. rga wraps the awesome ripgrep and enables it to search in pdf, docx, sqlite, jpg, movie subtitles (mkv, mp4), etc.

For more detail, see this introductory blogpost: https://phiresky.github.io/blog/2019/rga--ripgrep-for-zip-targz-docx-odt-epub-jpg/

rga will recursively descend into archives and match text in every file type it knows.
Here is an example directory with different file types:

    demo/
    +-- greeting.mkv
    +-- hello.odt
    +-- hello.sqlite3
    +-- somearchive.zip
    +-- dir
    |   +-- greeting.docx
    |   +-- inner.tar.gz
    |   +-- greeting.pdf
    +-- greeting.epub

Integration with fzf

See the wiki for instructions on integrating rga with fzf.

INSTALLATION

Linux x64, macOS and Windows binaries are available in GitHub Releases.

Linux

Arch Linux

    pacman -S ripgrep-all

Nix

    nix-env -iA nixpkgs.ripgrep-all

Debian-based

Download the rga binary and get the dependencies like this:

    apt install ripgrep pandoc poppler-utils ffmpeg

If ripgrep is not included in your package sources, get it from here.

rga will search for all binaries it calls in $PATH and in the directory it is itself in.

Windows

Note that installing via chocolatey or scoop is the only supported download method. If you download the binary from releases manually, you will not get the dependencies (for example pdftotext from poppler).

If you get an error like VCRUNTIME140.DLL could not be found, you need to install vc_redist.x64.exe.

Chocolatey

    choco install ripgrep-all

Scoop

    scoop install rga

Homebrew/Linuxbrew

rga can be installed with Homebrew:

    brew install rga

To install the dependencies, none of which is strictly necessary but all of which are very useful:

    brew install pandoc poppler ffmpeg

MacPorts

rga can also be installed on macOS via MacPorts:

    sudo port install ripgrep-all

Compile from source

rga should compile with stable Rust (v1.75.0+, check with rustc --version). To build it, run the following (or the equivalent in your OS):

    ~$ apt install build-essential pandoc poppler-utils ffmpeg ripgrep cargo
    ~$ cargo install --locked ripgrep_all
    ~$ rga --version    # this should work now

Available Adapters

rga works with adapters that adapt various file formats. It comes with a few adapters integrated:

    rga --rga-list-adapters

You can also add custom adapters. See the wiki for more information.
Adapters:

* pandoc
  Uses pandoc to convert binary/unreadable text documents to plain markdown-like text
  Runs: pandoc --from= --to=plain --wrap=none --markdown-headings=atx
  Extensions: .epub, .odt, .docx, .fb2, .ipynb, .html, .htm

* poppler
  Uses pdftotext (from poppler-utils) to extract plain text from PDF files
  Runs: pdftotext - -
  Extensions: .pdf
  Mime Types: application/pdf

* postprocpagebreaks
  Adds the page number to each line for an input file that specifies page breaks as ascii page break character.
  Mainly to be used internally by the poppler adapter.
  Extensions: .asciipagebreaks

* ffmpeg
  Uses ffmpeg to extract video metadata/chapters, subtitles, lyrics, and other metadata
  Extensions: .mkv, .mp4, .avi, .mp3, .ogg, .flac, .webm

* zip
  Reads a zip file as a stream and recurses down into its contents
  Extensions: .zip, .jar
  Mime Types: application/zip

* decompress
  Reads compressed file as a stream and runs a different extractor on the contents.
  Extensions: .als, .bz2, .gz, .tbz, .tbz2, .tgz, .xz, .zst
  Mime Types: application/gzip, application/x-bzip, application/x-xz, application/zstd

* tar
  Reads a tar file as a stream and recurses down into its contents
  Extensions: .tar

* sqlite
  Uses sqlite bindings to convert sqlite databases into a simple plain text format
  Extensions: .db, .db3, .sqlite, .sqlite3
  Mime Types: application/x-sqlite3

The following adapters are disabled by default, and can be enabled using '--rga-adapters=+foo,bar':

* mail
  Reads mailbox/mail files and runs extractors on the contents and attachments.
  Extensions: .mbox, .mbx, .eml
  Mime Types: application/mbox, message/rfc822

USAGE:
    rga [RGA OPTIONS] [RG OPTIONS] PATTERN [PATH ...]

FLAGS:
    --rga-accurate
        Use more accurate but slower matching by mime type

        By default, rga will match files using file extensions. Some programs, such as sqlite3, don't care about the file extension at all, so users sometimes use any or no extension at all.
        With this flag, rga will try to detect the mime type of input files using the magic bytes (similar to the `file` utility), and use that to choose the adapter. Detection is only done on the first 8KiB of the file, since we can't always seek on the input (in archives).

    --rga-no-cache
        Disable caching of results

        By default, rga caches the extracted text, if it is small enough, to a database in ${XDG_CACHE_DIR-~/.cache}/ripgrep-all on Linux, ~/Library/Caches/ripgrep-all on macOS, or C:\Users\username\AppData\Local\ripgrep-all on Windows. This way, repeated searches on the same set of files will be much faster. If you pass this flag, all caching will be disabled.

    -h, --help
        Prints help information

    --rga-list-adapters
        List all known adapters

    --rga-print-config-schema
        Print the JSON Schema of the configuration file

    --rg-help
        Show help for ripgrep itself

    --rg-version
        Show version of ripgrep itself

    -V, --version
        Prints version information

OPTIONS:
    --rga-adapters=...
        Change which adapters to use and in which priority order (descending)

        "foo,bar" means use only adapters foo and bar.
        "-bar,baz" means use all default adapters except for bar and baz.
        "+bar,baz" means use all default adapters and also bar and baz.

    --rga-cache-compression-level=
        ZSTD compression level to apply to adapter outputs before storing in cache db

        Ranges from 1 - 22 [default: 12]

    --rga-config-file=

    --rga-max-archive-recursion=
        Maximum nestedness of archives to recurse into [default: 5]

    --rga-cache-max-blob-len=
        Max compressed size to cache

        Longest byte length (after compression) to store in cache. Longer adapter outputs will not be cached and recomputed every time.

        Allowed suffixes on command line: k M G [default: 2000000]

    --rga-cache-path=
        Path to store cache db [default: /home/phire/.cache/ripgrep-all]

-h shows a concise overview, --help shows more detail and advanced options.

All other options not shown here are passed directly to rg, especially [PATTERN] and [PATH ...]
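The three forms accepted by --rga-adapters, as described in the help text above, can be illustrated with a small Python sketch. This is not rga's actual implementation (rga is written in Rust). The function name select_adapters and the two list constants are hypothetical; the adapter names are copied from the adapter list above. Placing "+" additions after the defaults is an assumption of this sketch, since the help text does not say where they land in the priority order.

```python
# Adapter names taken from the adapter list in this README.
DEFAULT_ADAPTERS = [
    "pandoc", "poppler", "postprocpagebreaks", "ffmpeg",
    "zip", "decompress", "tar", "sqlite",
]
DISABLED_BY_DEFAULT = ["mail"]


def select_adapters(spec):
    """Return the adapters to use, in priority order, for a spec string.

    Hypothetical helper that mirrors the documented semantics:
      "foo,bar"  -> only foo and bar, in the order given
      "-bar,baz" -> all defaults except bar and baz
      "+bar,baz" -> all defaults, and also bar and baz
    """
    known = set(DEFAULT_ADAPTERS) | set(DISABLED_BY_DEFAULT)
    mode = spec[0] if spec and spec[0] in "+-" else ""
    names = [n.strip() for n in spec[len(mode):].split(",") if n.strip()]

    unknown = [n for n in names if n not in known]
    if unknown:
        raise ValueError("unknown adapter(s): " + ", ".join(unknown))

    if mode == "-":
        return [a for a in DEFAULT_ADAPTERS if a not in names]
    if mode == "+":
        # Assumption: extra adapters are appended after the defaults.
        extra = [n for n in names if n not in DEFAULT_ADAPTERS]
        return DEFAULT_ADAPTERS + extra
    return names


if __name__ == "__main__":
    print(select_adapters("poppler,zip"))  # ['poppler', 'zip']
    print(select_adapters("-zip,tar"))
    print(select_adapters("+mail"))
```

On the real command line the same specs are passed as, for example, --rga-adapters=+mail or --rga-adapters=-zip,tar.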
Config

The config file location leverages the mechanisms defined by

* the XDG base directory and the XDG user directory specifications on Linux (ex: ~/.config/ripgrep-all/config.jsonc)
* the Known Folder API on Windows (ex: C:\Users\Alice\AppData\Roaming\ripgrep-all\config.jsonc)
* the Standard Directories guidelines on macOS (ex: ~/Library/Application Support/ripgrep-all/config.jsonc)

Development

To enable debug logging:

    export RUST_LOG=debug
    export RUST_BACKTRACE=1

Also remember to disable caching with --rga-no-cache or clear the cache (~/Library/Caches/ripgrep-all on macOS, ~/.cache/ripgrep-all on other Unixes, or C:\Users\username\AppData\Local\ripgrep-all on Windows) to debug the adapters.

Nix and Direnv

You can use the provided flake.nix to set up all build- and run-time dependencies:

1. Enable Flakes in your Nix configuration.
2. Add direnv to your profile: nix profile install nixpkgs#direnv
3. cd into the directory where you have cloned this repository.
4. Allow use of .envrc: direnv allow
5. After the dependencies have been installed, your shell will have all of the necessary development dependencies.
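The per-platform config file locations listed in the Config section can be sketched as a small lookup. This is only an illustration in Python, not how rga resolves the path internally. The function name config_path and its parameters are hypothetical. On Windows the sketch reads the APPDATA environment variable as a stand-in for the Known Folder API, and on Linux it assumes the XDG default of ~/.config when XDG_CONFIG_HOME is unset.

```python
def config_path(platform, env, home):
    """Return the expected config.jsonc location as a string.

    platform: "linux", "darwin" or "win32" (the values of sys.platform)
    env:      mapping of environment variables, e.g. os.environ
    home:     the user's home directory
    """
    if platform == "linux":
        # XDG base directory: $XDG_CONFIG_HOME, falling back to ~/.config
        base = env.get("XDG_CONFIG_HOME") or home + "/.config"
        return base + "/ripgrep-all/config.jsonc"
    if platform == "darwin":
        # macOS Standard Directories: ~/Library/Application Support
        return home + "/Library/Application Support/ripgrep-all/config.jsonc"
    if platform == "win32":
        # Assumption: APPDATA points at the roaming AppData known folder.
        base = env.get("APPDATA") or home + "\\AppData\\Roaming"
        return base + "\\ripgrep-all\\config.jsonc"
    raise ValueError("unsupported platform: " + platform)


if __name__ == "__main__":
    import os
    import sys

    print(config_path(sys.platform, os.environ, os.path.expanduser("~")))
```

The path printed is only where a config file would be expected; --rga-config-file, listed in the options above, is the documented way to point rga at a specific file.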