[HN Gopher] Show HN: Trieve CLI - Terminal-based LLM agent loop with search tool for PDFs
       ___________________________________________________________________
        
       Show HN: Trieve CLI - Terminal-based LLM agent loop with search
       tool for PDFs
        
        Hi HN,
        
        I built a CLI for uploading documents and querying them with an
        LLM agent that uses search tools rather than stuffing everything
        into the context window. I recorded a demo using the CrossFit
        2025 rulebook that shows how this approach compares to
        traditional RAG and direct context injection[1].
        
        The core insight is that LLMs running in loops with tool access
        are unreasonably effective at this kind of knowledge retrieval
        task[2]. Instead of hoping the right chunks make it into your
        context, the agent can iteratively search, refine its queries,
        and reason about what it finds.
        
        The CLI handles the full workflow:
        
        ```bash
        trieve upload ./document.pdf
        trieve ask "What are the key findings?"
        ```
        
        You can customize the RAG behavior and check upload status, and
        responses stream back with expandable source references. I really
        enjoy having this workflow available in the terminal, and I'm
        curious whether others find this paradigm as compelling as I do.
        I'm considering adding more commands and customization options if
        there's interest. The tool is free for up to 1k document chunks.
        
        Source code is on GitHub[3] and available via npm[4]. Would love
        any feedback on the approach or CLI design!
        
        [1]: https://www.youtube.com/watch?v=SAV-esDsRUk
        [2]: https://news.ycombinator.com/item?id=43998472
        [3]: https://github.com/devflowinc/trieve/blob/main/clients/cli/i...
        [4]: https://www.npmjs.com/package/trieve-cli
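        
        For anyone curious what "agent loop with a search tool" looks
        like concretely, here is a rough TypeScript sketch of the
        pattern. It is not the CLI's actual implementation; `searchChunks`
        and `callLLM` are hypothetical stand-ins.
        
        ```typescript
        // Minimal sketch of an LLM agent loop that uses a search tool,
        // instead of one-shot retrieval stuffed into the context window.
        type Message = { role: "system" | "user" | "tool"; content: string };
        type Turn = { answer?: string; searchQuery?: string };
        
        // Hypothetical search over the uploaded document's chunks.
        async function searchChunks(query: string): Promise<string> {
          // ...call a search endpoint, return top matching chunks as text...
          return `top chunks for: ${query}`;
        }
        
        // Hypothetical chat-model call with the search tool declared;
        // the model either answers or asks for another search.
        async function callLLM(messages: Message[]): Promise<Turn> {
          // ...send the transcript, parse the model's tool call or answer...
          return { answer: "..." };
        }
        
        async function ask(question: string): Promise<string> {
          const messages: Message[] = [
            { role: "system", content: "Use the search tool; keep searching until you can answer." },
            { role: "user", content: question },
          ];
          // Bounded loop: search, read results, refine the query, repeat.
          for (let step = 0; step < 8; step++) {
            const turn = await callLLM(messages);
            if (turn.answer) return turn.answer;      // model is satisfied
            if (!turn.searchQuery) break;             // nothing left to try
            const results = await searchChunks(turn.searchQuery);
            messages.push({ role: "tool", content: results });
          }
          return "No confident answer within the step budget.";
        }
        ```
        
        The point of the loop is that the model, not a fixed retrieval
        pipeline, decides when it has found enough to answer.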
        
       Author : skeptrune
       Score  : 22 points
       Date   : 2025-06-18 13:52 UTC (9 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | jlarocco wrote:
       | I don't really understand the point of it. It seems like a very
       | shallow replacement for skimming (or god forbid reading) the
       | paper, without the benefit of absorbing any of the material
       | yourself.
       | 
       | I have the same critique for a lot of AI tools. We're replacing
       | the meaningful parts of content creation and consumption with a
       | computer so we can pass it off as having created or understood it
       | ourselves. It just seems pointless.
        
         | skeptrune wrote:
         | Sometimes you just want a quick answer to a question though. I
         | agree that tools like this aren't something I'd use to consume
         | some content I'm actually interested in or produce something I
         | think of as high quality.
         | 
         | However, I also want to flag that the cool part about the agent
         | loop is that it _feels_ less like skimming since you can watch
         | the LLM search, evaluate results, search again, evaluate
          | results, and repeat until it's happy that it has enough
         | information to actually answer.
        
         | behnamoh wrote:
         | Your comment got me thinking about what it really means to
         | understand something. Is it just about remembering the facts or
         | the ideas? Or is it more about being aware of them? I've
         | watched a ton of YouTube videos and read a bunch of articles
         | about physics, but I can't remember how to derive those
         | equations a few weeks later. So, I don't feel like I really
         | understand them. But if I have an idea about how to do it, how
         | much of it is just memory, and how much is actually
         | understanding the concept? That's been a question I've been
         | thinking about for a long time. With all the AI stuff, we've
         | figured out how to deal with the memory part, so we don't have
         | to rely on our own memories as much. But that still leaves the
         | question: what does understanding really mean?
        
           | dingnuts wrote:
           | I've thought about this a lot in the context of "why do I
           | need to learn facts when I can just look them up?"
           | 
           | Understanding a concept means you are able to use it in
           | higher order reasoning. Think about the rote practice
           | necessary to build intuition in mathematics until you're able
           | to use the concept being learned for the next concept which
           | in turn relies on it.
           | 
           | Once that intuition is built, that's understanding.
        
         | BeetleB wrote:
         | > It seems like a very shallow replacement for skimming
         | 
         | Actually, I think we have it all backwards. We're taught to
         | skim because such tools didn't exist. Once (if!) they are
         | reliable enough, skimming should become a dead art, like
         | shorthand is.
         | 
         | One should know how to read well (in detail), when one needs
         | to. Everything else can be delegated. Indeed, this is why
         | people in high positions don't skim - they can afford
         | secretaries and underlings to do the skimming for them.
        
       | westurner wrote:
       | neuml/paperai "indexes databases previously built with paperetl"
       | and does RAG with txtai; https://github.com/neuml/paperai :
       | 
       | > _paperai is a combination of a txtai embeddings index and a
       | SQLite database with the articles. Each article is parsed into
       | sentences and stored in SQLite along with the article metadata.
       | Embeddings are built over the full corpus._
       | 
       | paperai has a YAML report definition schema that's probably
       | useful for meta-analyses.
       | 
       | Paperetl can store articles with SQLite, Elasticsearch, JSON,
       | YAML. It doesn't look like markdown from a tagged git repo is
       | supported yet. Supported inputs include PDF, XML (arXiv, PubMed,
       | TEI), CSV.
       | 
        | PaperQA2 has a CLI:
        | https://github.com/Future-House/paper-qa#what-is-paperqa2 :
       | 
       | > _PaperQA2 is engineered to be the best agentic RAG model for
       | working with scientific papers._
       | 
       | > [ Semantic Scholar, CrossRef, ]
       | 
       | paperqa-zotero: https://github.com/lejacobroy/paperqa-zotero
       | 
       | The Oracle of Zotero is a fork of paper-qa with FAISS and
       | langchain: https://github.com/Frost-group/The-Oracle-of-Zotero
        
         | westurner wrote:
         | simonw/llm is a CLI for LLMs: https://github.com/simonw/llm
         | 
          | `llm --help`:
          | https://llm.datasette.io/en/stable/help.html#llm-help
         | 
         | simonw/llm plugin directory:
         | https://llm.datasette.io/en/stable/plugins/directory.html#pl...
         | 
          | From https://simonwillison.net/2024/Jun/17/cli-language-models/ :
         | 
         | > _Every prompt and response run through the LLM tool is
         | permanently logged to a SQLite database,_
        
       | it_shadow wrote:
       | Are AI agents the new Todo list app? Everyone and their
       | grandmother is creating one!
       | 
        | How is this one better than, for example, syncing our Google
        | Drive with ChatGPT and 'asking questions'?
        
       ___________________________________________________________________
       (page generated 2025-06-18 23:00 UTC)