[HN Gopher] ThalamusDB: Query text, tables, images, and audio
___________________________________________________________________
ThalamusDB: Query text, tables, images, and audio
Author : itrummer
Score : 50 points
Date : 2025-10-07 19:34 UTC (4 days ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| tarwich wrote:
| What a cool idea
| itrummer wrote:
| Thank you :-)
| satisfice wrote:
| How is it tested?
| itrummer wrote:
| We use mocking to replace actual LLM calls when testing for the
| correctness of the ThalamusDB code. In terms of performance
| benchmarking, we ran quite a few experiments measuring time,
| costs (fees for LLM calls), and result accuracy. The latter one
| is the hardest to evaluate since we need to compare the
| ThalamusDB results to the ground truth. Often, we used data
| sets from Kaggle that come with manual labels (e.g., camera
| trap pictures labeled with the animal species, then we can get
| ground truth for test queries that count the number of pictures
| showing specific animals).
| AmazingTurtle wrote:
| You say it's a DB; given the execution time of up to 600s per
| query, I say: it's an agent.
| itrummer wrote:
| Well, it definitely goes beyond a traditional DBMS, but yes :-)
| Processing the same amount of data via SQL with LLM calls will be
| slower and more expensive than via pure SQL. Note that 600s is
| just the default timeout, though. Queries are typically much
| faster (and you can set the timeout to whatever you like;
| ThalamusDB will return the best result approximation it can find
| by the timeout). More details in the documentation:
| https://itrummer.github.io/thalamusdb/thalamusdb.html
| petre wrote:
| Seems like a good tool for police work.
| ilaksh wrote:
| Does this use CLIP or something to get embeddings for each image
| and normal text embeddings for the text fields, and then feed the
| top N results to a VLM (LLM) to select the best answer(s)?
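| (For reference, a toy sketch of the RAG-style top-N retrieval this
| question describes, with made-up embeddings standing in for CLIP
| vectors; a real pipeline would pass the top hits on to a VLM.)

```python
# Toy top-N retrieval by cosine similarity. The stored vectors are
# illustrative stand-ins for CLIP / text-embedding outputs.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_n(query_vec, store, n=2):
    """store: {item_id: embedding}; returns the n closest item ids."""
    return sorted(store, key=lambda k: cosine(query_vec, store[k]),
                  reverse=True)[:n]

store = {"img_a": [1.0, 0.1], "img_b": [0.0, 1.0], "img_c": [0.9, 0.2]}
print(top_n([1.0, 0.0], store))  # ['img_a', 'img_c']
```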
|
| What's the advantage of this over using llamaindex?
|
| Although, even asking that question, I'll be honest: the last
| thing I used llamaindex for, it seemed everything had to be
| shoehorned in, as if using that library was a foregone
| conclusion, even though ChromaDB was doing just about all the
| work in the end, because the built-in vector store that
| llamaindex ships has strangely bad performance at any scale.
|
| I do like how simple the llamaindex DocumentStore or whatever is
| where you can just point it at a directory. But it seems when
| using a specific vectordb you often can't do that.
|
| I guess the other thing people do is put everything in postgres.
| Do people use pgvector to store image embeddings?
| bobosha wrote:
| We use a vector db (Qdrant) to store embeddings of images and
| text and built a search UI atop it.
| ilaksh wrote:
| Cool. And the other person implies that the queries can
| search across all rows if necessary? For example if all
| images have people and the question is which images have the
| same people in them. Or are you talking about a different
| project?
| itrummer wrote:
| LlamaIndex relies heavily on RAG-style approaches, e.g., we're
| using items whose embedding vectors are close to the embedding
| vectors of the question (what you describe). RAG-style
| approaches work great if the answer depends only on a small
| part of the data, e.g., if the right answer can be extracted
| from a few top-N documents.
|
| It's less applicable if the answer cannot be extracted from a
| small data subset. E.g., you want to count the number of
| pictures showing red cars in your database (rather than
| retrieving a few pictures of red cars). Or, let's say you want
| to tag beach holiday pictures with all the people who appear in
| them. That's another scenario where you cannot easily work with
| RAG. ThalamusDB supports such scenarios, e.g., you could use
| the query below in ThalamusDB:
|
| SELECT H.pic
| FROM HolidayPictures H, ProfilePictures P
| WHERE NLFILTER(H.pic, 'this is a picture of the beach')
| AND NLJOIN(H.pic, P.pic, 'the same person appears in both
| pictures');
|
| ThalamusDB handles scenarios where the LLM has to look at large
| data sets and uses a few techniques to make that more
| efficient. E.g., see here (https://arxiv.org/abs/2510.08489)
| for the implementation of the semantic join algorithm.
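| To make the semantics concrete, here is a naive sketch of what an
| NLJOIN computes: every pair of rows from the two relations is
| checked against the natural-language predicate (a stub stands in
| for the LLM call). The paper above is about avoiding exactly this
| quadratic pair-by-pair evaluation, so this shows only the
| semantics, not ThalamusDB's algorithm.

```python
# Naive semantic join: keep pairs for which an LLM-style predicate
# holds. The predicate here is a deterministic stub for illustration.

def llm_predicate(left: str, right: str, condition: str) -> bool:
    """Stub for an LLM call, e.g. 'the same person appears in both'."""
    # Toy heuristic: filenames share a person prefix before '_'.
    return left.split("_")[0] == right.split("_")[0]

def nl_join(left_rows, right_rows, condition):
    # Evaluates all |L| x |R| pairs -- the cost a real semantic join
    # algorithm tries hard to avoid.
    return [(l, r) for l in left_rows for r in right_rows
            if llm_predicate(l, r, condition)]

beach = ["alice_beach.jpg", "bob_beach.jpg"]
profiles = ["alice_profile.jpg", "carol_profile.jpg"]
print(nl_join(beach, profiles, "the same person appears in both"))
# [('alice_beach.jpg', 'alice_profile.jpg')]
```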
|
| A few other things to consider:
|
| 1) ThalamusDB supports SQL with semantic operators. Lay users
| may prefer the natural language query interfaces offered by
| other frameworks. But people who are familiar with SQL might
| prefer writing SQL-style queries for maximum precision.
|
| 2) ThalamusDB offers various ways to restrict the per-query
| processing overheads, e.g., time and token limits. If the limit
| is reached, it actually returns a partial result (e.g., lower
| and upper bounds for query aggregates, subsets of result rows
| ...). Other frameworks do not return anything useful if query
| processing is interrupted before it's complete.
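| A toy illustration of such bounded partial results for a COUNT
| aggregate (illustrative only, not ThalamusDB's implementation):
| items never evaluated before the budget runs out widen the gap
| between the lower and upper bound.

```python
# Partial COUNT with lower/upper bounds: stop after `budget` LLM
# evaluations; unevaluated items could each still match.

def bounded_count(items, predicate, budget):
    """Evaluate at most `budget` items; return (lower, upper) bounds
    on the true count of items satisfying `predicate`."""
    confirmed = 0
    evaluated = 0
    for it in items:
        if evaluated == budget:
            break
        if predicate(it):
            confirmed += 1
        evaluated += 1
    unknown = len(items) - evaluated  # items never sent to the LLM
    return confirmed, confirmed + unknown

pics = ["red_car.jpg", "blue_car.jpg", "red_car2.jpg",
        "tree.jpg", "red_car3.jpg"]
is_red_car = lambda p: p.startswith("red")  # stand-in for an LLM filter
print(bounded_count(pics, is_red_car, budget=3))  # (2, 4)
```

With an exhausted budget the bounds collapse to the exact count.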
| catlifeonmars wrote:
| Dumb question: why is this its own DB vs being a Postgres
| extension (for example).
| cyanydeez wrote:
| Bizarre coding solutions that require OPENAI
___________________________________________________________________
(page generated 2025-10-11 23:02 UTC)