[HN Gopher] ThalamusDB: Query text, tables, images, and audio
___________________________________________________________________
ThalamusDB: Query text, tables, images, and audio
Author : itrummer
Score : 50 points
Date : 2025-10-07 19:34 UTC (4 days ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| tarwich wrote:
| What a cool idea
| itrummer wrote:
| Thank you :-)
| satisfice wrote:
| How is it tested?
| itrummer wrote:
| We use mocking to replace actual LLM calls when testing for the
| correctness of the ThalamusDB code. In terms of performance
| benchmarking, we ran quite a few experiments measuring time,
| costs (fees for LLM calls), and result accuracy. The latter one
| is the hardest to evaluate since we need to compare the
| ThalamusDB results to the ground truth. Often, we used data
| sets from Kaggle that come with manual labels (e.g., camera
| trap pictures labeled with the animal species, then we can get
| ground truth for test queries that count the number of pictures
| showing specific animals).
| AmazingTurtle wrote:
| You say it's a DB; given the execution time of up to 600s per
| query, I say: it's an agent.
| itrummer wrote:
| Well, it definitely goes beyond a traditional DBMS, but yes :-)
| Processing the same amount of data via SQL with LLM calls will be
| slower and more expensive than via pure SQL. Note that 600s is
| just the default timeout, though. Queries are typically much
| faster (and you can set the timeout to whatever you like;
| ThalamusDB will return the best result approximation it can find
| by the timeout). More details in the documentation:
| https://itrummer.github.io/thalamusdb/thalamusdb.html
| petre wrote:
| Seems like a good tool for police work.
| ilaksh wrote:
| Does this use CLIP or something to get embeddings for each image
| and normal text embeddings for the text fields, and then feed the
| top N results to a VLM (LLM) to select the best answer(s)?
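| (For reference, a toy sketch of the RAG-style top-N retrieval this
| question describes, with made-up embeddings standing in for CLIP
| vectors; a real pipeline would pass the top hits on to a VLM.)

```python
# Toy top-N retrieval by cosine similarity. The stored vectors are
# illustrative stand-ins for CLIP / text-embedding outputs.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_n(query_vec, store, n=2):
    """store: {item_id: embedding}; returns the n closest item ids."""
    return sorted(store, key=lambda k: cosine(query_vec, store[k]),
                  reverse=True)[:n]

store = {"img_a": [1.0, 0.1], "img_b": [0.0, 1.0], "img_c": [0.9, 0.2]}
print(top_n([1.0, 0.0], store))  # ['img_a', 'img_c']
```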
|
| What's the advantage of this over using llamaindex?
|
| Although, even asking that question, I'll be honest: the last
| thing I used llamaindex for, it seemed everything had to be
| shoehorned in, as if using that library was a foregone
| conclusion, even though ChromaDB was doing just about all the
| work in the end, because the built-in vector store that
| llamaindex ships has strangely bad performance at any scale.
|
| I do like how simple the llamaindex DocumentStore or whatever is
| where you can just point it at a directory. But it seems when
| using a specific vectordb you often can't do that.
|
| I guess the other thing people do is put everything in postgres.
| Do people use pgvector to store image embeddings?
| bobosha wrote:
| We use a vector db (Qdrant) to store embeddings of images and
| text and built a search UI atop it.
| ilaksh wrote:
| Cool. And the other person implies that the queries can
| search across all rows if necessary? For example if all
| images have people and the question is which images have the
| same people in them. Or are you talking about a different
| project?
| itrummer wrote:
| LlamaIndex relies heavily on RAG-style approaches, e.g., we're
| using items whose embedding vectors are close to the embedding
| vectors of the question (what you describe). RAG-style
| approaches work great if the answer depends only on a small
| part of the data, e.g., if the right answer can be extracted
| from a few top-N documents.
|
| It's less applicable if the answer cannot be extracted from a
| small data subset. E.g., you want to count the number of
| pictures showing red cars in your database (rather than
| retrieving a few pictures of red cars). Or, let's say you want
| to tag beach holiday pictures with all the people who appear in
| them. That's another scenario where you cannot easily work with
| RAG. ThalamusDB supports such scenarios, e.g., you could use
| the query below in ThalamusDB:
|
| SELECT H.pic
| FROM HolidayPictures H, ProfilePictures P
| WHERE NLFILTER(H.pic, 'this is a picture of the beach')
| AND NLJOIN(H.pic, P.pic, 'the same person appears in both
| pictures');
|
| ThalamusDB handles scenarios where the LLM has to look at large
| data sets and uses a few techniques to make that more
| efficient. E.g., see here (https://arxiv.org/abs/2510.08489)
| for the implementation of the semantic join algorithm.
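| To make the semantics concrete, here is a naive sketch of what an
| NLJOIN computes: every pair of rows from the two relations is
| checked against the natural-language predicate (a stub stands in
| for the LLM call). The paper above is about avoiding exactly this
| quadratic pair-by-pair evaluation, so this shows only the
| semantics, not ThalamusDB's algorithm.

```python
# Naive semantic join: keep pairs for which an LLM-style predicate
# holds. The predicate here is a deterministic stub for illustration.

def llm_predicate(left: str, right: str, condition: str) -> bool:
    """Stub for an LLM call, e.g. 'the same person appears in both'."""
    # Toy heuristic: filenames share a person prefix before '_'.
    return left.split("_")[0] == right.split("_")[0]

def nl_join(left_rows, right_rows, condition):
    # Evaluates all |L| x |R| pairs -- the cost a real semantic join
    # algorithm tries hard to avoid.
    return [(l, r) for l in left_rows for r in right_rows
            if llm_predicate(l, r, condition)]

beach = ["alice_beach.jpg", "bob_beach.jpg"]
profiles = ["alice_profile.jpg", "carol_profile.jpg"]
print(nl_join(beach, profiles, "the same person appears in both"))
# [('alice_beach.jpg', 'alice_profile.jpg')]
```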
|
| A few other things to consider:
|
| 1) ThalamusDB supports SQL with semantic operators. Lay users
| may prefer the natural language query interfaces offered by
| other frameworks. But people who are familiar with SQL might
| prefer writing SQL-style queries for maximum precision.
|
| 2) ThalamusDB offers various ways to restrict the per-query
| processing overheads, e.g., time and token limits. If the limit
| is reached, it actually returns a partial result (e.g., lower
| and upper bounds for query aggregates, subsets of result rows
| ...). Other frameworks do not return anything useful if query
| processing is interrupted before it's complete.
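| A toy illustration of such bounded partial results for a COUNT
| aggregate (illustrative only, not ThalamusDB's implementation):
| items never evaluated before the budget runs out widen the gap
| between the lower and upper bound.

```python
# Partial COUNT with lower/upper bounds: stop after `budget` LLM
# evaluations; unevaluated items could each still match.

def bounded_count(items, predicate, budget):
    """Evaluate at most `budget` items; return (lower, upper) bounds
    on the true count of items satisfying `predicate`."""
    confirmed = 0
    evaluated = 0
    for it in items:
        if evaluated == budget:
            break
        if predicate(it):
            confirmed += 1
        evaluated += 1
    unknown = len(items) - evaluated  # items never sent to the LLM
    return confirmed, confirmed + unknown

pics = ["red_car.jpg", "blue_car.jpg", "red_car2.jpg",
        "tree.jpg", "red_car3.jpg"]
is_red_car = lambda p: p.startswith("red")  # stand-in for an LLM filter
print(bounded_count(pics, is_red_car, budget=3))  # (2, 4)
```

With an exhausted budget the bounds collapse to the exact count.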
| catlifeonmars wrote:
| Dumb question: why is this its own DB vs being a Postgres
| extension (for example).
| cyanydeez wrote:
| Bizarre coding solutions that require OPENAI
___________________________________________________________________
(page generated 2025-10-11 23:02 UTC)