[HN Gopher] Vector DB with no network latency - SQLite
___________________________________________________________________
Vector DB with no network latency - SQLite
Author : elamje
Score : 12 points
Date : 2023-07-06 18:26 UTC (4 hours ago)
(HTM) web link (twitter.com)
(TXT) w3m dump (twitter.com)
| pedrovhb wrote:
| I see the appeal and I'd totally consider using SQLite as a
| vector store with the proper extensions/support (I'd imagine this
| exists; does it?), but the code shown there really isn't an
| apples-to-apples comparison, is it? Every query fetches all
| vectors, deserializes each from JSON, allocates memory, and
| instantiates them as numpy arrays, and then proceeds to do an
| O(n) search for cosine similarity on embeddings which, in the
| example, aren't normalized. At this point, network latency for a
| (presumably loopback) gRPC call isn't what I'd be concerned with.
| There's really no reason to use SQLite at all in this case; just
| keep everything in memory and save state to disk if that's what
| you care about.
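
A minimal sketch of the in-memory approach described in the comment above, assuming the embeddings fit in RAM as a single float32 matrix. The helper names (`normalize_rows`, `top_k`) and the toy data are hypothetical and are not taken from the linked code. Rows are normalized once up front, so each query is one matrix-vector product instead of a per-row JSON decode. The search is still an O(n) brute-force scan.

```python
import numpy as np

def normalize_rows(vectors):
    """Scale each row to unit length so a dot product equals cosine similarity."""
    vectors = np.asarray(vectors, dtype=np.float32)
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero rows unchanged instead of dividing by zero
    return vectors / norms

def top_k(index, query, k=3):
    """Return (row indices, scores) of the k rows most similar to query."""
    query = np.asarray(query, dtype=np.float32)
    query = query / (np.linalg.norm(query) or 1.0)
    scores = index @ query            # cosine similarity, since both sides are unit length
    k = min(k, len(scores))
    best = np.argsort(-scores)[:k]    # indices sorted by descending score
    return best, scores[best]

# Hypothetical toy data: three 2-dimensional "embeddings".
embeddings = normalize_rows([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
ids, scores = top_k(embeddings, [2.0, 0.1], k=2)
```

For larger collections, `np.argpartition` could replace the full sort, at the cost of returning the top k in arbitrary order.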
| elamje wrote:
| This was a 10-minute proof of concept! There are so many
| optimizations I'll do on the next iteration, but the idea is
| that people are reaching for databases that are overkill and
| add network latency that can be avoided.
| scotty79 wrote:
| Why SQLite? You can write an array to a file directly. They read
| the whole dataset each time they want to find the closest match.
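
A short sketch of writing the array to a file directly, assuming NumPy's `.npy` format is acceptable for storage. The file name and the random data are placeholders for illustration only. Loading with `mmap_mode="r"` maps the file rather than reading it all into memory up front, so the load happens once at startup and not on every query.

```python
import os
import tempfile

import numpy as np

# Placeholder data standing in for real embeddings: 1000 vectors of dimension 8.
vectors = np.random.default_rng(0).random((1000, 8), dtype=np.float32)

# Write the whole matrix to a single binary file.
path = os.path.join(tempfile.mkdtemp(), "embeddings.npy")
np.save(path, vectors)

# Map the file read-only; pages are pulled in lazily as rows are accessed.
loaded = np.load(path, mmap_mode="r")
```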
___________________________________________________________________
(page generated 2023-07-06 23:03 UTC)