[HN Gopher] Vector DB with no network latency - SQLite
       ___________________________________________________________________
        
       Vector DB with no network latency - SQLite
        
       Author : elamje
       Score  : 12 points
       Date   : 2023-07-06 18:26 UTC (4 hours ago)
        
 (HTM) web link (twitter.com)
 (TXT) w3m dump (twitter.com)
        
       | pedrovhb wrote:
       | I see the appeal and I'd totally consider using SQLite as a
       | vector store with the proper extensions/support (I'd imagine this
       | exists; does it?), but the code shown there really isn't an
       | apples-to-apples comparison, is it? Every query fetches all
       | vectors, deserializes each from JSON, allocates memory, and
       | instantiates them as NumPy arrays, and then does an O(n) search
       | for cosine similarity on embeddings which, in the example,
       | aren't normalized. At this point, network latency for a
       | (presumably loopback) gRPC call isn't what I'd be concerned
       | with. There's really no reason to use SQLite at all in this
       | case; just keep everything in memory and save state to disk if
       | that's what you care about.
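
       A minimal sketch of the points raised above, not the code from the
       linked post: vectors are normalized once at insert time, stored as
       packed float32 BLOBs instead of JSON, and loaded into memory a single
       time so each query is only a dot product. The table name `embeddings`
       and the helpers `normalize`, `store`, `load_all`, and `top_k` are
       hypothetical names chosen for illustration. The search is still O(n);
       only the per-query fetch and deserialization work is removed.

```python
import math
import sqlite3
from array import array


def normalize(vec):
    """Return vec scaled to unit length as a list of floats."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def store(conn, doc_id, vec):
    """Store a unit-length vector as a packed float32 BLOB (no JSON)."""
    blob = array("f", normalize(vec)).tobytes()
    conn.execute("INSERT INTO embeddings (id, vec) VALUES (?, ?)", (doc_id, blob))


def load_all(conn):
    """Read every vector once; the caller keeps the result in memory."""
    rows = conn.execute("SELECT id, vec FROM embeddings ORDER BY id").fetchall()
    index = []
    for doc_id, blob in rows:
        vec = array("f")
        vec.frombytes(blob)
        index.append((doc_id, vec))
    return index


def top_k(index, query, k=1):
    """Brute-force search; on unit vectors the dot product is the cosine."""
    q = normalize(query)
    scored = [(sum(a * b for a, b in zip(vec, q)), doc_id) for doc_id, vec in index]
    scored.sort(reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:k]]


# Hypothetical toy data: three 3-dimensional embeddings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE embeddings (id INTEGER PRIMARY KEY, vec BLOB NOT NULL)")
store(conn, 1, [1.0, 0.0, 0.0])
store(conn, 2, [0.0, 2.0, 0.0])
store(conn, 3, [3.0, 3.0, 0.0])
conn.commit()

index = load_all(conn)                      # done once, not per query
best = top_k(index, [0.9, 0.1, 0.0], k=2)   # ids of the two closest vectors
```

       In this arrangement SQLite only provides durability; the hot path
       never touches it, which is the substance of the objection above.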
        
         | elamje wrote:
         | This was a 10-minute proof of concept! There are so many
         | optimizations I'll do on the next iteration, but the idea is
         | that people are reaching for databases that are overkill and
         | add network latency that can be avoided.
        
       | scotty79 wrote:
       | Why SQLite? You can write an array to a file directly. They read
       | the whole dataset each time they want to find the closest match.
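
       One way to read "write an array to a file directly", sketched with
       only the Python standard library: the vectors are flattened into a
       single float32 array and dumped as raw bytes, then read back in one
       call. The fixed dimension `DIM`, the file name `vectors.f32`, and the
       helpers `save_vectors` and `load_vectors` are assumptions made for
       this example, and the raw format carries no header, so the reader has
       to know the dimension and byte order in advance.

```python
import os
import tempfile
from array import array

DIM = 4  # hypothetical embedding dimension, fixed for the whole file


def save_vectors(path, vectors):
    """Write all vectors back to back as raw float32 values."""
    flat = array("f")
    for vec in vectors:
        if len(vec) != DIM:
            raise ValueError("every vector must have DIM components")
        flat.extend(vec)
    with open(path, "wb") as f:
        flat.tofile(f)


def load_vectors(path):
    """Read the whole file in one call and slice it into DIM-sized vectors."""
    count = os.path.getsize(path) // (DIM * array("f").itemsize)
    flat = array("f")
    with open(path, "rb") as f:
        flat.fromfile(f, count * DIM)
    return [flat[i * DIM:(i + 1) * DIM] for i in range(count)]


path = os.path.join(tempfile.mkdtemp(), "vectors.f32")
save_vectors(path, [[1.0, 0.0, 0.0, 0.0], [0.5, 0.5, 0.5, 0.5]])
loaded = load_vectors(path)  # load once at startup, then query in memory
```

       Loading the file once at startup and keeping `loaded` around avoids
       re-reading the whole dataset on every lookup.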
        
       ___________________________________________________________________
       (page generated 2023-07-06 23:03 UTC)