[HN Gopher] Product quantization for vector search
       ___________________________________________________________________
        
       Product quantization for vector search
        
       Author : fzliu
       Score  : 46 points
       Date   : 2023-05-19 17:13 UTC (5 hours ago)
        
 (HTM) web link (zilliz.com)
 (TXT) w3m dump (zilliz.com)
        
       | jeffchuber wrote:
       | Be careful with PQ: streaming updates will degrade recall a lot
       | (and in ways you may miss).
        
         | fzliu wrote:
         | A naive PQ index that trains the quantizer once and reuses the
         | same codebook for recent data definitely has this problem. We
         | (Milvus/Zilliz) break up all the vectors into multiple indexes
         | at certain checkpoints which we call segments. This doesn't
         | fully solve the problem of degraded performance with PQ or
         | hybrid PQ indexes, but certainly helps mitigate it.
         | 
         | More info here: https://milvus.io/docs/glossary.md#Segment
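
A minimal sketch of the effect discussed above, not Milvus's actual
implementation: a toy PQ quantizer is trained once on "old" data, then
reused on data whose distribution has drifted, and compared with a
codebook retrained on the newer batch (roughly what a per-segment index
would give you). All function names, the Gaussian toy data, and the
mean-shift model of drift are assumptions made for illustration.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            buckets[j].append(p)
        for j, bucket in enumerate(buckets):
            if bucket:  # keep the old centroid if a cluster went empty
                centroids[j] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids

def train_pq(vectors, m, k):
    """Train one k-means codebook per subspace (m subspaces in total)."""
    d = len(vectors[0]) // m
    return [kmeans([v[i * d:(i + 1) * d] for v in vectors], k, seed=i)
            for i in range(m)]

def encode(v, codebooks):
    """Replace each subvector with the index of its nearest centroid."""
    d = len(v) // len(codebooks)
    return [min(range(len(cb)), key=lambda c: sq_dist(v[i * d:(i + 1) * d], cb[c]))
            for i, cb in enumerate(codebooks)]

def decode(code, codebooks):
    """Concatenate the chosen centroids back into a full-length vector."""
    return [x for i, c in enumerate(code) for x in codebooks[i][c]]

def mean_error(vectors, codebooks):
    """Average squared reconstruction error over a set of vectors."""
    total = sum(sq_dist(v, decode(encode(v, codebooks), codebooks)) for v in vectors)
    return total / len(vectors)

def gaussian_vectors(n, dim, mean, rng):
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]

rng = random.Random(42)
old_data = gaussian_vectors(300, 8, 0.0, rng)  # data seen at training time
new_data = gaussian_vectors(300, 8, 3.0, rng)  # later data, mean has drifted

stale = train_pq(old_data, m=4, k=8)  # trained once, never updated
fresh = train_pq(new_data, m=4, k=8)  # retrained on the newer batch

err_old = mean_error(old_data, stale)
err_new_stale = mean_error(new_data, stale)
err_new_fresh = mean_error(new_data, fresh)
print("old data, stale codebook:", err_old)
print("new data, stale codebook:", err_new_stale)
print("new data, fresh codebook:", err_new_fresh)
```

Higher reconstruction error generally means worse recall, since
distances are computed against the reconstructed vectors. Retraining
per batch limits the damage only for data that arrives before the next
checkpoint, which matches the "mitigate, not solve" caveat above.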
        
       | jadbox wrote:
       | Is there a product quantization indexer plugin for Postgres?
        
         | fzliu wrote:
         | AFAIK pgvector only supports IVF (IVFFlat), with no PQ or HNSW.
        
       | fzliu wrote:
       | For folks interested in the rest of our vector search 101 series,
       | here's a link: https://zilliz.com/blog?tag=39&page=1
        
       | pbadams wrote:
       | There are a lot of directions people try to go, making different
       | tradeoffs in the complexity of the clustering, the loss from the
       | quantization, the impact on performance (esp. trying to get some
       | subset of the tables to fit in cache). Readers might be
       | interested in [1], which gives a survey of some of the
       | directions.
       | 
       | In general, though, PQ is a pretty good baseline. I'm glad all
       | these vector DB companies seem to have decided that the best form
       | of marketing is high-quality summaries/tutorials about
       | fundamental concepts; it's a good contribution to the community.
       | 
       | [1] Fig. 1 in
       | https://www.jstage.jst.go.jp/article/mta/6/1/6_2/_pdf (2018)
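
To make the "tables fit in cache" point concrete, here is a rough sketch
of asymmetric distance computation (ADC), the usual way PQ codes are
scanned: per query, one small table of subvector-to-centroid distances
is built, and each stored code is then scored with M table lookups. The
sizes (M=8, K=256, 4 dims per subspace), the function names, and the
random stand-in codebooks (used here instead of trained k-means
centroids) are all assumptions for illustration.

```python
import random

M, K, SUB_DIM = 8, 256, 4  # assumed: 8 subspaces, 256 centroids each, 4 dims each

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

rng = random.Random(0)
# Stand-in codebooks: random centroids rather than real k-means output.
codebooks = [[[rng.gauss(0, 1) for _ in range(SUB_DIM)] for _ in range(K)]
             for _ in range(M)]

def build_tables(query):
    """tables[i][c] = squared distance from the query's i-th subvector to centroid c."""
    return [[sq_dist(query[i * SUB_DIM:(i + 1) * SUB_DIM], centroid)
             for centroid in codebooks[i]]
            for i in range(M)]

def adc_distance(code, tables):
    """Score one stored code with M lookups, never touching the original vector."""
    return sum(tables[i][c] for i, c in enumerate(code))

def decode(code):
    """Reconstruct the approximate vector a code stands for."""
    return [x for i, c in enumerate(code) for x in codebooks[i][c]]

query = [rng.gauss(0, 1) for _ in range(M * SUB_DIM)]
code = [rng.randrange(K) for _ in range(M)]  # one stored vector = M small integers

tables = build_tables(query)
approx = adc_distance(code, tables)
exact_to_reconstruction = sq_dist(query, decode(code))

table_bytes = M * K * 4  # if each table entry were stored as a float32
print("ADC distance:", approx)
print("distance to reconstructed vector:", exact_to_reconstruction)
print("lookup table size in bytes:", table_bytes)
```

With these assumed sizes the per-query table is 8 * 256 * 4 = 8192
bytes, small enough to sit in a typical L1 data cache. Larger M or K
improves accuracy but grows the table, which is one of the tradeoffs
mentioned above.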
        
         | fzliu wrote:
         | Just read the abstract and intro for the link you posted -
         | thanks for sharing.
         | 
         | As great as HNSW and other graph-based indexes are, I think PQ
         | and other encoder/decoder-based methods are still incredibly
         | important for ANN search in general. In particular, it should
         | be possible to learn some sort of joint encoding with neural
         | networks targeted towards different modalities.
        
       ___________________________________________________________________
       (page generated 2023-05-19 23:01 UTC)