[HN Gopher] Java Panama Vector API Integrated with Apache Lucene
       ___________________________________________________________________
        
       Java Panama Vector API Integrated with Apache Lucene
        
       Author : kurhan
       Score  : 27 points
       Date   : 2023-05-27 18:23 UTC (4 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | srajabi wrote:
       | A little hard to understand why this is cool, but if I understand
       | correctly:
       | 
       | 1. Lucene is trying to get Approximate Nearest Neighbours (ANN)
       | search working for semantic search purposes:
       | https://issues.apache.org/jira/browse/LUCENE-9004
       | https://github.com/apache/lucene/issues/10047
       | 
       | 2. The Panama Vector API allows CPUs that support it to
       | accelerate vector operations: https://openjdk.org/jeps/438
       | 
       | So this allows fast ANN on Lucene for semantic search!
       | 
       | How did people do this before Lucene supported it? Only through
       | entirely different tools?
        
         | dzaima wrote:
         | A little confusing because "vector" here (largely) refers to
         | two different things. "Vector search" being this ANN thing, but
         | the "Vector API" is about SIMD. SIMD provides CPU operations on
         | a bunch of data at a time, i.e. instead of one instruction for
         | each 32-bit float, you operate on, depending on the CPU, 128 or
         | 256 or 512 bits worth of floats at the same time. So, over
         | scalar code, SIMD here could get maybe a 4-16x improvement
         | (give or take _a lot_ - things here are pretty complicated).
          | So, while definitely a significant change, I wouldn't say it's
         | at the make-or-break level.
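
          The 4-16x figure above follows from lane counts: a 128-, 256-, or
          512-bit register holds 4, 8, or 16 32-bit floats. Below is a minimal
          sketch of that arithmetic, applied to a dot product as an assumed
          example of a vector-search inner loop. It is plain Java that only
          emulates SIMD lanes with an accumulator array; it does not use the
          jdk.incubator.vector API and is not Lucene's code. The class and
          method names are hypothetical.

```java
public class SimdLanesSketch {
    // Number of 32-bit floats that fit in one SIMD register of the given width.
    static int lanesPerRegister(int registerBits) {
        return registerBits / Float.SIZE; // Float.SIZE is 32
    }

    // Scalar baseline: one multiply-add per loop iteration.
    static float dotScalar(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Emulation of a SIMD-shaped loop (an illustration, not real SIMD):
    // each outer iteration handles `lanes` elements, one accumulator per lane.
    // On real hardware the inner loop would be a single vector instruction.
    static float dotLanes(float[] a, float[] b, int lanes) {
        float[] acc = new float[lanes];
        int bound = a.length - (a.length % lanes); // largest multiple of `lanes`
        int i = 0;
        for (; i < bound; i += lanes) {
            for (int l = 0; l < lanes; l++) {
                acc[l] += a[i + l] * b[i + l];
            }
        }
        // Horizontal reduction: add the per-lane partial sums together.
        float sum = 0f;
        for (float v : acc) {
            sum += v;
        }
        // Scalar tail for elements that did not fill a whole register.
        for (; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        for (int bits : new int[] {128, 256, 512}) {
            System.out.println(bits + "-bit register: "
                    + lanesPerRegister(bits) + " float lanes");
        }
        float[] a = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        float[] b = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1};
        System.out.println("scalar: " + dotScalar(a, b));
        System.out.println("lanes:  " + dotLanes(a, b, lanesPerRegister(128)));
    }
}
```

          The lane count is only an upper bound on the speedup. Memory
          bandwidth, the horizontal reduction, and the scalar tail all eat
          into it, which is the "give or take a lot" part. Per-lane
          accumulation also changes the order of the float additions, so on
          arbitrary inputs the result can differ slightly from the scalar sum.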
        
           | [deleted]
        
           | gst wrote:
            | As an add-on to this comment: there's another Lucene issue from
           | 2 weeks ago that provides some more details on different
           | approaches that were considered:
           | https://github.com/apache/lucene/issues/12302
        
       | jillesvangurp wrote:
       | The issue you link to was marked resolved in 2020. That's indeed
       | what OpenSearch and Elasticsearch are using. Solr too, probably
       | (not sure).
        
       ___________________________________________________________________
       (page generated 2023-05-27 23:00 UTC)