[HN Gopher] Java Panama Vector API Integrated with Apache Lucene
___________________________________________________________________
Java Panama Vector API Integrated with Apache Lucene
Author : kurhan
Score : 27 points
Date : 2023-05-27 18:23 UTC (4 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| srajabi wrote:
| A little hard to understand why this is cool, but if I understand
| correctly:
|
| 1. Lucene is trying to get Approximate Nearest Neighbours (ANN)
| search working for semantic search purposes:
| https://issues.apache.org/jira/browse/LUCENE-9004
| https://github.com/apache/lucene/issues/10047
|
| 2. The Panama Vector API allows CPUs that support it to
| accelerate vector operations: https://openjdk.org/jeps/438
|
| So this allows fast ANN on Lucene for semantic search!
|
| How did people do this before Lucene supported it? Only through
| entirely different tools?
| dzaima wrote:
| A little confusing because "vector" here (largely) refers to
| two different things. "Vector search" being this ANN thing, but
| the "Vector API" is about SIMD. SIMD provides CPU operations on
| a bunch of data at a time, i.e. instead of one instruction for
| each 32-bit float, you operate on, depending on the CPU, 128 or
| 256 or 512 bits worth of floats at the same time. So, over
| scalar code, SIMD here could get maybe a 4-16x improvement
| (give or take _a lot_ - things here are pretty complicated).
| So, while definitely a significant change, I wouldn't say it's
| at the make-or-break level.
| [deleted]
| gst wrote:
| As an add-on to this comment: There's another Lucene issue from
| 2 weeks ago that provides some more details on different
| approaches that were considered:
| https://github.com/apache/lucene/issues/12302
| jillesvangurp wrote:
| The issue you link to was marked resolved in 2020. That's indeed
| what OpenSearch and Elasticsearch are using. Solr too, probably
| (not sure).
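
To make the lane arithmetic in dzaima's comment concrete, here is a minimal Java sketch. It only shows where the "4-16x" figure comes from (128, 256, or 512 bits divided by a 32-bit float) and emulates the SIMD access pattern with plain scalar loops. It does not use the incubating jdk.incubator.vector module and is not Lucene's implementation; the class and method names (SimdLanes, lanesPerRegister, dotScalar, dotChunked) are hypothetical, chosen for illustration.

```java
public class SimdLanes {
    // Number of elements ("lanes") that fit in one SIMD register.
    static int lanesPerRegister(int registerBits, int elementBits) {
        return registerBits / elementBits;
    }

    // Plain scalar dot product: one multiply-add per loop iteration.
    static float dotScalar(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Scalar emulation of the SIMD pattern: each outer iteration handles
    // `lanes` elements, one per accumulator slot. Real SIMD hardware would
    // perform the inner loop as a single instruction; this version does not.
    static float dotChunked(float[] a, float[] b, int lanes) {
        float[] acc = new float[lanes];
        int bound = a.length - (a.length % lanes); // largest multiple of lanes
        for (int i = 0; i < bound; i += lanes) {
            for (int l = 0; l < lanes; l++) {
                acc[l] += a[i + l] * b[i + l];
            }
        }
        // Reduce the per-lane accumulators to a single value.
        float sum = 0f;
        for (float v : acc) {
            sum += v;
        }
        // Scalar tail for the elements that did not fill a whole chunk.
        for (int i = bound; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        for (int bits : new int[] {128, 256, 512}) {
            System.out.println(bits + "-bit register: "
                    + lanesPerRegister(bits, 32) + " float lanes");
        }
    }
}
```

The lane count is only an upper bound on the speedup. Memory bandwidth, the final reduction, and the scalar tail all eat into it, which fits the "give or take a lot" caveat in the comment.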
___________________________________________________________________
(page generated 2023-05-27 23:00 UTC)