[HN Gopher] The other hard retrieval problems
       ___________________________________________________________________
        
       The other hard retrieval problems
        
       Author : softwaredoug
       Score  : 18 points
       Date   : 2024-03-25 13:48 UTC (1 days ago)
        
 (HTM) web link (softwaredoug.com)
 (TXT) w3m dump (softwaredoug.com)
        
       | esafak wrote:
       | What I got out of this is that you need hybrid search, and that
       | involves reranking. I agree, and do not find the point
       | controversial. Is there some thesis beyond this? I think the
       | author needs to elaborate.
       | 
       | Since we're here, what libraries are y'all using for reranking in
       | hybrid search these days? Anybody doing personalization or
       | contextualization?
        
       | pmc00 wrote:
       | Here's a bit of a quantification of the point Doug makes. Indeed,
       | for a number of scenarios you get better results if you combine
       | vector search and keyword search into a hybrid retrieval step,
       | and do reranking on top of that.
       | 
       | https://techcommunity.microsoft.com/t5/ai-azure-ai-services-...
       | 
       | (disclaimer: I work in that team)
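
A minimal sketch of the hybrid step described above, assuming reciprocal rank fusion (RRF) as the way to merge the keyword and vector result lists. The comment does not say which fusion method is used, so RRF is an assumption here, and the function name, document ids, and `k=60` constant are illustrative only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids into one list, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Each list contributes 1 / (k + rank) for every doc it returns.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from the two retrievers.
keyword_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. from a BM25 index
vector_hits = ["doc_c", "doc_a", "doc_d"]   # e.g. from an embedding index

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

A reranker would then rescore only the top of `fused`, which keeps the expensive model off the bulk of the candidates.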
        
       | peter_l_downs wrote:
       | A quotidian viewpoint that everyone working in search has agreed
       | with for years, but nice to see it written down. Embedding-only
       | search has never survived its first contact with an actual user.
        
       | ganzuul wrote:
       | If you ensemble a bunch of optimizations you get something that
       | behaves like a neuron.
       | 
       | Neurons approximate the generator of the layers of 'blurs': the
       | distribution.
       | 
       | Such generators have to have separate origins to be truly
       | orthogonal. E.g. evolutionary algorithms for rodents, particle
       | swarm for tool use. If you have orthogonality in your evolution
       | generator, you are not modeling evolution.
       | 
       | To my intuition, when you are making orthogonal insertions into
       | data with these constraints you are operating in a regime where
       | computational complexity has exploded several times over what we
       | are used to. Non-spoofed squirrels with screwdrivers would be
       | extremely rare data so having an encoding scheme which allows for
       | that would be a weird flex.
       | 
       | Thoughts still baking...
        
       | brazzy wrote:
       | Yeah, you're not the first person to notice the Leopard Print Sofa
       | Problem:
       | https://redsails.org/suddenly-a-leopard-print-sofa-appears/
        
       ___________________________________________________________________
       (page generated 2024-03-26 23:01 UTC)