[HN Gopher] Building a custom code search index in Go
       ___________________________________________________________________
        
       Building a custom code search index in Go
        
       Author : boyter
       Score  : 74 points
       Date   : 2022-11-23 10:17 UTC (12 hours ago)
        
 (HTM) web link (boyter.org)
 (TXT) w3m dump (boyter.org)
        
       | anxiously wrote:
       | Is fitting the index in ram really that important? Obviously it
       | is fast, but if you can get away with storing it on a fast disk
       | like an nvme gen4 then why not?
        
         | GaryNumanVevo wrote:
         | Extremely important, search indexes are cache optimized. Live
         | update indexes even more so.
        
       | alecthomas wrote:
       | Great read, but a basic search yields zero results:
       | 
       | https://searchcode.com/?q=kong.New+lang%3Ago
       | 
       | Same search on GitHub:
       | 
       | https://github.com/search?type=code&q=kong.New
       | 
       | Edit: there also doesn't seem to be any ranking at all, such as
       | exact word matches being boosted
        
         | alecthomas wrote:
         | > But let me know where it's not doing what you expect and I'll
         | fix it.
         | 
         | I would expect > 0 results for the above search
         | 
         | Same search on SourceGraph:
         | https://sourcegraph.com/search?q=context:global+kong.New&pat...
        
           | boyter wrote:
           | Ah I see a vanity search! The ones that always cause issues
           | :)
           | 
           | It's just down to it not being in the index, I shall ensure I
           | add it just for you based on this.
           | 
           | Done. That repository https://github.com/alecthomas/kong will
           | get picked up when I kick off the indexing again (sometime
           | next week once all the activity dies down)
        
         | boyter wrote:
         | Probably because I don't prioritise GitHub anymore. Their own
         | search is great, but it might get picked up eventually.
         | 
         | There is ranking, first by a pre rank popularity of the
         | repository and secondly by tf/idf of the trigrams. It's
         | weighted towards longer matches in the display as well.
         | 
         | But let me know where it's not doing what you expect and I'll
         | fix it.
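
The two-stage ordering described in the comment above (repository popularity first, then tf/idf over trigrams) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not searchcode's actual code: the `doc` type, its `Popularity` field, the smoothed idf formula, and the sample data are all hypothetical, and the display-side weighting towards longer matches is not modelled.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"strings"
)

// doc is a hypothetical indexed file. Popularity stands in for the
// repository's precomputed rank mentioned in the comment.
type doc struct {
	Name       string
	Popularity int
	Text       string
}

// trigrams returns every overlapping 3-byte window of s.
func trigrams(s string) []string {
	var out []string
	for i := 0; i+3 <= len(s); i++ {
		out = append(out, s[i:i+3])
	}
	return out
}

// tfidf sums tf*idf over the query's trigrams for one document.
// The idf smoothing used here is an assumption, not taken from the source.
func tfidf(query string, d doc, docs []doc) float64 {
	counts := map[string]int{}
	for _, t := range trigrams(d.Text) {
		counts[t]++
	}
	score := 0.0
	for _, t := range trigrams(query) {
		df := 0 // number of documents containing this trigram
		for _, o := range docs {
			if strings.Contains(o.Text, t) {
				df++
			}
		}
		if df == 0 {
			continue
		}
		idf := math.Log(1 + float64(len(docs))/float64(df))
		score += float64(counts[t]) * idf
	}
	return score
}

// rank drops documents that share no trigram with the query, then orders
// the rest by popularity first and tf-idf score second.
func rank(query string, docs []doc) []doc {
	type scored struct {
		d doc
		s float64
	}
	var hits []scored
	for _, d := range docs {
		if s := tfidf(query, d, docs); s > 0 {
			hits = append(hits, scored{d, s})
		}
	}
	sort.SliceStable(hits, func(i, j int) bool {
		if hits[i].d.Popularity != hits[j].d.Popularity {
			return hits[i].d.Popularity > hits[j].d.Popularity
		}
		return hits[i].s > hits[j].s
	})
	out := make([]doc, 0, len(hits))
	for _, h := range hits {
		out = append(out, h.d)
	}
	return out
}

func main() {
	docs := []doc{
		{"a.go", 1, "kong.New(&cli)"},
		{"b.go", 5, "parser := kong.New(&cli); kong.New(&other)"},
		{"c.go", 5, "fmt.Println(\"hello\")"},
	}
	for _, d := range rank("kong.New", docs) {
		fmt.Println(d.Name)
	}
}
```

With this sample data, `c.go` is filtered out because it shares no trigram with the query, and `b.go` sorts ahead of `a.go` on popularity alone. The tf/idf score only breaks ties between equally popular repositories.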
        
       | hu3 wrote:
       | Wow the search is screamingly fast. And it's custom made! I
       | enjoyed the writing. Thanks for taking the time to condense the
       | knowledge involved in words.
        
         | boyter wrote:
         | I really did try to make it as fast as I could. Always happy to
         | write it down too.
        
       | encryptluks2 wrote:
       | Congratulations to the author. It seems like they have excelled
       | at a very fast pace from using third party solutions to building
       | their own in a short time. I look forward to their progress and
       | seeing where this goes, to maybe becoming an excellent open
       | source copilot alternative.
        
         | boyter wrote:
         | I don't know about a fast pace, but I did have fun with it!
        
       | AlchemistCamp wrote:
       | Very cool to see this here, Ben! It was fun hearing the ins and
       | outs of your work on this in the TZ discord, and the final result
       | is _fast_.
       | 
       | Also, off-topic but as you know, I recently tried out your scc
       | tool and am eagerly awaiting its support for Elixir templates
       | (.eex, .heex)! You said it was a day from done a while back and
       | would go out in the next release. What's the release schedule
       | like?
       | 
       | https://github.com/boyter/scc
        
         | boyter wrote:
         | It's actually sitting on my hdd. Just need to finish it off. I
         | got sidetracked.
        
       | Darkskiez wrote:
       | See also https://codesearch.debian.net/ -
       | https://github.com/Debian/dcs for a similar project that may fit
       | your needs better. I've not compared them both, but I use dcs
       | frequently
        
         | boyter wrote:
         | The blog posts about it are great too. This one
         | https://michael.stapelberg.ch/posts/2019-09-29-dcs-positiona...
         | in particular.
        
       ___________________________________________________________________
       (page generated 2022-11-23 23:01 UTC)