[HN Gopher] How bloom filters made SQLite 10x faster
       ___________________________________________________________________
        
       How bloom filters made SQLite 10x faster
        
       Author : avinassh
       Score  : 130 points
       Date   : 2024-12-22 14:44 UTC (8 hours ago)
        
 (HTM) web link (avi.im)
        
       | dang wrote:
       | Related:
       | 
       |  _SQLite: Past, Present, and Future_ -
       | https://news.ycombinator.com/item?id=32675861 - Sept 2022 (143
       | comments)
        
       | ncruces wrote:
       | [flagged]
        
         | gpcz wrote:
         | Even if true, it seems like they're doing a pretty good job on
         | their own.
        
         | jpalawaga wrote:
         | SQLite is self-described as not open contribution. So yes by
         | their own measure they've made it more difficult to mainline
         | features (and intentionally so).
        
         | steve_gh wrote:
         | I submitted a bug report on SQLite a year or so back (a simple
         | test case only, not a solution). The folks were super nice, and
         | their patch went into the next release.
        
         | binary132 wrote:
         | Open contribution isn't a good in and of itself.
        
       | DaveMcMartin wrote:
       | SQLite is getting better and better. I am using it in production
       | for a bunch of websites and have never had a problem.
        
         | immibis wrote:
         | It should be fine for read-only data. If you want to write, be
         | aware that only one process can write at a time, and if you
         | forget to set busy_timeout at the start of the connection, it
         | defaults to zero milliseconds and you'll get an error if
         | another process has locked the database for writing while you
         | try to read or write it. Client-server databases tend to handle
         | concurrent writers better.
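
A minimal sketch of the busy_timeout point above, using Python's standard sqlite3 module. The helper names open_db and second_writer_fails_fast are hypothetical and exist only for illustration. Note that Python's sqlite3.connect() has its own timeout parameter that defaults to 5 seconds, so the sketch passes timeout=0 to mimic the zero-wait default of the SQLite library and then sets the pragma explicitly.

```python
import sqlite3


def open_db(path, busy_ms):
    """Open a connection and set busy_timeout first thing (hypothetical helper)."""
    # isolation_level=None puts the connection in autocommit mode, so the
    # transactions below are controlled with explicit BEGIN/COMMIT statements.
    # timeout=0 mimics the SQLite library default of not waiting on locks.
    conn = sqlite3.connect(path, timeout=0, isolation_level=None)
    # How long this connection waits on a locked database, in milliseconds.
    conn.execute(f"PRAGMA busy_timeout = {int(busy_ms)}")
    return conn


def second_writer_fails_fast(path):
    """Return True if a second writer with a zero timeout is rejected at once."""
    writer = open_db(path, 0)
    other = open_db(path, 0)
    writer.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
    writer.execute("BEGIN IMMEDIATE")  # take the write lock and hold it
    writer.execute("INSERT INTO t VALUES (1)")
    try:
        other.execute("BEGIN IMMEDIATE")  # second writer asks for the same lock
        other.execute("ROLLBACK")
        return False
    except sqlite3.OperationalError as exc:
        # With busy_timeout = 0 there is no retry; SQLite reports the lock.
        return "locked" in str(exc)
    finally:
        writer.execute("COMMIT")
        writer.close()
        other.close()
```

Calling open_db(path, 5000) instead makes a blocked connection retry for up to five seconds before giving up, which is usually enough when write transactions are short.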
        
           | bingaweek wrote:
           | What do you mean it "should be fine"? It obviously is fine.
           | It sounds like you read a blog post on sqlite and couldn't
           | wait to share it with us.
        
       | PartiallyTyped wrote:
       | Just a thought: a general problem being NP-hard doesn't mean we
       | can't find solutions to specific instances quickly, or that a
       | given input is hard to search. If the downstream effect is an
       | order of magnitude less work, the tradeoff makes sense.
        
         | bawolff wrote:
         | Well, yes. Heuristics for query planning are a very
         | well-researched field.
        
       | datadeft wrote:
       | Next should be this ->
       | https://x.com/lemire/status/1869752213402157131
       | 
       | What progress we're making with these. Amazing times.
        
       | mkonecny wrote:
       | > At the start of the join operation, we go over all the rows of
       | dimension tables and set the bits in the Bloom filter which match
       | the query predicate.
       | 
       | Can someone explain this? Seems to me it's just as expensive as
       | iterating over the tables (the previous implementation), since
       | you still need to visit each row to build the cache?
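
A rough sketch of the quoted step, not SQLite's actual implementation. It assumes a small dimension table and a much larger fact table, and uses a plain dict as a stand-in for the index lookup that the filter is meant to avoid. BloomFilter and filtered_join are hypothetical names. The build is a single pass over the dimension rows, while the probe runs once per fact row and skips the lookup whenever the filter reports a definite miss.

```python
import hashlib


class BloomFilter:
    """A small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                repr(key).encode(), digest_size=8, salt=bytes([seed])
            ).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


def filtered_join(fact_rows, dim_rows, predicate):
    """Join fact rows to the dimension rows that satisfy predicate."""
    # Build phase: one pass over the dimension table.
    bloom = BloomFilter()
    dim_index = {}  # stand-in for the real (more expensive) index lookup
    for dim_id, attrs in dim_rows:
        if predicate(attrs):
            bloom.add(dim_id)
            dim_index[dim_id] = attrs

    # Probe phase: one cheap filter check per fact row.
    results = []
    lookups = 0
    for fact_id, dim_id in fact_rows:
        if not bloom.might_contain(dim_id):
            continue  # definite miss, so the lookup is skipped
        lookups += 1
        attrs = dim_index.get(dim_id)
        if attrs is not None:  # rules out Bloom false positives
            results.append((fact_id, attrs))
    return results, lookups
```

Under these assumptions the build cost is paid once per dimension row, and the saving shows up in the lookups counter: only fact rows that pass the filter reach the index.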
        
       ___________________________________________________________________
       (page generated 2024-12-22 23:00 UTC)