[HN Gopher] Graph DBMSs need new join algorithms: Story of worst...
       ___________________________________________________________________
        
       Graph DBMSs need new join algorithms: Story of worst-case optimal
       joins
        
       Author : semihsalihoglu
       Score  : 61 points
       Date   : 2023-02-27 18:13 UTC (4 hours ago)
        
 (HTM) web link (kuzudb.com)
 (TXT) w3m dump (kuzudb.com)
        
       | semihsalihoglu wrote:
       | I wrote about a (relatively) new class of join algorithms for
       | DBMSs that are called "worst-case optimal join algorithms". This
       | is a topic that has kept me very busy for the last 5 6 years and
       | the post aims to explain the simple ideas behind it and why it is
       | the type of algorithms that graph DBMSs should integrate (and we
       | have at KuzuDB: https://github.com/kuzudb/kuzu).
       | 
       | If you're curious about how come database researchers/developers
       | still work on joins after 50 years, it's because what is possible
       | and what is not is actually not that well understood. After many
       | decades wcoj algorithms was probably the biggest algorithmic leap
       | in the field. Many people are looking into information theory and
       | computational geometry to improve our understanding. There is
       | even more advanced but currently unpractical algorithms called
       | "beyond worst-case optimal" join algorithms, which I briefly
       | leave pointers to at the end of the post.
       | 
       | Enjoy reading!
        
         | namibj wrote:
         | Tetris doesn't seem impractical for analytical joins, though?
         | 
         | Granted, it needs a good datastructure; I'm waiting for a
         | friend to find time to discuss options there.
         | 
         | I'm firmly of the belief that beyond-wcoj is a question of
         | implementation, not of theoretical suitability in practical
         | scenarios.
        
           | semihsalihoglu wrote:
           | If you have any good ideas about how to implement Tetris in a
           | practical way, you can make a good impact here. This is the
           | only work that I know that tried to implement these:
           | https://arxiv.org/pdf/1503.04169.pdf (and excitingly
           | published at GRADES workshop at SIGMOD, which I co-chaired
           | twice and is very dear to my heart). But I talked to the
           | authors and they all agree that despite the tone of the
           | paper, they found them quite difficult to implement in a
           | performant way.
           | 
           | So here's the problem. The core algorithmic step of "beyond
           | wcojs" are "geometric resolutions". The core idea is to work
           | with gaps in the space. So for example suppose we are joining
           | two relations R(A), S(A) which is an intersection and suppose
           | A is an integer domain. Suppose further than R's maximum A
           | value is 100, and S's minimum value is 101. Then there is a
           | gap of (100, \infty) in R's space and another graph (-\infty,
           | 101) in S's space. If you "resolve/join" these gaps, you get
           | a gap of (-\infty, \infty), which tells you in one operation
           | that the join's output is empty.
           | 
           | On this simple query, this seems to work fine but if you have
           | general relations (let alone non-integer data types) doing
           | such geometric "resolutions" and finding efficient indices to
           | index those "gaps" becomes quite challenging. But any good
           | idea here will push the field!
        
         | mamcx wrote:
         | I wonder if exist a byte-sized implementation of this
         | algorithms.? As far I know all is part of big system...
        
           | semihsalihoglu wrote:
           | Can you clarify what you mean by byte-sized implementations?
           | There are several systems now that implement these
           | algorithms, KuzuDB, Umbra, LogicBlox (the earliest was this)
           | are examples I know. I'm sure more will come.
        
       | genman wrote:
       | Interesting. Thank you for sharing your work.
        
         | semihsalihoglu wrote:
         | You are welcome :)
        
       | carterschonwald wrote:
       | These are fun algorithms. I've found that some of my experiments
       | in writing efficient linear algebra routines end up with exactly
       | the same interfaces you'd want for implementing these algorithms
        
         | semihsalihoglu wrote:
         | Interesting. I think there are groups that does quite
         | interesting work on linear algebra that intersects with wcoj
         | algorithms: This keynote abstract should give some pointers on
         | this group's work:
         | https://www.vldb.org/pvldb/vol13/p3502-olteanu.pdf.
        
       ___________________________________________________________________
       (page generated 2023-02-27 23:01 UTC)