[HN Gopher] Optimizing Datalog for the GPU
       ___________________________________________________________________
        
       Optimizing Datalog for the GPU
        
       Author : blakepelton
       Score  : 94 points
       Date   : 2025-11-04 14:31 UTC (8 hours ago)
        
 (HTM) web link (danglingpointers.substack.com)
 (TXT) w3m dump (danglingpointers.substack.com)
        
       | ux266478 wrote:
        | Curious, why use CUDA and HIP? These frameworks are rather
        | opinionated about kernel design; they seem suboptimal for
        | implementing a language runtime when SPIR-V is right there,
        | particularly in the case of Datalog.
        
         | embedding-shape wrote:
          | Why is CUDA sub-optimal compared to SPIR-V? I don't think I
          | know the internals well enough to understand why one is
          | supposed to be obviously better than the other.
          | 
          | I'm currently learning CUDA for ML purposes, so I'm happy to
          | get more educated :)
        
           | jb1991 wrote:
            | It just depends on how the GPU manufacturer handles code
            | written in different languages: what level of API access,
            | what level of abstraction, and how the source is compiled,
            | i.e. how well optimized it is. For example, on an Apple
            | GPU you'll see benchmarks where OpenCL and Metal vary
            | depending on the task.
        
             | embedding-shape wrote:
              | Right, but that'd depend a lot on the context, task,
              | hardware, and so on.
              | 
              | What the parent said seemed more absolute than relative,
              | almost positing that there is no point in using CUDA
              | (since it's "sub-optimal" and people should _obviously_
              | use SPIR-V). I was curious about the specifics of that.
        
             | sigbottle wrote:
              | I mean, NVIDIA exposes some pretty low-level primitives,
              | and you can always fiddle with the PTX, as DeepSeek did.
        
         | touisteur wrote:
          | From their publication history, they want to use all the HPC
          | niceties and to run on most/any available HPC installations.
          | 
          | Nowadays that mostly means CUDA on NVIDIA and HIP on AMD on
          | the device side. Curious how SPIR-V support is on NVIDIA
          | GPUs, including Nsight tooling and the maturity/performance
          | of the available libraries (if only the CUB-style collective
          | operations).
        
         | lmeyerov wrote:
          | (have been a big fan of this work for years now)
          | 
          | From the nearby perspective of building GFQL, an embeddable
          | OSS GPU graph dataframe query language somewhere between
          | Cypher and duckdb/pandas/spark, sitting at an even higher
          | level on top of pandas, cudf, etc.:
          | 
          | It's nice using higher-level languages with rich libraries
          | underneath, so we can focus on the foundational algorithm
          | and data ecosystem problems while still hitting crazy
          | numbers.
          | 
          | cudf gives us optimized GPU joins, so jumping from cheap
          | personal CPU or GPU boxes to 80GB server GPUs, with deep
          | 2B-edge whole-graph queries running in a second, has taken
          | no extra work, which has been nice :) We want our focus on
          | getting regular graph operations fully data parallel in the
          | way we want while staying easy for users, and on figuring
          | out areas like bigger-than-memory and data lakes, so we
          | defer lower-level efforts until a Rust etc. rewrite is more
          | merited. I do see value in starting low-level when the
          | target value and workload are obvious (e.g., building our
          | vector indexes / DBs), but when breaking new ground at every
          | point, there's value in going where you can roll & extend
          | faster.
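          | 
          | To make the "optimized GPU joins" point concrete, here is a
          | toy semi-naive transitive closure written against cudf's
          | pandas-style API (a sketch of the general technique, not
          | GFQL's internals; note that how="leftanti" is a cudf
          | extension over pandas):
          | 
          |     import cudf
          | 
          |     # Datalog: path(X,Y) :- edge(X,Y).
          |     #          path(X,Z) :- edge(X,Y), path(Y,Z).
          |     edges = cudf.DataFrame({"src": [0, 1, 2],
          |                             "dst": [1, 2, 3]})
          |     path = edges.copy()
          |     delta = edges.copy()  # semi-naive: only join new tuples
          | 
          |     while len(delta) > 0:
          |         # edge(X,Y) joined with delta-path(Y,Z) on Y
          |         j = edges.merge(delta, left_on="dst",
          |                         right_on="src",
          |                         suffixes=("_e", "_d"))
          |         new = cudf.DataFrame(
          |             {"src": j["src_e"], "dst": j["dst_d"]}
          |         ).drop_duplicates()
          |         # keep only tuples not already derived (anti-join)
          |         delta = new.merge(path, on=["src", "dst"],
          |                           how="leftanti")
          |         path = cudf.concat([path, delta],
          |                            ignore_index=True)
          | 
          |     print(path.sort_values(["src", "dst"]))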
        
         | zozbot234 wrote:
          | What kind of SPIR-V? The SPIR-V used for compute shaders
          | (Vulkan Compute) is totally different from the one used for
          | compute kernels (OpenCL and SYCL)...
        
       | haolez wrote:
       | On a side note, what tools that leverage Datalog are in use by
       | the HN crowd?
       | 
       | I know that Datomic[0] is very popular. I've also been playing
       | with Clingo[1] lately.
       | 
       | [0] https://www.datomic.com/
       | 
       | [1] https://potassco.org/clingo/
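        | 
        | As a small taste, here is the classic transitive-closure
        | program run through Clingo's Python API (a minimal sketch,
        | assuming the clingo 5.x Control interface):
        | 
        |     from clingo import Control
        | 
        |     ctl = Control()
        |     ctl.add("base", [], """
        |         edge(a,b). edge(b,c). edge(c,d).
        |         path(X,Y) :- edge(X,Y).
        |         path(X,Z) :- edge(X,Y), path(Y,Z).
        |         #show path/2.
        |     """)
        |     ctl.ground([("base", [])])
        |     ctl.solve(on_model=print)  # path(a,b) ... path(a,d)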
        
         | embedding-shape wrote:
          | I have some local-first/client-side applications using
          | datascript in ClojureScript. I've also used datahike (a FOSS
          | Datomic alternative) at times on the backend, but nowadays I
          | mostly tend to use XTDB, which used to have a Datalog API,
          | but I think they removed it in favor of a SQL-like one
          | instead, which was kind of a shame.
        
           | manoDev wrote:
           | I guess SQL is a requirement if they want to market their
           | technology to normies.
        
             | zozbot234 wrote:
              | SQL can express Datalog-like queries rather easily using
              | recursive CTEs, and even more so via the recently added
              | Property Graph Query syntax.
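              | 
              | A minimal sketch of that in SQLite via Python: the
              | Datalog rules path(X,Y) :- edge(X,Y) and
              | path(X,Z) :- path(X,Y), edge(Y,Z) become a recursive
              | CTE (UNION rather than UNION ALL gives set semantics
              | and termination):
              | 
              |     import sqlite3
              | 
              |     con = sqlite3.connect(":memory:")
              |     con.executescript("""
              |         CREATE TABLE edge(src INTEGER, dst INTEGER);
              |         INSERT INTO edge VALUES (0,1),(1,2),(2,3);
              |     """)
              |     rows = con.execute("""
              |         WITH RECURSIVE path(src, dst) AS (
              |             SELECT src, dst FROM edge
              |             UNION
              |             SELECT p.src, e.dst
              |             FROM path p JOIN edge e ON p.dst = e.src
              |         )
              |         SELECT * FROM path ORDER BY src, dst
              |     """).fetchall()
              |     print(rows)  # [(0, 1), (0, 2), (0, 3), ...]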
        
         | blurbleblurble wrote:
          | Check out CozoDB, the embedded, Datalog-queried hybrid
          | relational+vector+graph database written in Rust:
          | https://www.cozodb.org/
         | 
         | I used it in a toy application and it was awesome.
         | 
         | This appears to be a dream database from the future.
        
           | huevosabio wrote:
           | It seems like the project has been abandoned? Last commit a
           | year ago.
        
             | anonzzzies wrote:
              | Yep, bit of a shame; there are many nice things in it
              | and it's interesting to learn from, but it's not active.
        
       | touisteur wrote:
        | The work done/supervised by Kristopher Micinski on using HPC
        | hardware (not only GPUs but clusters) for formal methods is
        | really encouraging. I hope we reach a breakthrough of affinity
        | between COTS compute hardware and all kinds of formal methods,
        | the way GPUs found theirs with deep learning and the
        | subsequent large models.
        | 
        | It's also one possible answer to "what do we do with all the
        | P100s, V100s, and A100s when they're decommissioned from their
        | AI heyday?" (apart from "small(er) models").
        
       ___________________________________________________________________
       (page generated 2025-11-04 23:00 UTC)