[HN Gopher] Optimizing Datalog for the GPU
___________________________________________________________________
Optimizing Datalog for the GPU
Author : blakepelton
Score : 94 points
Date : 2025-11-04 14:31 UTC (8 hours ago)
(HTM) web link (danglingpointers.substack.com)
(TXT) w3m dump (danglingpointers.substack.com)
| ux266478 wrote:
| Curious, why use CUDA and HIP? These frameworks are rather
| opinionated about kernel design; they seem suboptimal for
| implementing a language runtime when SPIR-V is right there,
| particularly in the case of Datalog.
| embedding-shape wrote:
| Why is cuda sub-optimal compared to SPIR-V? I don't think I
| know the internals enough to understand if it's supposed to be
| obvious why one is better than the other.
|
| I'm currently sitting and learning cuda for ML purposes, so
| happy to get more educated :)
| jb1991 wrote:
| Just depends on how the manufacturer of the GPU handles code
| written in different languages: what level of API access and
| abstraction is exposed, and how the source is compiled, i.e.
| how well optimized it is. For example, on an Apple GPU,
| you'll see benchmarks where OpenCL and Metal results vary
| depending on the task.
| embedding-shape wrote:
| Right, but that'd depend a lot on the context, task,
| hardware and so on.
|
| What parent said seemed more absolute and less relative,
| almost positing that there is no point in using CUDA
| (since it's "sub-optimal") and that people should use SPIR-V
| _obviously_. I was curious about the specifics of that.
| sigbottle wrote:
| I mean, nvidia exposes some pretty low level primitives,
| and you can always fiddle with the PTX as deepseek did.
| touisteur wrote:
| From their publication history, they want to use all the HPC
| niceties and to run on most/any available HPC installations.
|
| Nowadays that means mostly CUDA on NVIDIA and HIP on AMD on the
| device side. Curious how the spirv support is on NVIDIA GPUs,
| including nsight tooling and the maturity/performance of
| libraries available (if only the cub-stuff for collective
| operations).
| lmeyerov wrote:
| (have been a big fan of this work for years now)
|
| From the nearby perspective of building GFQL, an embeddable oss
| GPU graph dataframe query language somewhere between cypher and
| duckdb/pandas/spark, at an even higher-level on top of pandas,
| cudf, etc:
|
| It's nice using higher-level languages with rich libraries
| underneath so we can focus on the foundational algorithm & data
| ecosystem problems while still achieving crazy numbers
|
| cudf gives us optimized GPU joins, so jumping from cheap
| personal CPU or GPU boxes to 80GB server GPUs and deep 2B-edge
| whole-graph queries running in a second, without extra work,
| has been nice :) We want our focus to be on getting regular
| graph operations fully data parallel in the way we want while
| staying easy for users, and on figuring out areas like
| bigger-than-memory and data lakes, so we want to defer
| lower-level efforts until a Rust (etc.) rewrite is more
| merited. I do see value in starting low when the target value
| and workload are obvious for what you're building (e.g., vector
| indexes / DBs), but when breaking new ground at every point,
| there is value in going where you can roll & extend faster.
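|
| A rough illustration of the "graph traversal as repeated joins"
| idea mentioned above, in plain Python rather than cudf or GFQL
| (neither API is used here). The names `hop` and `reachable` and
| the edge-list layout are assumptions made for this sketch; a
| dataframe engine would do each hop as a bulk join instead of a
| Python loop.

```python
from collections import defaultdict


def hop(frontier, edges_by_src):
    """One traversal step: join the frontier against the edge table on src."""
    return {dst for node in frontier for dst in edges_by_src.get(node, ())}


def reachable(start, edges):
    """Return every node reachable from `start` in one or more hops.

    Only newly discovered nodes are joined in each round, which keeps
    the loop from redoing work on nodes it has already expanded.
    """
    # Index the (hypothetical) edge list by source node.
    edges_by_src = defaultdict(set)
    for src, dst in edges:
        edges_by_src[src].add(dst)

    seen = set()
    frontier = {start}
    while frontier:
        new_nodes = hop(frontier, edges_by_src) - seen
        seen |= new_nodes
        frontier = new_nodes  # an empty frontier means a fixpoint was reached
    return seen


if __name__ == "__main__":
    sample_edges = [("a", "b"), ("b", "c"), ("c", "a"), ("x", "y")]
    print(sorted(reachable("a", sample_edges)))
```

| The loop stops when a hop yields nothing new, so cycles in the
| graph do not cause it to run forever.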
| zozbot234 wrote:
| What kind of SPIR-V? The SPIR-V used for compute shaders
| (Vulkan Compute) is totally different to the one for compute
| kernels (OpenCL and SYCL)...
| haolez wrote:
| On a side note, what tools that leverage Datalog are in use by
| the HN crowd?
|
| I know that Datomic[0] is very popular. I've also been playing
| with Clingo[1] lately.
|
| [0] https://www.datomic.com/
|
| [1] https://potassco.org/clingo/
| embedding-shape wrote:
| I have some local-first/client-side applications using
| datascript in ClojureScript. I've used datahike (a FOSS Datomic
| alternative) sometimes on the backend too, but mostly tend to
| use XTDB nowadays, which used to have a Datalog API, but I think
| they removed it in favor of a SQL-like approach instead, which
| was kind of a shame.
| manoDev wrote:
| I guess SQL is a requirement if they want to market their
| technology to normies.
| zozbot234 wrote:
| SQL can express Datalog-like queries rather easily using
| recursive CTEs, and even more so via the recently-added
| Property Graph Query syntax.
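|
| A minimal sketch of the recursive-CTE point, using Python's
| built-in sqlite3 module. The `edge` table, its `src`/`dst`
| columns, and the `transitive_closure` helper are hypothetical
| names chosen for illustration; the Datalog rules in the comment
| are the classic path example, not something taken from the
| article.

```python
import sqlite3

# Hypothetical edge relation; table and column names are illustrative only.
EDGES = [("a", "b"), ("b", "c"), ("c", "d")]

# Datalog version, for comparison:
#   path(X, Y) :- edge(X, Y).
#   path(X, Z) :- path(X, Y), edge(Y, Z).
PATH_QUERY = """
WITH RECURSIVE path(src, dst) AS (
    SELECT src, dst FROM edge              -- base rule
    UNION                                  -- UNION (not UNION ALL) drops duplicates
    SELECT path.src, edge.dst              -- recursive rule
    FROM path JOIN edge ON path.dst = edge.src
)
SELECT src, dst FROM path ORDER BY src, dst
"""


def transitive_closure(edges):
    """Load edges into an in-memory SQLite table and return all path pairs."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE edge (src TEXT, dst TEXT)")
        conn.executemany("INSERT INTO edge VALUES (?, ?)", edges)
        return conn.execute(PATH_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in transitive_closure(EDGES):
        print(row)
```

| Using UNION rather than UNION ALL matters here: duplicate rows
| are discarded, so the recursion also terminates on cyclic
| graphs.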
| blurbleblurble wrote:
| Check out CozoDB, the embedded datalog-queried hybrid
| relational+vector+graph database written in Rust:
| https://www.cozodb.org/
|
| I used it in a toy application and it was awesome.
|
| This appears to be a dream database from the future.
| huevosabio wrote:
| It seems like the project has been abandoned? Last commit a
| year ago.
| anonzzzies wrote:
| Yep, bit of a shame, many nice things in it and interesting
| to learn from but not active.
| touisteur wrote:
| The work done/supervised by Kristopher Micinski on using HPC
| hardware (not only GPUs but clusters) for formal methods is
| really encouraging. I hope we reach a breakthrough of affinity
| between COTS compute hardware and all kinds of formal methods, as
| GPUs found theirs with deep learning and subsequent large models.
|
| One possible answer to 'what do we do with all the P100s, V100s,
| A100s when they're decommissioned from their AI heyday?' (apart
| from 'small(er) models').
___________________________________________________________________
(page generated 2025-11-04 23:00 UTC)