[HN Gopher] Learn Datalog Today
       ___________________________________________________________________
        
       Learn Datalog Today
        
       Author : tosh
       Score  : 125 points
       Date   : 2024-01-21 16:38 UTC (6 hours ago)
        
 (HTM) web link (www.learndatalogtoday.org)
 (TXT) w3m dump (www.learndatalogtoday.org)
        
       | SJC_Hacker wrote:
       | I learned a bit of Datalog back in university, too many years
       | ago. It was impressive how powerful the query language is. You
       | could do in a single line what required several lines of SQL, and
       | far more intuitively.
       | 
       | But ... the problem is how many DBs support it, and how useful of
       | a skill it is to know.
        
       | FreeFull wrote:
       | It's a shame that there doesn't seem to be any decent open-source
       | implementation of Datalog. If you go for full Prolog instead of
       | Datalog, there are several (Scryer Prolog being my personal
       | favourite).
        
         | cmrdporcupine wrote:
         | https://en.wikipedia.org/wiki/Souffl%C3%A9_(programming_lang...
         | 
         | https://github.com/souffle-lang/souffle
        
         | nezaj wrote:
         | Here's a blog post showing you how to roll your own in ~100
         | lines of JS
         | 
         | https://www.instantdb.com/essays/datalogjs
        
         | kjqgqkejbfefn wrote:
         | 1. Datomic - While not open-source, it has an open-source
         | version called Datomic Free, which is a distributed database
         | designed to enable scalable, flexible, and intelligent data
         | storage and queries. Datomic's query language is closely
         | inspired by Datalog.
         | 
         | 2. DataScript - An open-source in-memory database and query
         | engine for Clojure, ClojureScript, and JavaScript that is
         | heavily influenced by Datalog and Datomic.
         | 
         | 3. Crux (now XTDB) - A bitemporal database with Datalog-
         | inspired querying capabilities. It is designed for efficient
         | querying of historical data and offers ACID transactions.
         | 
         | 4. Racket's miniKanren - While not strictly a database,
         | miniKanren is an open-source logic programming extension to the
         | Racket language, which is inspired by Datalog and can be used
         | to manipulate and query data in a manner similar to Prolog.
         | 
         | 5. LogicBlox - An open-source platform that combines a database
         | system, a Datalog-based modeling language, and application
         | server facilities. It allows developers to build complex, data-
         | intensive applications.
         | 
         | 6. Souffle - A Datalog-inspired language that is designed for
         | static analysis problems. It can be viewed as a database query
         | language with a focus on performance, allowing for parallel
         | execution of queries.
         | 
         | 7. Dedalus - A Datalog-like temporal logic language used to
         | express complex distributed systems. It is primarily a research
         | tool but has informed the design of other Datalog-inspired
         | systems.
         | 
         | 8. Flora-2 - An open-source object-oriented knowledge
         | representation and reasoning system that integrates a variant
         | of Datalog with objects and frames.
         | 
          | Top 3 are from the Clojure ecosystem. Additionally, in this same
          | space there are Datalevin and Datahike, among many others.
        
           | persnickety wrote:
           | Cozo uses Datalog for queries, and has several backends,
           | including SQLite
        
             | cmrdporcupine wrote:
             | Cozo is very attractive. I just wish there was a native
             | Rust DSL API for it, so it could be embedded in Rust
             | programs without using datalog queries in strings.
        
               | soraki_soladead wrote:
                | https://github.com/cozodb/pycozo/blob/main/pycozo/test_build...
               | 
               | Here's the python version of what I think you're looking
               | for. Shouldn't be too difficult to port to rust.
        
               | cmrdporcupine wrote:
               | ok but that's not what i want.
               | 
               | the thing is written in Rust. but does not expose a Rust
               | query API, you have to query it through Datalog queries
               | in strings; what you shared there just builds those
                | strings from python. it'd be nice to have a directly
                | native API, with Horn clauses constructed in Rust.
        
           | summarity wrote:
           | Also Rego, which is Datalog with structured extensions, in
           | use everywhere where OPA is used (as in many k8s
           | environments)
        
           | j-pb wrote:
           | Are you sure that LogicBlox is open-source? I couldn't find
           | anything confirming this.
           | 
           | I'd be very surprised if they were, because they even
           | patented their join algorithm.
        
             | cmrdporcupine wrote:
             | It's definitely not open source.
             | 
             | Not only is the join algorithm patented, but my
             | understanding is the original authors of it can't even use
             | it, because the LogicBlox IP was acquired but the people
             | moved on.
             | 
             | But some have since gone on to create new stuff @
             | RelationalAI
        
           | macmac wrote:
           | Ad 1. All versions of Datomic are now free, but none are Open
           | Source.
        
           | kevindamm wrote:
           | Also GDL and its variants, but that is more of a domain-
           | specific language for game descriptions and general game-
           | playing runtimes. Still, they refer to Datalog as its basis.
        
           | cmrdporcupine wrote:
           | Another one is Differential Datalog, for streaming data.
           | 
           | https://github.com/vmware/differential-datalog
        
             | habitue wrote:
             | I'm sad their last commit was 2 years ago, seemed like a
             | really cool idea
        
               | tylerhou wrote:
               | The authors spun it out into a startup, Feldera. A paper
               | describing their idea also won Best Paper at VLDB 2023.
               | The idea is very far from dead.
        
               | cmrdporcupine wrote:
               | Neat. Had run into them before (the "careers" page was
               | marked as visited in my Firefox history ;-) ), but didn't
               | make the connection.
        
           | odipar wrote:
           | CodeQL is another datalog with the domain of code analysis as
           | its use case. Too bad you cannot create a custom fact
           | database with CodeQL. Otherwise, the implementation of CodeQL
           | is pretty advanced and efficient.
        
             | infima wrote:
              | While not trivial because it is not documented, you can
              | create a database with your own facts. Some of the
              | extractors that create the required files are open source:
              | https://github.com/github/codeql/blob/main/ruby/extractor/sr...
        
           | lukev wrote:
           | Is LogicBlox open-source now? I encountered it on a project
           | several years ago and at that point it was very much
           | closed/commercial.
           | 
           | Now the website isn't even loading... has the project been
           | shuttered? I know LogicBlox was acquired by Predictix a long
           | time ago, and recently Infor acquired Predictix. Hoping the
           | project is still a going concern, there was some very cool
           | tech in there.
        
           | marcle wrote:
            | ErgoAI is "an enterprise-level extension of the Flora-2
           | system" which was recently open-sourced:
           | https://github.com/ErgoAI . It seems to be well documented.
        
         | Jonovono wrote:
         | CozoDB: https://github.com/cozodb/cozo
        
         | manu3000 wrote:
          | you can use Datalog within Flix https://flix.dev/
        
           | refset wrote:
           | For comparison, I previously translated that cart parts
           | scheduling example on the Flix homepage to Datomic-style
            | Datalog syntax:
            | https://gist.github.com/refset/21b3fc1dec9a6928943073809e133...
        
         | dagipflihax0r wrote:
         | Mangle https://github.com/google/mangle is an open-source
         | implementation in golang, it was an explicit goal to make it
         | easy to learn. Meaning: it is easy to recognize the pure
         | datalog part, the syntax is following the good old course
         | material.
         | 
         | It was discussed here:
         | https://news.ycombinator.com/item?id=33756800
        
       | grepexdev wrote:
       | I thought that syntax looked familiar! Looks like Logseq uses
       | Datalog for advanced queries.
       | 
       | https://hub.logseq.com/features/av5LyiLi5xS7EFQXy4h4K8/getti...
        
         | packetlost wrote:
         | More specifically, Logseq uses DataScript, a Datomic-inspired
         | Datalog engine for ClojureScript.
        
         | achileas wrote:
         | They do! IIRC they were inspired by Roam's use of it with
          | ClojureScript.
        
       | brendanyounger wrote:
       | I wish people would stop referring to Datomic as datalog. Datomic
       | is many things, but only the query format (Horn clauses with
       | unification of variables, similar to prolog) has anything to do
       | with datalog.
       | 
       | Real datalog is far more interesting since it implicitly encodes
       | recursion allowing you to chain rules. Rule A derives new facts,
       | which rule B uses to derive new facts, which rules A and C use to
       | derive new facts, and so on. Datomic has a notion of rules which
       | are mostly syntax sugar and do not support this sort of recursive
       | reasoning.
       | 
       | Why is that a big deal? When rules are run automatically, you can
       | build live, reactive systems, not just a database that sits
       | around waiting for you to query it. Hellerstein's work at UC
       | Berkeley
       | (https://dsf.berkeley.edu/papers/sigrec10-declimperative.pdf)
       | explores this in some detail.
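
To make the rule chaining described above concrete, here is a minimal
Python sketch that re-applies two rules until no new facts appear (a
fixpoint). It is not how Datomic, Bud, or any engine named in this
thread is implemented; the relation names parent and ancestor and the
helper naive_fixpoint are made up for the illustration.

```python
# Rules, written Datalog-style:
#   ancestor(X, Y) :- parent(X, Y).
#   ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).

def naive_fixpoint(parent):
    """Re-apply both rules until no new ancestor facts appear."""
    ancestor = set()
    while True:
        derived = set(parent)  # rule 1: every parent is an ancestor
        # rule 2: join parent(X, Y) with ancestor(Y, Z)
        derived |= {
            (x, z)
            for (x, y) in parent
            for (y2, z) in ancestor
            if y == y2
        }
        if derived <= ancestor:  # nothing new: fixpoint reached
            return ancestor
        ancestor |= derived


# Hypothetical base facts for the example.
parent = {("ann", "bob"), ("bob", "cy"), ("cy", "dee")}
ancestor = naive_fixpoint(parent)
print(sorted(ancestor))
```

Facts derived in one round feed the second rule in the next round,
which is the "rule A derives facts that rule B uses" behaviour the
comment describes.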
        
         | 6gvONxR4sf7o wrote:
         | Sounds cool. What's the complexity of running this kind of
         | recursive reasoning? Reasonable? Can you suggest any tools to
         | not have to implement it ourselves?
        
           | brendanyounger wrote:
           | Souffle and Cozo mentioned below already implement the whole
           | of "traditional" datalog.
           | 
           | Percival (https://github.com/ekzhang/percival) has some very
           | nice examples showing how you can interactively write and
           | test rules on top of a datalog interpreter.
           | 
           | Bud (http://bloom-lang.net/bud/) is Hellerstein's proof of
           | concept playground. It has bit-rotted in the past few years,
           | but the examples are readable even if you can't easily get it
           | working.
           | 
           | The complexity can be quite good. You can syntactically
           | determine when you've written linear recursion (equivalent to
           | a for loop) vs not. Otherwise, the complexity is what you'd
           | expect from incremental view maintenance in a normal SQL
           | database. Which is to say O(n^k) with k being the number of
           | relations joined, but usually much, much less with
           | appropriate indexes and skew in the data. All the usual
           | tricks concerning data normalization and indexes from
           | databases apply.
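
How much work the recursion costs depends heavily on not re-deriving
facts that are already known. One textbook technique is semi-naive
evaluation, which joins only the facts that were new in the previous
round. The comment above does not name it, so treat this as one
illustrative option rather than what Souffle or Cozo actually do. The
function name semi_naive_reachable and the edge data are hypothetical.

```python
from collections import defaultdict

# reach(X, Y) :- edge(X, Y).
# reach(X, Z) :- reach(X, Y), edge(Y, Z).   (linear: one recursive atom)

def semi_naive_reachable(edges):
    """Return (reach, rounds); each round joins only the newest facts."""
    successors = defaultdict(set)  # index on the join column of edge
    for src, dst in edges:
        successors[src].add(dst)

    reach = set(edges)
    delta = set(edges)  # facts discovered in the previous round
    rounds = 0
    while delta:
        rounds += 1
        candidates = {(x, z) for (x, y) in delta for z in successors[y]}
        delta = candidates - reach  # keep only genuinely new facts
        reach |= delta
    return reach, rounds


# Hypothetical chain graph 1 -> 2 -> 3 -> 4.
edges = {(1, 2), (2, 3), (3, 4)}
reach, rounds = semi_naive_reachable(edges)
print(len(reach), rounds)
```

The dictionary index plays the role of a database index on the join
column: each new fact is extended by a lookup instead of a scan over
every edge.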
        
           | refset wrote:
           | RDFox offers a rather impressive sounding Datalog inferencing
           | engine: https://www.oxfordsemantic.tech/rdfox
           | 
           | > We present a novel approach to parallel materialisation
           | (i.e., fixpoint computation) of datalog programs in
           | centralised, main-memory, multi-core RDF systems. Our
           | approach comprises an algorithm that evenly distributes the
           | workload to cores, and an RDF indexing data structure that
           | supports efficient, 'mostly' lock-free parallel updates.
           | 
           | > Materialisation is PTIME-complete in data complexity and is
           | thus believed to be inherently sequential. Nevertheless, many
           | practical parallelisation techniques have been developed
           | [...]
           | 
           | There have been several papers and patents describing their
           | approach, e.g.
           | http://www.cs.ox.ac.uk/dan.olteanu/papers/mnpho-aaai14.pdf
        
         | refset wrote:
         | > Datomic has a notion of rules which are mostly syntax sugar
         | and do not support this sort of recursive reasoning.
         | 
         | > Why is that a big deal? When rules are run automatically, you
         | can build live, reactive systems, not just a database that sits
         | around waiting for you to query it.
         | 
         | There was at least one serious attempt to bring these worlds
         | together: https://github.com/sixthnormal/clj-3df
        
       | dang wrote:
       | Related:
       | 
       |  _Learn Datalog Today_ -
       | https://news.ycombinator.com/item?id=27173890 - May 2021 (34
       | comments)
       | 
       |  _Learn Datalog_ - https://news.ycombinator.com/item?id=19154997
       | - Feb 2019 (1 comment)
       | 
       |  _Learn Datalog Today_ -
       | https://news.ycombinator.com/item?id=17109105 - May 2018 (2
       | comments)
       | 
       |  _Learn Datalog Today_ -
       | https://news.ycombinator.com/item?id=14434457 - May 2017 (10
       | comments)
       | 
       |  _Learn Datalog Today Ported to DataScript and Clojure (JVM)_ -
       | https://news.ycombinator.com/item?id=13037199 - Nov 2016 (1
       | comment)
       | 
       |  _Learn Datalog Today - An interactive Datomic query tutorial_ -
       | https://news.ycombinator.com/item?id=6171722 - Aug 2013 (7
       | comments)
        
       | Clever321 wrote:
       | Datalog feels so much more intuitive than SQL or any other query
       | language I've used. I'm able to write concise, complex
       | expressions pretty easily. In a SQL-based system, there seems to
       | be a (low) complexity metric where it's easier to
       | write/debug/maintain what was supposed to be a 'declarative' SQL
       | query in a functional/imperative language instead. It feels like
       | datalog is the next evolution of a declarative query language,
       | one that is much more declarative than SQL itself.
       | 
       | In the "day of datomic" videos, there is a segment where Stu
       | debugs a slow query. He does the debugging without even looking
       | at the data model, only by rearranging the clauses. It is really,
       | really impressive, and I can't imagine having that capability in
       | SQL.
        
         | brendanyounger wrote:
         | I greatly respect what Stu and Rich have done to make Datomic.
         | 
         | However, they made an explicit design decision to not include a
         | query optimizer and execute the clauses as they were written.
         | This is usually fine since the author has some idea of what the
          | best order is, but there are k! different permutations of k
          | clauses so doing it by hand will fail at some point (if you
         | want the optimal ordering).
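
A toy evaluator shows why clause order matters when clauses run
exactly as written. This is a sketch under strong assumptions: it
scans every fact for every partial binding and uses no indexes, which
is not how Datomic executes queries. The helpers is_var, match and
run, and the sample facts, are invented for the illustration.

```python
def is_var(term):
    """Variables are strings starting with '?', as in Datomic-style queries."""
    return isinstance(term, str) and term.startswith("?")


def match(clause, fact, binding):
    """Extend binding so clause equals fact, or return None on mismatch."""
    out = dict(binding)
    for term, value in zip(clause, fact):
        if is_var(term):
            if out.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return out


def run(clauses, facts):
    """Evaluate clauses in the order written; count match attempts."""
    bindings, attempts = [{}], 0
    for clause in clauses:
        extended = []
        for binding in bindings:
            for fact in facts:
                attempts += 1
                m = match(clause, fact, binding)
                if m is not None:
                    extended.append(m)
        bindings = extended
    return bindings, attempts


# 100 named entities, exactly one of which is an admin.
facts = [(i, "name", f"user{i}") for i in range(100)] + [(7, "role", "admin")]

broad_first = [("?e", "name", "?n"), ("?e", "role", "admin")]
selective_first = [("?e", "role", "admin"), ("?e", "name", "?n")]

slow_result, slow_attempts = run(broad_first, facts)
fast_result, fast_attempts = run(selective_first, facts)
print(slow_attempts, fast_attempts)
```

Both orders return the same answer, but the selective-first order
carries far fewer partial bindings into the second clause.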
        
       | account-5 wrote:
       | For the idiot in the thread, why would I use datalog (which I've
       | never heard of before) over SQL?
       | 
       | Having looked quickly at it just now it seems (Wikipedia article)
       | similar to Web Ontology Language (OWL), though I believe datalog
       | may have been around long before owl.
        
         | brendanyounger wrote:
         | On a syntax level, parsing, generating, and templating datalog
         | is _much_ simpler than doing the same to SQL. DBT would never
         | exist if every SQL database accepted datalog queries and SQL
         | injection attacks would be rare to non-existent.
         | 
         | The more interesting answer is to think of datalog as making it
         | easy to encode nearly all of your application logic as a bunch
         | of self-referencing, incrementally updated, materialized views.
          | Some examples:
          | 
          |     # view of Users table for currently logged in user
          |     LoggedInUserView(name, email, id) :-
          |         Users(id: payload["userId"], name, email),
          |         Cookies(name: "login", payload).
          | 
          |     # view of Users for admin
          |     AdminUserView(name, email, id) :-
          |         Users(id, name, email),
          |         Cookies(name: "login", payload),
          |         payload["isAdmin"] = true.
          | 
          |     # posts a user can see
          |     PostsView(title, content, id) :-
          |         Posts(title, content, public: true).
          |     PostsView(title, content, id) :-
          |         Posts(title, content, author: payload["userId"]),
          |         Cookies(name: "login", payload).
         | 
         | And then you write your UI code to explicitly reference these
         | derived views rather than manually wrapping an API around
         | querying the Posts table and doing the filtering.
         | 
         | The examples above can be neatly replicated in Supabase or
         | Postgraphile (the OG of auto-generated GraphQL over Postgres),
         | but you can do a lot more with datalog as a language. The
         | Hellerstein paper mentioned above is a good starting place.
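
For readers more at home in an imperative language, the views in the
comment above can be approximated in plain Python. This is only a
sketch: it recomputes each view from scratch on every call, whereas
the comment describes incrementally updated materialized views. The
table contents and function names are invented for the example.

```python
# Hypothetical base tables.
users = [
    {"id": 1, "name": "Ada", "email": "ada@example.com"},
    {"id": 2, "name": "Bob", "email": "bob@example.com"},
]
posts = [
    {"id": 10, "title": "Hello", "content": "hi", "public": True, "author": 2},
    {"id": 11, "title": "Draft", "content": "wip", "public": False, "author": 1},
    {"id": 12, "title": "Notes", "content": "todo", "public": False, "author": 2},
]
cookies = [{"name": "login", "payload": {"userId": 1, "isAdmin": False}}]


def login_payloads(cookies):
    """Mirrors Cookies(name: "login", payload)."""
    return [c["payload"] for c in cookies if c["name"] == "login"]


def logged_in_user_view(users, cookies):
    """Mirrors the LoggedInUserView rule: join Users with the login cookie."""
    return {
        (u["name"], u["email"], u["id"])
        for payload in login_payloads(cookies)
        for u in users
        if u["id"] == payload["userId"]
    }


def posts_view(posts, cookies):
    """Union of the two PostsView rules: public posts plus the user's own."""
    public = {(p["title"], p["content"], p["id"]) for p in posts if p["public"]}
    own = {
        (p["title"], p["content"], p["id"])
        for payload in login_payloads(cookies)
        for p in posts
        if p["author"] == payload["userId"]
    }
    return public | own


print(logged_in_user_view(users, cookies))
print(sorted(posts_view(posts, cookies)))
```

Two rules with the same head become a set union, and a shared variable
between atoms becomes an equality test in the comprehension.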
        
         | refset wrote:
         | Datalog can be very effective for expressing certain kinds of
         | problems and for generating efficient solutions to those
         | problems. Particularly anything that is even mildly recursive,
         | and therefore especially "knowledge graphs" that rely heavily
         | on rules to infer, model and retrieve information. However if
         | your problem domain amounts to CRUD storage without a need for
         | complex recursion then mature SQL systems usually have all the
          | advantages (aside from the syntax!). For a more formal answer:
         | 
         | > The intersection of databases, logic, and artificial
         | intelligence gave raise to deductive databases. Deductive
         | database systems are database management systems built around a
         | logical model of data, and their query languages allow
         | expressing logical queries. A deductive database system
         | includes procedures for defining deductive rules which can
         | infer information (in the so-called intensional database) in
         | addition to the facts loaded in the (so-called extensional)
         | database. The logic model for deductive databases is closely
         | related to the relational model and, in particular, with the
         | domain relational calculus. Datalog is the most known deductive
         | query language (which syntactically is a Prolog subset) where
         | constructed terms are not allowed as other non-declarative
         | constructs such as the cut.
         | 
         | > Also following the relational model, relational database
         | systems are well-known and widespread nowadays. Their formal
         | query languages include relational algebra and relational
         | calculi but, in practical systems, the de-facto and ANSI/ISO
         | standard SQL is the language of choice of every relational
         | database vendor. Whilst SQL and relational formal languages
         | implement a limited form of logic, deductive database languages
         | implement advanced forms of logic.
         | 
         | https://www.fdi.ucm.es/profesor/fernan/des/html/manual/manua...
        
       ___________________________________________________________________
       (page generated 2024-01-21 23:00 UTC)