[HN Gopher] Logica - Declarative logic programming language for ...
       ___________________________________________________________________
        
       Logica - Declarative logic programming language for data
        
       Author : voat
       Score  : 68 points
       Date   : 2024-11-16 19:09 UTC (3 hours ago)
        
 (HTM) web link (logica.dev)
 (TXT) w3m dump (logica.dev)
        
       | dang wrote:
       | Related:
       | 
       |  _Google is pushing the new language Logica to solve the major
       | flaws in SQL_ - https://news.ycombinator.com/item?id=29715957 -
       | Dec 2021 (1 comment)
       | 
       |  _Logica, a novel open-source logic programming language_ -
       | https://news.ycombinator.com/item?id=26805121 - April 2021 (98
       | comments)
        
       | Y_Y wrote:
       | If, like me, your first reaction is that this looks suspiciously
       | like Datalog then you may be interested to learn that they indeed
       | consider Logica to be "in the Datalog family".
        
         | jp57 wrote:
         | I think Datalog should be thought of as "in the logic
         | programming family", so other data languages based on logic
         | programming are likely to be similar.
         | 
         | And, of course the relational model of data is based on first-
         | order logic, so one could say that SQL is a declarative logic
         | programming language for data.
        
       | riku_iki wrote:
       | Only one active committer on github..
        
       | thenaturalist wrote:
       | I don't want to come off as overconfident, but I would be very
       | hard pressed to see the value of this.
       | 
       | At face value, I shudder at the syntax.
       | 
       | Example from their tutorial:
       | 
       | EmployeeName(name:) :- Employee(name:);
       | 
       | Engineer(name:) :- Employee(name:, role: "Engineer");
       | 
       | EngineersAndProductManagers(name:) :- Employee(name:, role:),
       | role == "Engineer" || role == "Product Manager";
       | 
       | vs. the equivalent SQL:
       | 
       | SELECT Employee.name AS name
       | 
       | FROM t_0_Employee AS Employee
       | 
       | WHERE (Employee.role = "Engineer" OR Employee.role = "Product Manager");
       | 
       | SQL is much more concise, extremely easy to follow.
       | 
       | No weird OOP-style class instantiation for something as simple as
       | just getting the name.
       | 
       | As already noted in the 2021 discussion, what's actually the
       | killer though is adoption and, three years later, ecosystem.
       | 
       | SQL for analytics has come an extremely long way with the
       | ecosystem that was ignited by dbt.
       | 
       | There is so much better tooling today when it comes to testing,
       | modelling, running in memory with tools like DuckDB or Ibis,
       | Apache Iceberg.
       | 
       | There is value to abstracting on top of SQL, but it does very
       | much seem to me like this is not it.
        
         | Tomte wrote:
         | The syntax is Prolog-like, so people in the field are familiar
         | with it.
        
           | thenaturalist wrote:
           | Which field would that be?
           | 
           | I.e. I understand now that it's seemingly about more than
           | simple querying, so me coming very much from an analytics/
           | data crunching background am wondering what a use case would
           | look like where this is arguably superior to SQL.
        
             | tannhaeuser wrote:
             | > _Which field would that be?_
             | 
             | Database theory papers and books have used Prolog/Datalog-
             | like syntax throughout the years, such as those by Serge
             | Abiteboul, just to give a single example of a researcher
             | and prolific author over the decades.
        
         | aseipp wrote:
         | Logica is in the Datalog/Prolog/Logic family of programming
         | languages. It's very familiar to anyone who knows how to read
         | it. None of this has anything to do with OOP at all and you
         | will heavily mislead yourself if you try to map any of that
         | thinking onto it. (Beyond that, and not specific to Logica or
         | SQL in any way -- comparing two 3-line programs to draw
         | conclusions is effectively meaningless. You have to actually
         | write programs bigger than that to see the whole picture.)
         | 
         | Datalog is not really a query language, actually. But it is
         | relational, like SQL, so it lets you express relations between
         | "facts" (the rows) inside tables. But it is more general,
         | because it also lets you express relations between _tables
         | themselves_ (e.g. this  "table" is built from the relationship
         | between two smaller tables), and it does so without requiring
         | extra special case semantics like VIEWs.
         | 
         | Because of this, it's easy to write small fragments of Datalog
         | programs, and then stick it together with other fragments,
         | without a lot of planning ahead of time, meaning as a language
         | it is very compositional. This is one of the primary reasons
         | why many people are interested in it as a SQL alternative;
         | aside from your typical weird SQL quirks that are avoided with
         | better language design (which are annoying, but not really the
         | big picture.)
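
A rough way to picture the point about deriving one "table" from others: the sketch below models relations as Python sets of tuples and rules as plain functions over them. The `employee` and `office` rows, and every function name, are made up for illustration; this is not how Logica or any Datalog engine is implemented, only the shape of the idea.

```python
# Relations are sets of tuples; every name and row here is hypothetical.
employee = {("Alice", "Engineer"), ("Bob", "Product Manager"), ("Carol", "Sales")}
office = {("Alice", "Zurich"), ("Bob", "Seattle"), ("Carol", "Zurich")}


def engineer(employee):
    # Roughly: Engineer(name:) :- Employee(name:, role: "Engineer");
    return {(name,) for (name, role) in employee if role == "Engineer"}


def role_city(employee, office):
    # A derived "table" built by joining two smaller ones on the name column.
    return {
        (role, city)
        for (name, role) in employee
        for (other, city) in office
        if name == other
    }


def zurich_roles(employee, office):
    # Rules compose: this one is defined on top of another derived relation.
    return {(role,) for (role, city) in role_city(employee, office) if city == "Zurich"}


print(sorted(engineer(employee)))
print(sorted(zurich_roles(employee, office)))
```

Each derived relation is just another value that later rules can consume, which is the compositional property the comment describes.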
        
           | thenaturalist wrote:
           | > but it is more general, because it also lets you express
           | relations between tables themselves (e.g. this "table" is
           | built from the relationship between two smaller tables), and
           | it does so without requiring extra special case semantics
           | like VIEWs.
           | 
           | If I understand you correctly, you can easily get the same
           | with ephemeral models in dbt or CTEs generally?
           | 
           | > Because of this, it's easy to write small fragments of
           | Datalog programs, and then stick it together with other
           | fragments, without a lot of planning ahead of time, meaning
           | as a language it is very compositional.
           | 
           | This can be a benefit in some cases, I guess, but how can you
           | guarantee correctness with flexibility involved?
           | 
           | With SQL, I get either table or column level lineage with all
           | modern tools, can audit each upstream output before going
           | into a downstream input. In dbt I have macros which I can
           | reuse everywhere.
           | 
           | It's very compositional while at the same time perfectly
           | documented and testable at runtime.
           | 
           | Could you share a more specific example or scenario where you
           | have seen Datalog/ Logica outperform a modern SQL setup?
           | 
           | Generally curious.
           | 
           | I am not at all familiar with the Logica/Datalog/Prolog
           | world.
        
             | from-nibly wrote:
             | Prolog et al is a real brain buster. As in it will break
             | your spirits and build you back up better. I remember in
             | college I was able to build a binary tree with 3 lines of
             | code. And once you write the insert, the delete, search,
             | and others just magically appear.
             | 
             | It also frames your thinking about defining what you want
             | rather than how to get it.
             | 
             | If you really want to see the power of these kinds of
             | languages look up Einstein's puzzle solved with prolog. The
             | solution just magically comes out by entering the
             | constraints of the puzzle.
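
To give a feel for "enter the constraints and the solution falls out", here is a tiny generate-and-test sketch in Python. It is not Prolog and does not use unification or backtracking; the three-house puzzle, its clues, and all names are invented for illustration and are far smaller than Einstein's puzzle.

```python
from itertools import permutations

PEOPLE = ("Alice", "Bob", "Carol")
PETS = ("cat", "dog", "fish")


def solve():
    """Return every assignment of people and pets to houses 0..2 that fits the clues."""
    solutions = []
    for people in permutations(PEOPLE):
        for pets in permutations(PETS):
            alice = people.index("Alice")
            bob = people.index("Bob")
            constraints = (
                alice == 0,                                   # Alice lives in the first house
                alice + 1 < 3 and pets[alice + 1] == "dog",   # the dog lives just right of Alice
                pets[bob] == "fish",                          # Bob owns the fish
            )
            if all(constraints):
                solutions.append(list(zip(people, pets)))
    return solutions


print(solve())
```

The code only states what must hold; the search over candidates is generic. A Prolog system does the search far more cleverly, but the division of labour is similar.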
        
               | rytis wrote:
               | I suppose something like this:
               | https://stackoverflow.com/a/8270393 ?
        
               | surgical_fire wrote:
               | I had to use Prolog in college, and while I never saw it
               | in the wild - I at least never stumbled upon a scenario
               | where prolog was the answer - I really enjoyed how I had
               | to change how I looked at a problem in order to solve it
               | in prolog.
        
             | burakemir wrote:
             | Here is a proof that you can translate non-recursive
             | datalog into relational algebra and vice versa:
             | https://github.com/google/mangle/blob/main/docs/spec_explain...
             | 
             | Since Logica is translated to SQL it should benefit from
             | all the query optimization goodness that went into the SQL
             | engine that runs the resulting queries.
             | 
             | I personally see the disadvantages of SQL in that it is not
             | really modular, you cannot have libraries, tests and such.
             | 
             | Disclosure: I wrote Mangle (the link goes to the Mangle
             | repo), another datalog, different way of extending, no SQL
             | translation but an engine library.
        
             | aseipp wrote:
             | > If I understand you correctly, you can easily get the
             | same with ephemeral models in dbt or CTEs generally?
             | 
             | You can bolt on any number of 3rd party features or
             | extensions to get some extra thing, that goes for any tool
             | in the world. The point of something like Datalog is that
             | it can express a similar class of relational programs that
             | SQL can, but with a smaller set of core ideas. "Do more
             | with less."
             | 
             | > I guess, but how can you guarantee correctness with
             | flexibility involved?
             | 
             | How do you guarantee the correctness of anything? How do
             | you know any SQL query you write is correct? Well, as the
             | author, you typically have a good idea. The point of being
             | compositional is that it's easier to stick together
             | arbitrary things defined in Datalog, and have the resulting
             | thing work smoothly.
             | 
             | Going back to the previous example, you can define any two
             | "tables" and then just derive a third "table" from these,
             | using language features that you already use -- to define
             | relationships between rows. Datalog can define relations
             | between rules (tables) and between facts (rows), all with a
             | single syntactic/semantic concept. While SQL can only by
             | default express relations between rows. Therefore, raw SQL
             | is kind of "the bottom half" of Datalog, and to get the
             | upper half you need features like CTEs, VIEWs, etc, and
             | apply them appropriately. You need more concepts to cover
             | both the bottom and top half; Datalog covers them with one
             | concept. Datalog also makes it easy to express things like
             | e.g. queries on graph structures, but again, you don't need
             | extra features like CTEs for this to happen.
             | 
             | There are of course lots of tricky bits (e.g. optimization)
             | but the general idea works very well.
             | 
             | > Could you share a more specific example or scenario where
             | you have seen Datalog/ Logica outperform a modern SQL
             | setup?
             | 
             | Again, Datalog is not about SQL. It's a logic programming
             | language. You need to actually spend time doing logic
             | programming with something like Prolog or Datalog to
             | appreciate the class of things it can do well. It just so
             | happens Datalog is also good for expressing relational
             | programs, which is what you do in SQL.
             | 
             | Most of the times I'm doing logic programming I'm actually
             | writing programs, not database queries. Trying to do things
             | like analyze programs to learn facts about them (Souffle
             | Datalog, "can this function ever call this other function
             | in any circumstance?") or something like a declarative
             | program as a decision procedure. For example, I have a
             | prototype Prolog program sitting around that scans a big
             | code repository, figures out all 3rd party dependencies and
             | their licenses, then tries to work out whether they are
             | compatible.
             | 
             | It's a bit like Lisp, in the sense that it's a core
             | formulation of a set of ideas that you aren't going to
             | magically adopt without doing it yourself a bunch. I could
             | show you a bunch of logic programs, but without experience
             | all the core ideas are going to be lost and the comparison
             | would be meaningless.
             | 
             | For the record, I don't use Logica with SQL, but not
             | because I wouldn't want to. It seems like a good approach.
             | I would use Datalog over SQL happily for my own projects if
             | I could. The reasons I don't use Logica for instance are
             | more technical than anything -- it is a Python library, and
             | I don't use Python.
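
The "can this function ever call this other function?" question is a transitive-closure query, which a recursive Datalog rule expresses directly. Below is a minimal sketch of naive fixpoint evaluation in Python, assuming a hypothetical call graph. Real engines such as Souffle use much more efficient strategies, so treat this purely as an illustration of the semantics.

```python
def reachable(calls):
    """Naive fixpoint for the two rules:
         CanCall(a, b) :- Calls(a, b).
         CanCall(a, c) :- CanCall(a, b), Calls(b, c).
    """
    can_call = set(calls)
    while True:
        # Apply the recursive rule once to everything derived so far.
        derived = {(a, c) for (a, b) in can_call for (b2, c) in calls if b == b2}
        if derived <= can_call:
            return can_call  # nothing new was derived: fixpoint reached
        can_call |= derived


# Hypothetical call graph: (caller, callee) pairs.
calls = {("main", "parse"), ("parse", "lex"), ("main", "report")}
closure = reachable(calls)

print(("main", "lex") in closure)
print(("lex", "main") in closure)
```

The loop always terminates on finite input because the set of possible pairs is finite and only ever grows.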
        
             | jyounker wrote:
             | The covid analysis seems like a pretty good example:
             | https://colab.research.google.com/github/EvgSkv/logica/blob/...
             | 
             | A good exercise might be converting it to the corresponding
             | SQL and comparing the two for clarity.
        
           | cess11 wrote:
           | Right, so that's what they claim, that you'll get small
           | reusable pieces.
           | 
           | But: "Logica compiles to SQL".
           | 
           | With the caveat that it only kind of does, since it seems
           | constrained to three database engines, probably the one they
           | optimise the output to perform well on, one where it usually
           | doesn't matter and one that's kind of mid performance wise
           | anyway.
           | 
           | In light of that quote it's also weird that they mention that
           | they are able to run the SQL they compiled to "in interactive
           | time" on a rather large dataset, which they supposedly
           | already could with SQL.
           | 
           | Arguably I'm not very good with Datalog and have mostly used
           | Prolog, but to me it doesn't look much like a Datalog.
           | Predicates seem to be variadic with named parameters, making
           | variables implicit at the call site so to understand a
           | complex predicate you need to hop away and look at how the
           | composite predicates are defined to understand what they
           | return. Maybe I misunderstand how it works, but at first
           | glance that doesn't look particularly attractive to me.
           | 
           | Can you put arithmetic in the head of clauses in Datalog
           | proper? As far as I can remember, that's not part of the
           | language. To me it isn't obvious what this is supposed to do
           | in this query language.
        
             | aseipp wrote:
             | For the record, I don't use Logica myself so I'm not
             | familiar with every design decision or feature -- I'm not a
             | Python programmer. I'm speaking about Datalog in general.
             | 
             | > making variables implicit at the call site
             | 
             | What example are you looking at? The NewsData example for
             | instance seems pretty understandable to me. It seems like
             | for any given predicate you can either take the implicit
             | name of the column or you can map it onto a different name
             | e.g. `date: date_num` for the underlying column on gdelt-
             | bq.gdeltv2.gkg.
             | 
             | Really it just seems like a way to make the grammar less
             | complicated; the `name: foo` syntax is their way of
             | expressing 'AS' clauses and `name:` is just a shorthand for
             | `name: name`
             | 
             | > In light of that quote it's also weird that they mention
             | that they are able to run the SQL they compiled to "in
             | interactive time" on a rather large dataset, which they
             | supposedly already could with SQL.
             | 
             | The query in question is run on BigQuery (which IIRC was
             | the original and only target database for Logica), and in
             | that setup you might do a query over 4TB of data but get a
             | response in milliseconds due to partitioning, column
             | compression, parallel aggregation, etc. This is actually
             | really common for many queries. So, in that kind of setup
             | the translation layer needs to be fast so it doesn't spoil
             | the benefit for the end user. I think the statement makes
             | complete sense, tbh. (This also probably explains why they
             | wrote it in Python, so you could use it in Jupyter
             | notebooks hooked up to BigQuery.)
        
           | joe_the_user wrote:
           | _It's very familiar to anyone who knows how to read it._
           | 
           | "Anyone who knows the system can easily learn it" he said with
           | a sniff.
           | 
           | Yes, the similarity to Prolog lets you draw on a vast pool of
           | Prolog programmers out there.
           | 
           | I mean, I studied a variety of esoteric languages in college
           | and they were interesting (I can't remember if we got to
           | Prolog tbh but I know first-order logic pretty well and that's
           | related). When I was thrown into a job with SQL, its English
           | language syntax made things really easy. I feel confident
           | that knowing SQL wouldn't conversely make learning Prolog
           | easy (I remember Scala later and not being able to deal with
           | its opaque verbosity easily).
           | 
           | Basically, SQL syntax makes easy things easy. This gets
           | underestimated a lot, indeed people seem to have contempt for
           | it. I think that's a serious mistake.
        
             | jyounker wrote:
             | > Basically, SQL syntax makes easy things easy. This gets
             | underestimated a lot, indeed people seem to have contempt
             | for it. I think that's a serious mistake.
             | 
             | The flip side of that is SQL makes hard things nearly
             | impossible.
             | 
             | SQL doesn't have facilities for abstraction, and it doesn't
             | compose, and this has consequences that I deal with daily.
             | 
             | The lack of abstract facilities makes it hard to construct
             | complicated queries, it makes it hard to debug them, and it
             | makes it hard to refactor them.
             | 
             | Instead of writing more complicated SQL queries, developers
             | lean on the host languages to coordinate SQL calls, using
             | the host language's abstraction facilities to cover for
             | SQL's inadequacies.
        
               | joe_the_user wrote:
               | _The flip side of that is SQL makes hard things nearly
               | impossible._
               | 
               | What about SQL _syntax_ makes the hard things impossible? I
               | get that the actual language SQL is broken in all sorts
               | of ways. But I don't see any reason to replace it with
               | something opaque from the get-go.
               | 
               | I mean, what stops you from defining, say, adjectives and
               | using those for rough modularity?
               | 
               | Say
               |   EXPENSIVE(T) means T.price > 0;
               |   SELECT name FROM books WHERE EXPENSIVE(books);
               | 
               | Seems understandable.
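
Something close to the `EXPENSIVE` idea can already be approximated in SQLite by registering a scalar function from the host language. The sketch below uses Python's standard `sqlite3` module. The `books` table and its rows are hypothetical, and the function takes the `price` column rather than a whole row, since SQLite scalar functions receive values, not table aliases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [("Cheap Thrills", 5.0), ("Rare Atlas", 120.0), ("Free Zine", 0.0)],
)


def expensive(price):
    # Same definition as in the comment: "expensive" means price > 0.
    return 1 if price is not None and price > 0 else 0


# Register the Python function as a one-argument SQL function named EXPENSIVE.
conn.create_function("EXPENSIVE", 1, expensive)

rows = conn.execute(
    "SELECT name FROM books WHERE EXPENSIVE(price) ORDER BY name"
).fetchall()
print(rows)
```

This also illustrates the earlier point in the thread: the abstraction lives in the host language, not in SQL itself.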
        
             | aseipp wrote:
             | I mean, yes, that's sort of how linguistics works in
             | general? You can't just look at a language with completely
             | different orthography or semantic concepts and expect to be
             | able to reliably map it onto your pre-existing language
             | with no effort. That's sort of the whole reason translation
             | is a generally difficult problem.
             | 
             | I don't really get this kind of complaint in general I'm
             | afraid. Many people can read and write, say, Hangul just
             | fine -- and at the same time we don't expect random English
             | speakers with no familiarity will be able to understand
             | Korean conversations, or any syllabic writing systems in
             | general. Programming language families/classes like logic
             | programming are really no different.
             | 
             | > its English language syntax made things really easy
             | 
             | That's just called "being familiar with English" more than
             | any inherent property of SQL or English.
        
         | jyounker wrote:
         | > No weird OOP-style class instantiation for something as
         | simple as just getting the name.
         | 
         | I understand the desire to not waste your time, but I think
         | you're missing the big idea. Those statements define logical
         | relations. There's nothing related to classes or OOP.
         | 
         | Using those building blocks you can do everything that you can
         | with SQL. No need for having clauses. No need for group by
         | clauses. No need for subquery clauses. No need for special join
         | syntax. Just what you see above.
         | 
         | And you can keep going with it. SQL quickly runs into the
         | limitations of the language. Using the syntax above (which is
         | basically Prolog) you can construct arbitrarily large software
         | systems which are still understandable.
         | 
         | If you're really interested in improving as a developer, then I
         | suggest that you spend a day or two playing with a logic
         | programming system of some sort. It's a completely different
         | way of thinking about programming, and it will give you mental
         | tools that you will never pick up any other way.
        
       | cess11 wrote:
       | If this is how you want to compile to SQL, why not invent your
       | own DCG with Prolog proper?
       | 
       | It should be easy enough if you're somewhat fluent in both
       | languages, and has the perk of not being some Python thing at a
       | megacorp famous for killing its projects.
        
       | taeric wrote:
       | I find the appeals to composition tough to agree with. For one,
       | most queries begin as ad hoc questions. And can usually be tossed
       | after. If they are needed for speed, it is the index structure
       | that is more vital than the query structure. That and knowing
       | what materialized views have been made with implications on
       | propagation delays.
       | 
       | Curious to hear battle stories from other teams using this.
        
       | Agraillo wrote:
       | I think it is a good direction. Already familiar with SQL, I
       | learned a little Prolog and the similarities struck me. I wasn't
       | the first, sure, and there are others who summarized it better
       | than me [1] (2010-2012):
       | 
       |  _Each can do the other, to a limited extent, but it becomes
       | increasingly difficult with even small increases in complexity.
       | For instance, you can do inferencing in SQL, but it is almost
       | entirely manual in nature and not at all like the automatic
       | forward-inferencing of Prolog. And yes, you can store data(facts)
       | in Prolog, but it is not at all designed for the "storage,
       | retrieval, projection and reduction of Trillions of rows with
       | thousands of simultaneous users" that SQL is._
       | 
       | I even wanted to implement something like Logica at the time,
       | primarily trying to build a bridge through a virtual table in
       | SQLite that would allow storing rules as mostly Prolog statements
       | and having adapters to SQL storage when inference needs facts.
       | 
       | [1]: https://stackoverflow.com/a/2119003
        
       | foobarqux wrote:
       | There don't seem to be any examples of how to connect to an
       | existing (say sqlite) database even though it says you should try
       | logica if "you already have data in BigQuery, PostgreSQL or
       | SQLite,". How do you connect to an existing sqlite database?
        
       | avodonosov wrote:
       | > Composite(a * b) distinct :- ...
       | 
       | Wait, does Logica factorize the number passed to this predicate
       | when unifying the number with a * b?
       | 
       | So when we call Composite(100) it automatically tries all a's
       | and b's that give 100 when multiplied?
       | 
       | I'd be curious to see the SQL it transpiles to.
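
One plausible reading, offered as an assumption and not confirmed anywhere in this thread: an expression in the rule head is computed forward from whatever `a` and `b` the body produces, and `distinct` deduplicates the results, so no factorization is involved. The body of the rule is elided above, so the sketch below uses a hypothetical finite `candidates` domain in its place.

```python
def composite(candidates):
    # Forward evaluation: each (a, b) pair the body yields produces one head value a * b.
    # Building a set plays the role of `distinct`.
    return {a * b for a in candidates for b in candidates if a > 1 and b > 1}


candidates = range(2, 11)  # hypothetical stand-in for the elided rule body
products = composite(candidates)

# Asking "is 100 composite?" is then a membership lookup in the derived relation.
print(100 in products)
print(7 in products)
```

Under this reading the generated SQL would presumably be shaped like a `SELECT DISTINCT` over the product of two columns, but that is a guess rather than something checked against Logica's output.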
        
       ___________________________________________________________________
       (page generated 2024-11-16 23:00 UTC)