[HN Gopher] Logica - Declarative logic programming language for ...
___________________________________________________________________
Logica - Declarative logic programming language for data
Author : voat
Score : 68 points
Date : 2024-11-16 19:09 UTC (3 hours ago)
(HTM) web link (logica.dev)
(TXT) w3m dump (logica.dev)
| dang wrote:
| Related:
|
| _Google is pushing the new language Logica to solve the major
| flaws in SQL_ - https://news.ycombinator.com/item?id=29715957 -
| Dec 2021 (1 comment)
|
| _Logica, a novel open-source logic programming language_ -
| https://news.ycombinator.com/item?id=26805121 - April 2021 (98
| comments)
| Y_Y wrote:
| If, like me, your first reaction is that this looks suspiciously
| like Datalog then you may be interested to learn that they indeed
| consider Logica to be "in the Datalog family".
| jp57 wrote:
| I think Datalog should be thought of as "in the logic
| programming family", so other data languages based on logic
| programming are likely to be similar.
|
| And, of course the relational model of data is based on first-
| order logic, so one could say that SQL is a declarative logic
| programming language for data.
| riku_iki wrote:
| Only one active committer on github..
| thenaturalist wrote:
| I don't want to come off as overconfident, but I would be very
| hard pressed to see the value of this.
|
| At face value, I shudder at the syntax.
|
| Example from their tutorial:
|
| EmployeeName(name:) :- Employee(name:);
|
| Engineer(name:) :- Employee(name:, role: "Engineer");
|
| EngineersAndProductManagers(name:) :- Employee(name:, role:),
| role == "Engineer" || role == "Product Manager";
|
| vs. the equivalent SQL:
|
| SELECT Employee.name AS name
|
| FROM t_0_Employee AS Employee
|
| WHERE (Employee.role = "Engineer" OR Employee.role = "Product Manager");
|
| SQL is much more concise, extremely easy to follow.
|
| No weird OOP-style class instantiation for something as simple as
| just getting the name.
|
| As already noted in the 2021 discussion, what's actually the
| killer though is adoption and, three years later, ecosystem.
|
| SQL for analytics has come an extremely long way with the
| ecosystem that was ignited by dbt.
|
| There is so much better tooling today when it comes to testing,
| modelling, running in memory with tools like DuckDB or Ibis,
| Apache Iceberg.
|
| There is value to abstracting on top of SQL, but it does very
| much seem to me like this is not it.
| Tomte wrote:
| The syntax is Prolog-like, so people in the field are familiar
| with it.
| thenaturalist wrote:
| Which field would that be?
|
| I.e. I understand now that it's seemingly about more than
| simple querying, so, coming very much from an analytics/data
| crunching background, I am wondering what a use case would
| look like where this is arguably superior to SQL.
| tannhaeuser wrote:
| > _Which field would that be?_
|
| Database theory papers and books have used Prolog/Datalog-
| like syntax throughout the years, such as those by Serge
| Abiteboul, just to give a single example of a researcher
| and prolific author over the decades.
| aseipp wrote:
| Logica is in the Datalog/Prolog/Logic family of programming
| languages. It's very familiar to anyone who knows how to read
| it. None of this has anything to do with OOP at all and you
| will heavily mislead yourself if you try to map any of that
| thinking onto it. (Beyond that, and not specific to Logica or
| SQL in any way -- comparing two 3-line programs to draw
| conclusions is effectively meaningless. You have to actually
| write programs bigger than that to see the whole picture.)
|
| Datalog is not really a query language, actually. But it is
| relational, like SQL, so it lets you express relations between
| "facts" (the rows) inside tables. But it is more general,
| because it also lets you express relations between _tables
| themselves_ (e.g. this "table" is built from the relationship
| between two smaller tables), and it does so without requiring
| extra special case semantics like VIEWs.
|
| Because of this, it's easy to write small fragments of Datalog
| programs, and then stick them together with other fragments,
| without a lot of planning ahead of time, meaning as a language
| it is very compositional. This is one of the primary reasons
| why many people are interested in it as a SQL alternative;
| aside from your typical weird SQL quirks that are avoided with
| better language design (which are annoying, but not really the
| big picture.)
| thenaturalist wrote:
| > but it is more general, because it also lets you express
| relations between tables themselves (e.g. this "table" is
| built from the relationship between two smaller tables), and
| it does so without requiring extra special case semantics
| like VIEWs.
|
| If I understand you correctly, you can easily get the same
| with ephemeral models in dbt or CTEs generally?
|
| > Because of this, it's easy to write small fragments of
| Datalog programs, and then stick them together with other
| fragments, without a lot of planning ahead of time, meaning
| as a language it is very compositional.
|
| This can be a benefit in some cases, I guess, but how can you
| guarantee correctness with flexibility involved?
|
| With SQL, I get either table or column level lineage with all
| modern tools, can audit each upstream output before going
| into a downstream input. In dbt I have macros which I can
| reuse everywhere.
|
| It's very compositional while at the same time perfectly
| documented and testable at runtime.
|
| Could you share a more specific example or scenario where you
| have seen Datalog/Logica outperform a modern SQL setup?
|
| Genuinely curious.
|
| I am not at all familiar with the Logica/Datalog/Prolog
| world.
| from-nibly wrote:
| Prolog et al is a real brain buster. As in it will break
| your spirits and build you back up better. I remember in
| college I was able to build a binary tree with 3 lines of
| code. And once you write the insert, the delete, search,
| and others just magically appear.
|
| It also frames your thinking about defining what you want
| rather than how to get it.
|
| If you really want to see the power of these kinds of
| languages look up Einstein's puzzle solved with prolog. The
| solution just magically comes out by entering the
| constraints of the puzzle.
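|
| A rough sketch of the "enter the constraints and the solution falls
| out" idea, written in Python rather than Prolog. The puzzle here is a
| made-up three-house miniature, not Einstein's puzzle, and the
| brute-force generate-and-test loop only stands in for what a Prolog
| engine does with unification and backtracking.
|
```python
from itertools import permutations

# Hypothetical mini puzzle (invented for illustration):
#   - Alice, Bob and Carol each live in one of houses 1, 2, 3.
#   - Alice does not live in house 1.
#   - Bob lives in the house immediately after Carol's.
PEOPLE = ("Alice", "Bob", "Carol")


def solve():
    """Return every assignment of people to houses that meets all constraints."""
    solutions = []
    for houses in permutations((1, 2, 3)):
        where = dict(zip(PEOPLE, houses))
        # Each constraint is stated declaratively; order does not matter.
        if where["Alice"] == 1:
            continue
        if where["Bob"] != where["Carol"] + 1:
            continue
        solutions.append(where)
    return solutions


solutions = solve()
print(solutions)
```
|
| Only the constraints describe the puzzle; nothing in the code says
| how to reach the answer.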
| rytis wrote:
| I suppose something like this:
| https://stackoverflow.com/a/8270393 ?
| surgical_fire wrote:
| I had to use Prolog in college, and while I never saw it
| in the wild - I at least never stumbled upon a scenario
| where prolog was the answer - I really enjoyed how I had
| to change how I looked at a problem in order to solve it
| in prolog.
| burakemir wrote:
| Here is a proof that you can translate non-recursive
| datalog into relational algebra and vice versa:
| https://github.com/google/mangle/blob/main/docs/spec_explain...
|
| Since Logica is translated to SQL it should benefit from
| all the query optimization goodness that went into the SQL
| engine that runs the resulting queries.
|
| Personally, I see the disadvantage of SQL as its lack of
| modularity: you cannot have libraries, tests and such.
|
| Disclosure: I wrote Mangle (the link goes to the Mangle
| repo), another datalog, different way of extending, no SQL
| translation but an engine library.
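|
| To make the Datalog-to-relational-algebra correspondence concrete,
| here is a minimal Python sketch, not taken from Mangle or Logica. It
| reads a non-recursive rule as selection, projection and natural join
| over lists of dicts. The rows, the Office relation and the
| EngineerCity rule are hypothetical; only the Employee(name, role)
| shape comes from the tutorial example quoted earlier in the thread.
|
```python
# Hypothetical rows; only the (name, role) shape comes from the thread.
employee = [
    {"name": "Alice", "role": "Engineer"},
    {"name": "Bob", "role": "Product Manager"},
    {"name": "Carol", "role": "Designer"},
]
# Hypothetical second relation, invented to show a join.
office = [
    {"name": "Alice", "city": "Zurich"},
    {"name": "Bob", "city": "Seattle"},
]


def select(rows, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, columns):
    """Projection with set semantics: keep named columns, drop duplicates."""
    result = []
    for row in rows:
        narrowed = {c: row[c] for c in columns}
        if narrowed not in result:
            result.append(narrowed)
    return result


def natural_join(left, right):
    """Natural join: merge row pairs that agree on every shared column."""
    joined = []
    for l_row in left:
        for r_row in right:
            shared = l_row.keys() & r_row.keys()
            if all(l_row[c] == r_row[c] for c in shared):
                joined.append({**l_row, **r_row})
    return joined


# Engineer(name:) :- Employee(name:, role: "Engineer");
engineer = project(select(employee, lambda r: r["role"] == "Engineer"), ["name"])

# Hypothetical rule: EngineerCity(name:, city:) :- Engineer(name:), Office(name:, city:);
engineer_city = natural_join(engineer, office)

print(engineer)
print(engineer_city)
```
|
| The derived relation `engineer` is used as an input to the next rule
| exactly like a stored table, which is the composition property
| discussed above.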
| aseipp wrote:
| > If I understand you correctly, you can easily get the
| same with ephemeral models in dbt or CTEs generally?
|
| You can bolt on any number of 3rd party features or
| extensions to get some extra thing, that goes for any tool
| in the world. The point of something like Datalog is that
| it can express a similar class of relational programs that
| SQL can, but with a smaller set of core ideas. "Do more
| with less."
|
| > I guess, but how can you guarantee correctness with
| flexibility involved?
|
| How do you guarantee the correctness of anything? How do
| you know any SQL query you write is correct? Well, as the
| author, you typically have a good idea. The point of being
| compositional is that it's easier to stick together
| arbitrary things defined in Datalog, and have the resulting
| thing work smoothly.
|
| Going back to the previous example, you can define any two
| "tables" and then just derive a third "table" from these,
| using language features that you already use -- to define
| relationships between rows. Datalog can define relations
| between rules (tables) and between facts (rows), all with a
| single syntactic/semantic concept, while SQL by default can
| only express relations between rows. Therefore, raw SQL
| is kind of "the bottom half" of Datalog, and to get the
| upper half you need features like CTEs, VIEWs, etc, and
| apply them appropriately. You need more concepts to cover
| both the bottom and top half; Datalog covers them with one
| concept. Datalog also makes it easy to express things like
| e.g. queries on graph structures, but again, you don't need
| extra features like CTEs for this to happen.
|
| There are of course lots of tricky bits (e.g. optimization)
| but the general idea works very well.
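|
| As an illustration of the graph-query point, here is a small Python
| sketch of naive bottom-up evaluation for a recursive reachability
| rule. The edge data is made up, the rule text in the docstring is
| generic Datalog-style notation rather than exact Logica syntax, and
| real engines use smarter strategies (for example semi-naive
| evaluation) than this loop.
|
```python
# Hypothetical edge relation; the data is invented for illustration.
edges = {("a", "b"), ("b", "c"), ("c", "d")}


def transitive_closure(edge_set):
    """Naive bottom-up evaluation of the two rules

        Reach(x, y) :- Edge(x, y).
        Reach(x, z) :- Reach(x, y), Edge(y, z).

    Rules are re-applied until no new facts are derived (a fixpoint).
    """
    reach = set(edge_set)  # first rule: every edge is a reachability fact
    while True:
        # Second rule: join Reach with Edge on the shared middle node.
        derived = {
            (x, z)
            for (x, y) in reach
            for (y2, z) in edge_set
            if y == y2
        }
        new_facts = derived - reach
        if not new_facts:
            return reach
        reach |= new_facts


reach = transitive_closure(edges)
print(sorted(reach))
```
|
| The recursive rule is written with the same join-like body as any
| other rule; the fixpoint loop belongs to the engine, not the query.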
|
| > Could you share a more specific example or scenario where
| you have seen Datalog/Logica outperform a modern SQL
| setup?
|
| Again, Datalog is not about SQL. It's a logic programming
| language. You need to actually spend time doing logic
| programming with something like Prolog or Datalog to
| appreciate the class of things it can do well. It just so
| happens Datalog is also good for expressing relational
| programs, which is what you do in SQL.
|
| Most of the time I'm doing logic programming I'm actually
| writing programs, not database queries. Trying to do things
| like analyze programs to learn facts about them (Souffle
| Datalog, "can this function ever call this other function
| in any circumstance?") or something like a declarative
| program as a decision procedure. For example, I have a
| prototype Prolog program sitting around that scans a big
| code repository, figures out all 3rd party dependencies and
| their licenses, then tries to work out whether they are
| compatible.
|
| It's a bit like Lisp, in the sense that it's a core
| formulation of a set of ideas that you aren't going to
| magically adopt without doing it yourself a bunch. I could
| show you a bunch of logic programs, but without experience
| all the core ideas are going to be lost and the comparison
| would be meaningless.
|
| For the record, I don't use Logica with SQL, but not
| because I wouldn't want to. It seems like a good approach.
| I would use Datalog over SQL happily for my own projects if
| I could. The reasons I don't use Logica for instance are
| more technical than anything -- it is a Python library, and
| I don't use Python.
| jyounker wrote:
| The covid analysis seems like a pretty good example:
| https://colab.research.google.com/github/EvgSkv/logica/blob/...
|
| A good exercise might be converting it to the corresponding
| SQL and comparing the two for clarity.
| cess11 wrote:
| Right, so that's what they claim, that you'll get small
| reusable pieces.
|
| But: "Logica compiles to SQL".
|
| With the caveat that it only kind of does, since it seems
| constrained to three database engines, probably the one they
| optimise the output to perform well on, one where it usually
| doesn't matter and one that's kind of mid performance wise
| anyway.
|
| In light of that quote it's also weird that they mention that
| they are able to run the SQL they compiled to "in interactive
| time" on a rather large dataset, which they supposedly
| already could with SQL.
|
| Arguably I'm not very good with Datalog and have mostly used
| Prolog, but to me it doesn't look much like a Datalog.
| Predicates seem to be variadic with named parameters, making
| variables implicit at the call site so to understand a
| complex predicate you need to hop away and look at how the
| composite predicates are defined to understand what they
| return. Maybe I misunderstand how it works, but at first
| glance that doesn't look particularly attractive to me.
|
| Can you put arithmetic in the head of clauses in Datalog
| proper? As far as I can remember, that's not part of the
| language. To me it isn't obvious what this is supposed to do
| in this query language.
| aseipp wrote:
| For the record, I don't use Logica myself so I'm not
| familiar with every design decision or feature -- I'm not a
| Python programmer. I'm speaking about Datalog in general.
|
| > making variables implicit at the call site
|
| What example are you looking at? The NewsData example for
| instance seems pretty understandable to me. It seems like
| for any given predicate you can either take the implicit
| name of the column or you can map it onto a different name
| e.g. `date: date_num` for the underlying column on
| gdelt-bq.gdeltv2.gkg.
|
| Really it just seems like a way to make the grammar less
| complicated; the `name: foo` syntax is their way of
| expressing 'AS' clauses and `name:` is just a shorthand for
| `name: name`
|
| > In light of that quote it's also weird that they mention
| that they are able to run the SQL they compiled to "in
| interactive time" on a rather large dataset, which they
| supposedly already could with SQL.
|
| The query in question is run on BigQuery (which IIRC was
| the original and only target database for Logica), and in
| that setup you might do a query over 4TB of data but get a
| response in milliseconds due to partitioning, column
| compression, parallel aggregation, etc. This is actually
| really common for many queries. So, in that kind of setup
| the translation layer needs to be fast so it doesn't spoil
| the benefit for the end user. I think the statement makes
| complete sense, tbh. (This also probably explains why they
| wrote it in Python, so you could use it in Jupyter
| notebooks hooked up to BigQuery.)
| joe_the_user wrote:
| _It's very familiar to anyone who knows how to read it._
|
| "Anyone who knows the system can easily learn it" he said with
| a sniff.
|
| Yes, the similarity to Prolog lets you draw on a vast pool of
| Prolog programmers out there.
|
| I mean, I studied a variety of esoteric languages in college
| and they were interesting (I can't remember if we got to
| Prolog tbh but I know first-order logic pretty well and that's
| related). When I was thrown into a job with SQL, its English
| language syntax made things really easy. I feel confident
| that knowing SQL wouldn't conversely make learning Prolog
| easy (I remember Scala later and not being able to deal with
| its opaque verbosity easily).
|
| Basically, SQL syntax makes easy things easy. This gets
| underestimated a lot, indeed people seem to have contempt for
| it. I think that's a serious mistake.
| jyounker wrote:
| > Basically, SQL syntax makes easy things easy. This gets
| underestimated a lot, indeed people seem to have contempt
| for it. I think that's a serious mistake.
|
| The flip side of that is SQL makes hard things nearly
| impossible.
|
| SQL doesn't have facilities for abstraction, and it doesn't
| compose, and this has consequences that I deal with daily.
|
| The lack of abstraction facilities makes it hard to construct
| complicated queries, it makes it hard to debug them, and it
| makes it hard to refactor them.
|
| Instead of writing more complicated SQL queries, developers
| lean on the host languages to coordinate SQL calls, using
| the host language's abstraction facilities to cover for
| SQL's inadequacies.
| joe_the_user wrote:
| _The flip side of that is SQL makes hard things nearly
| impossible._
|
| What about SQL _syntax_ makes the hard things impossible? I
| get that the actual language SQL is broken in all sorts
| of ways. But I don't see any reason to replace it with
| something opaque from the get-go.
|
| I mean, what stops you from defining, say, adjectives and
| using those for rough modularity?
|
| Say EXPENSIVE(T) means T.price > 0;
| Select name FROM books WHERE EXPENSIVE(books);
|
| Seems understandable.
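|
| One way to approximate the EXPENSIVE "adjective" with existing tools
| is a user-defined SQL function. The sketch below uses Python's
| standard sqlite3 module and its create_function call. The books
| table and its rows are hypothetical, and unlike the pseudo-syntax
| above the function receives a column value (price) rather than the
| whole table.
|
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (name TEXT, price REAL)")
# Hypothetical rows, invented for illustration.
conn.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [("Free Pamphlet", 0.0), ("Hardcover", 35.0)],
)


def expensive(price):
    """Mirror the thread's definition: a row is EXPENSIVE when price > 0."""
    return int(price is not None and price > 0)


# Register the Python function as a one-argument SQL function.
conn.create_function("EXPENSIVE", 1, expensive)

rows = conn.execute(
    "SELECT name FROM books WHERE EXPENSIVE(price) ORDER BY name"
).fetchall()
names = [row[0] for row in rows]
print(names)
```
|
| This gives a named, reusable predicate, but it lives in the host
| language rather than in SQL itself, which is the trade-off raised
| earlier in the thread.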
| aseipp wrote:
| I mean, yes, that's sort of how linguistics works in
| general? You can't just look at a language with completely
| different orthography or semantic concepts and expect to be
| able to reliably map it onto your pre-existing language
| with no effort. That's sort of the whole reason translation
| is a generally difficult problem.
|
| I don't really get this kind of complaint in general I'm
| afraid. Many people can read and write, say, Hangul just
| fine -- and at the same time we don't expect random English
| speakers with no familiarity will be able to understand
| Korean conversations, or any syllabic writing systems in
| general. Programming language families/classes like logic
| programming are really no different.
|
| > its English language syntax made things really easy
|
| That's just called "being familiar with English" more than
| any inherent property of SQL or English.
| jyounker wrote:
| > No weird OOP-style class instantiation for something as
| simple as just getting the name.
|
| I understand the desire to not waste your time, but I think
| you're missing the big idea. Those statements define logical
| relations. There's nothing related to classes or OOP.
|
| Using those building blocks you can do everything that you can
| with SQL. No need for having clauses. No need for group by
| clauses. No need for subquery clauses. No need for special join
| syntax. Just what you see above.
|
| And you can keep going with it. SQL quickly runs into the
| limitations of the language. Using the syntax above (which is
| basically Prolog) you can construct arbitrarily large software
| systems which are still understandable.
|
| If you're really interested in improving as a developer, then I
| suggest that you spend a day or two playing with a logic
| programming system of some sort. It's a completely different
| way of thinking about programming, and it will give you mental
| tools that you will never pick up any other way.
| cess11 wrote:
| If this is how you want to compile to SQL, why not invent your
| own DCG with Prolog proper?
|
| It should be easy enough if you're somewhat fluent in both
| languages, and has the perk of not being some Python thing at a
| megacorp famous for killing its projects.
| taeric wrote:
| I find the appeals to composition tough to agree with. For one,
| most queries begin as ad hoc questions. And can usually be tossed
| after. If they are needed for speed, it is the index structure
| that is more vital than the query structure. That and knowing
| what materialized views have been made with implications on
| propagation delays.
|
| Curious to hear battle stories from other teams using this.
| Agraillo wrote:
| I think it is a good direction. Already being familiar with SQL,
| I learned a little Prolog and the similarities struck me. I wasn't
| the first to notice, for sure, and there are others who summarized
| it better than me [1] (2010-2012):
|
| _Each can do the other, to a limited extent, but it becomes
| increasingly difficult with even small increases in complexity.
| For instance, you can do inferencing in SQL, but it is almost
| entirely manual in nature and not at all like the automatic
| forward-inferencing of Prolog. And yes, you can store data(facts)
| in Prolog, but it is not at all designed for the "storage,
| retrieval, projection and reduction of Trillions of rows with
| thousands of simultaneous users" that SQL is._
|
| I even wanted to implement something like Logica at the moment,
| primarily trying to build a bridge through a virtual table in
| SQLite that would allow storing rules as mostly Prolog statements
| and having adapters to SQL storage when inference needs facts.
|
| [1]: https://stackoverflow.com/a/2119003
| foobarqux wrote:
| There don't seem to be any examples of how to connect to an
| existing (say sqlite) database even though it says you should try
| logica if "you already have data in BigQuery, PostgreSQL or
| SQLite,". How do you connect to an existing sqlite database?
| avodonosov wrote:
| > Composite(a * b) distinct :- ...
|
| Wait, does Logica factorize the number passed to this predicate
| when unifying the number with a * b?
|
| So when we call Composite(100), does it automatically try all a's
| and b's that give 100 when multiplied?
|
| I'd be curious to see the SQL it transpiles to.
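|
| One plausible reading, sketched in Python. This is an assumption, not
| confirmed by the thread: the body elided as "..." is taken to bind a
| and b from a finite relation of numbers. Under that assumption the
| expression in the head is computed forward from the bindings and
| `distinct` removes duplicates, so asking about 100 is a lookup in the
| derived set rather than a factorization.
|
```python
def composites(numbers):
    """Forward-compute a * b for every pair a, b > 1 drawn from `numbers`.

    The set comprehension plays the role of `distinct`: each product
    appears once no matter how many (a, b) pairs produce it.
    """
    nums = set(numbers)
    return {a * b for a in nums for b in nums if a > 1 and b > 1}


# Hypothetical finite Number relation: the integers 1 through 10.
composite = composites(range(1, 11))

# "Calling" Composite(100) is then a membership test on the derived set.
print(100 in composite)
print(7 in composite)
```
|
| Nothing is factorized: 100 is found only because some pair in the
| assumed input relation (here 10 and 10) multiplies to it.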
___________________________________________________________________
(page generated 2024-11-16 23:00 UTC)