[HN Gopher] Transpile Any SQL to PostgreSQL Dialect
       ___________________________________________________________________
        
       Transpile Any SQL to PostgreSQL Dialect
        
       Author : fljdin
       Score  : 152 points
       Date   : 2024-03-18 09:46 UTC (13 hours ago)
        
 (HTM) web link (gitlab.com)
 (TXT) w3m dump (gitlab.com)
        
       | fljdin wrote:
       | A CLI tool to transpile SQL snippet from a dialect to another
       | using an AST
        
       | wendyshu wrote:
       | How is it different from sqlglot?
        
       | souenzzo wrote:
       | https://www.jooq.org/translate/
       | 
       | JOOQ also do that, but it is in JAVA.
        
         | fljdin wrote:
         | The `transqlate` project aims to extend this kind of tool
         | without limitation and under open-source license.
         | 
         | We're studying JOOQ from afar, with the ambition of getting
         | closer to its functional coverage.
        
       | ThinkBeat wrote:
       | I see that the support for Oracle is mentioned and clearly
       | present in the source code, is that also at present the only
       | supported database? Any hope for transactSql (Microsoft Sql.
       | Server)
        
         | hans_castorp wrote:
         | From the "Roadmap" section of the README:
         | 
         | > more source dialects : mysql, mssql, sybase, etc.
         | 
         | https://gitlab.com/dalibo/transqlate#roadmap
        
         | staticlibs wrote:
         | Transact-SQL support (both queries and stored procedures) for
         | PostgreSQL is implemented in Babelfish extensions [1]. It is
         | implemented from the ground up as a "pl/tsql" language for
         | Postgres [2], without using SQL-to-SQL transpilation like JOOQ.
         | 
         | [1] https://babelfishpg.org/ [2] https://github.com/babelfish-
         | for-postgresql/babelfish_extens...
        
       | pentaphobe wrote:
       | `transqlate` is such a good name - I am very jealous, well done!
        
       | cipherself wrote:
       | Interesting, a while back at $EMPLOYER while working in a PoC, I
       | found myself in a situation where I need to take user-provided
       | Postgres SQL queries and run them against one of our APIs.
       | Roughly, I converted the API response to a `pandas dataframe`, I
       | then parsed and transformed the query from the postgres dialect
       | to the duckdb dialect using `sqlglot` and used `duckdb` to query
       | the `pandas dataframe`, coverted it into `json` and returned it
       | to the user.
        
         | neeleshs wrote:
         | That's interesting. I'm looking to do something similar, but
         | need wire compatibility with postgresql. This allows any
         | postgres client to talk to our service. Didn't have a lot of
         | luck finding a good "middleware"
        
           | hantusk wrote:
           | you could consider hosting an empty postgresql database,
           | compile your code as a postgresql foreign data wrapper and
           | expose it as a view. Nothing is more compatible with the
           | postgres wire protocol than postgresql itself ;)
           | 
           | turbot compiles their steampipe plugins in this way. Example:
           | https://github.com/turbot/steampipe-plugin-net
        
             | neeleshs wrote:
             | That's a good idea and we considered FDW (for this and more
             | stuff), but having a middleware makes it more flexible -
             | FDW has limitations around pushdown with subselects, we are
             | still constrained to a single postgresql instance for
             | execution, when in theory we could parallelize (certain)
             | queries across nodes.
        
           | cipherself wrote:
           | I think this is probably the missing piece for you?
           | https://github.com/jwills/buenavista
        
       | pmarreck wrote:
       | This is interesting.
       | 
       | Is there any standardized AST for SQL?
        
         | jerf wrote:
         | The SQL standard nominally defines one (I don't know if it
         | hands it to you on a silver platter as some sort of grammar but
         | it certainly defines one one way or another), but if you just
         | implement that and then send your code out into the world,
         | you're going to be disappointed with the results. All the
         | database have deviations and extensions of their own and people
         | use them frequently.
        
       | amw-zero wrote:
       | This in combination with
       | [pg_query](https://github.com/pganalyze/libpg_query) could allow
       | for writing generic static analyzers.
        
       | liotier wrote:
       | I hope for the day such tool will convert Oracle PL/SQL to
       | Postgres
        
         | fljdin wrote:
         | This is one of the major challenges the tool aims to meet! We
         | hope to achieve reliable results this year.
         | 
         | https://gitlab.com/dalibo/transqlate/-/issues/20
        
       | teslatom wrote:
       | I've been looking for a library that can convert query plans, not
       | just ASTs, between Postgres and other query engines. The reason
       | this is useful is because RLS (row level security) settings are
       | injected into the query plan after the AST is constructed, so the
       | same query string can produce different query plans depending on
       | the security settings of the database.
       | 
       | Substrait comes close, but I haven't found a Postgres Substrait
       | producer. Are there any projects working on this?
        
       | quantxx wrote:
       | Interesting
        
       | LeftHandPath wrote:
       | I wondered why this sort of thing didn't already exist. I had to
       | develop for IBM Db2 (for i Series) on a Linux environment (using
       | MariaDB) due to the licensing that stopped us from having a test
       | env.
       | 
       | I wound up writing a very ad-hoc C++ program that would parse a
       | base SQL file and generate the appropriate DB2 and MariaDB
       | dialect versions. Not flexible enough for reuse but it got the
       | job done.
        
         | Symbiote wrote:
         | I think Apache Calcite can do this, or something similar.
         | 
         | https://calcite.apache.org/
        
       | jakozaur wrote:
       | I see a lot of these types of tools, each with some essential
       | support, but usually they don't work. I would appreciate what is
       | support what is not.
       | 
       | Tried this tool: go run gitlab.com/dalibo/transqlate@v0.1-beta.2
       | 
       | select TRUNC(SYSDATE, 'month') from dual
       | 
       | -- TRANSLATION ERROR at +1:8: unsupported date format: "month"
       | rule="replace trunc() by date_trunc()" select DATE_TRUNC('month',
       | LOCALTIMESTAMP)exit status 1
       | 
       | Also, at least, it would set the latest git tag, currently
       | command from Readme.md doesn't work:
       | 
       | go run gitlab.com/dalibo/transqlate@latest
       | 
       | go: gitlab.com/dalibo/transqlate@latest: module
       | gitlab.com/dalibo/transqlate@latest found (v0.0.0), but does not
       | contain package gitlab.com/dalibo/transqlate
       | 
       | latest should point to v0.1-beta.2
        
         | fljdin wrote:
         | Thank you for your feedback, both issues have been added to the
         | project tracking.
        
         | zeroxfe wrote:
         | I asked Google Gemini to do this, and it gave me a seemingly
         | convincing response. Anyone know if it's right?
         | 
         | Prompt:                 translate the SQL statement "select
         | TRUNC(SYSDATE, 'month') from dual" to a PostgreSQL dialect
         | 
         | Response (formatting removed):                 Here is the
         | equivalent SQL statement in PostgreSQL for "select
         | TRUNC(SYSDATE, 'month') from dual" in Oracle:
         | SELECT DATE_TRUNC('month', CURRENT_DATE) AT TIME ZONE 'UTC'
         | Use code with caution.            In PostgreSQL:
         | DATE_TRUNC is the function used to truncate a date value to a
         | specified unit.              CURRENT_DATE is a function that
         | returns the current date.              AT TIME ZONE 'UTC' is
         | used to specify that the truncation should be done in
         | Coordinated Universal Time (UTC). This is optional, but it's a
         | good practice to be explicit about the time zone to avoid any
         | ambiguity.            The dual table is a dummy table specific
         | to Oracle that is used in simple SQL statements.
         | PostgreSQL doesn't require a table reference in this case.
        
           | peter_l_downs wrote:
           | I view a comment like yours as essentially spam and of
           | negative value. I'm not sure if we (vaguely, hn forum
           | community) have agreed on the etiquette here, but I have to
           | ask why you would bother sharing an AI-generated response if
           | you're not sure if it's correct. What are you getting at? How
           | could this be helpful?
        
             | saurik wrote:
             | I am often picky about my/our time being wasted by low-
             | effort comments but in this case I actually was myself
             | coming into this thread assuming the actual tool being
             | discussed was itself going to be an LLM-based solution;
             | and, when it wasn't, all I was thinking was "aren't we in
             | the future? isn't this just something people are going to
             | ask an AI to handle now?".
             | 
             | Hell: a friend of mine claims he asked Claude (using some
             | plug-in in his IDE) to port his app from Firebase all the
             | way to PostgreSQL and it just did it... and it mostly
             | worked well enough to save him a ton of time typing and
             | looking up syntax for stuff, and I think it is now in
             | production?
             | 
             | I do agree, though--and maybe this was your core point--
             | that maybe this "let's ask Gemini" comment should be left
             | by someone who would also know if the answer was correct
             | rather than just saying "no clue but". Although, honestly,
             | even having the output seems useful here (and again: I am
             | someone constantly unhappy about people asking questions
             | they can Google and other such filler content).
        
               | irq-1 wrote:
               | I agree: the comment was on-topic, included technical
               | information (dual table?), and AI accuracy/usefulness is
               | of interest to most readers. Maybe in a year I'll think
               | such a comment is bad, but not today.
        
             | zeroxfe wrote:
             | Apologies, I was trying to imply that LLMs may be more
             | practical to solve these kinds of problems than hand-built
             | translators (assuming the primary purpose is for engineers
             | using this during development.)
        
             | parhamn wrote:
             | I think it's really reasonable. When I was going through
             | the readme I was (half heartedly) interested in seeing an
             | OPENAI_KEY config variable in the readme.
        
           | irq-1 wrote:
           | Looks like `AT TIME ZONE 'UTC'` should not be used, but I
           | don't think it's illegal syntax.
           | 
           | https://www.postgresql.org/docs/current/datatype-
           | datetime.ht...
           | 
           | > PostgreSQL endeavors to be compatible with the SQL standard
           | definitions for typical usage. However, the SQL standard has
           | an odd mix of date and time types and capabilities...
           | 
           | > Although _the date type cannot have an associated time
           | zone_ , the time type can.
           | 
           | (emphasis mine)
           | 
           | > To address these difficulties, we recommend using date/time
           | types that contain both date and time when using time zones.
        
       | peter_d_sherman wrote:
       | Related:
       | 
       | PRQL (higher-level abstraction language for SQL)
       | 
       | https://prql-lang.org/
       | 
       | HN Article, PRQL:
       | 
       | https://news.ycombinator.com/item?id=36866861
       | 
       | Google search, site: GitHub, "SQL Transpiler":
       | 
       | https://www.google.com/search?q=site%3Agithub.com+%22SQL+Tra...
       | 
       | GitHub general list of open-source Transpilers:
       | 
       | https://github.com/topics/transpiler
        
       | fuzzieozzie wrote:
       | Many tools try to achieve similar eg.
       | https://aws.amazon.com/rds/aurora/babelfish/
       | 
       | One company, CompilerWorks actually had tools that transpiled
       | many SQL dialects. They were bought by Google
       | https://www.crunchbase.com/acquisition/google-acquires-compi...
        
       | chaps wrote:
       | Nice. Anyone know if similar exists going the opposite direction?
       | Specifically thinking of postgresql -> oracle sql.
        
       | r1b wrote:
       | Recommend checking out https://github.com/tobymao/sqlglot if you
       | are interested in this capability for other SQL dialects
       | 
       | Tools like this are helpful for:
       | 
       | - Rendering SQL in a consistent way, eg for snapshot testing
       | 
       | - Testing SQL business logic in CI against a dialect with less
       | heavyweight dependencies
       | 
       | - Applying AST transformations to take advantage of dialect-
       | specific optimizations
        
       ___________________________________________________________________
       (page generated 2024-03-18 23:00 UTC)