[HN Gopher] FugueSQL: SQL-ish for pandas, dask, spark
       ___________________________________________________________________
        
       FugueSQL: SQL-ish for pandas, dask, spark
        
       Author : aiNohY6g
       Score  : 57 points
       Date   : 2021-10-11 16:43 UTC (6 hours ago)
        
 (HTM) web link (fugue-tutorials.readthedocs.io)
 (TXT) w3m dump (fugue-tutorials.readthedocs.io)
        
       | crimsoneer wrote:
       | So, I'm a data science person who uses SQL largely to integrate
       | with Python or just for pretty straightforward extracts - my
       | company is not very data-mature. I've previously used MySQL and
       | PostgreSQL, and have been considering DuckDB (though it's not
       | clear I can store stuff outside of temp memory). Is this good? Is
       | something else good?
        
       | bothra90 wrote:
       | Is this solving similar problems as Ray [1]?
       | 
       | [1] https://www.ray.io/
        
         | goodwanghan wrote:
         | Hey, I am the author of Fugue.
         | 
         | Fugue is a higher level abstraction compared to Ray. It
         | provides unified and non-invasive interfaces for people to use
         | Spark, Dask and Pandas. Ray/Modin is also on our roadmap.
         | 
         | It provides both a Python interface (not pandas-like) and Fugue
         | SQL (standard SQL + extra features). Users can choose the one
         | they are most comfortable with as the semantic layer for
         | distributed computing; the two are equivalent.
         | 
         | With Fugue, most of your logic will be in simple Python/SQL
         | that is framework and scale agnostic. From the mindset to the
         | code, Fugue minimizes your dependency on any specific computing
         | framework, including Fugue itself.
         | 
         | Please let me know if you want to learn more. Our Slack is in
         | the README of the Fugue repo.
         | 
         | Fugue repo: https://github.com/fugue-project/fugue
         | Tutorials: https://fugue-project.github.io/tutorials/
        
           | dayeye2006 wrote:
           | What kind of parser does FugueSQL use? Does it use Apache
           | Calcite?
        
             | goodwanghan wrote:
             | No, we use ANTLR; we have no dependency on Java.
        
         | elephantum wrote:
         | no
        
           | mwexler wrote:
           | Well, sort of. Fugue overall is a scaling engine like Ray.
           | The specific link, to yet another SQL access layer for a
           | dataset, doesn't really have an analog in Ray, but it has
           | some nice features.
           | 
           | I love these SQL layers but they can obfuscate how they
           | implement their transforms. So, they can speed up filter and
           | join creation and coding... til something breaks and then you
           | have to go atomic anyway.
        
             | elephantum wrote:
             | Fugue is a translation layer from SQL to the underlying
             | runtime: pandas, Dask, or Spark.
             | 
             | Each of the runtimes supported by Fugue can be compared
             | to Ray, but Fugue is a tool of a different kind.
        
               | goodwanghan wrote:
               | That is very true. Thank you.
               | 
               | Fugue SQL is one way, and it also has a functional API.
               | Both can be translated into the underlying runtime.
               | You can choose based on your preference and real needs.
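
The idea running through the thread (logic written once in plain Python, with the execution runtime chosen separately) can be sketched roughly as follows. This is a toy illustration using only the Python standard library; it is not Fugue's API or implementation. The names `add_total`, `run_local`, `run_partitioned`, `ENGINES`, and `run` are all hypothetical, and a thread pool stands in for a distributed runtime such as Dask or Spark.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Row = Dict[str, float]
Logic = Callable[[List[Row]], List[Row]]


def add_total(rows: List[Row]) -> List[Row]:
    """Business logic: plain Python, unaware of how it will be executed."""
    return [{**r, "total": r["price"] * r["qty"]} for r in rows]


def run_local(logic: Logic, rows: List[Row]) -> List[Row]:
    """Hypothetical 'local' engine: apply the logic to all rows at once."""
    return logic(rows)


def run_partitioned(logic: Logic, rows: List[Row], partitions: int = 2) -> List[Row]:
    """Hypothetical 'distributed' engine: split the rows into chunks,
    run the logic on each chunk in a thread pool, then concatenate."""
    size = max(1, -(-len(rows) // partitions))  # ceiling division
    chunks = [rows[i:i + size] for i in range(0, len(rows), size)]
    with ThreadPoolExecutor(max_workers=partitions) as pool:
        results = list(pool.map(logic, chunks))  # map keeps chunk order
    return [row for chunk in results for row in chunk]


# Assumed registry of engines; the caller picks one by name.
ENGINES = {"local": run_local, "partitioned": run_partitioned}


def run(rows: List[Row], logic: Logic, engine: str = "local") -> List[Row]:
    """Dispatch the same logic to whichever engine the caller selects."""
    return ENGINES[engine](logic, rows)


data = [
    {"price": 2.0, "qty": 3},
    {"price": 1.5, "qty": 4},
    {"price": 10.0, "qty": 1},
]

local_result = run(data, add_total, engine="local")
partitioned_result = run(data, add_total, engine="partitioned")
```

Because `add_total` knows nothing about the engine, switching runtimes is a one-argument change at the call site. Here that only works because the logic is row-wise, so it gives the same answer however the rows are partitioned.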
        
       ___________________________________________________________________
       (page generated 2021-10-11 23:01 UTC)