[HN Gopher] FugueSQL: SQL-ish for pandas, dask, spark
___________________________________________________________________
FugueSQL: SQL-ish for pandas, dask, spark
Author : aiNohY6g
Score : 57 points
Date : 2021-10-11 16:43 UTC (6 hours ago)
(HTM) web link (fugue-tutorials.readthedocs.io)
(TXT) w3m dump (fugue-tutorials.readthedocs.io)
| crimsoneer wrote:
| So, I'm a data science person who uses SQL largely to integrate
| with Python or just for pretty straightforward extracts - my
| company is not very data-mature. I've previously used MySQL and
| PostgreSQL, and have been considering DuckDB (though it's not
| clear I can store stuff outside of temp memory). Is this good?
| Is something else good?
| bothra90 wrote:
| Is this solving the same kind of problems as Ray [1]?
|
| [1] https://www.ray.io/
| goodwanghan wrote:
| Hey, I am the author of Fugue.
|
| Fugue is a higher-level abstraction than Ray. It
| provides unified and non-invasive interfaces for people to use
| Spark, Dask and Pandas. Ray/Modin is also on our roadmap.
|
| It provides both a Python interface (not pandas-like) and Fugue
| SQL (standard SQL + extra features). Users can choose the one
| they are most comfortable with as the semantic layer for
| distributed computing; the two are equivalent.
|
| With Fugue, most of your logic will be in simple Python/SQL
| that is framework and scale agnostic. From the mindset to the
| code, Fugue minimizes your dependency on any specific computing
| framework, including Fugue itself.
|
| Please let me know if you want to learn more. Our Slack link is
| in the README of the Fugue repo.
|
| Fugue repo: https://github.com/fugue-project/fugue
| Tutorials: https://fugue-project.github.io/tutorials/
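
To make the "framework and scale agnostic" point in the comment above concrete, here is a minimal sketch of the general idea. It deliberately does not use Fugue's API: `run_local` and `run_threaded` are hypothetical stand-ins for execution engines, and `add_total` is a made-up piece of business logic. The only point is that the logic function imports nothing engine-specific, so the runner can be swapped without touching it.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Row = Dict[str, float]
Partition = List[Row]


def add_total(rows: Partition) -> Partition:
    """Business logic in plain Python: no engine-specific imports."""
    return [{**r, "total": r["price"] * r["qty"]} for r in rows]


def run_local(fn: Callable[[Partition], Partition],
              partitions: List[Partition]) -> Partition:
    """Hypothetical engine 1: process partitions one after another."""
    return [row for part in partitions for row in fn(part)]


def run_threaded(fn: Callable[[Partition], Partition],
                 partitions: List[Partition]) -> Partition:
    """Hypothetical engine 2: process partitions in a thread pool."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        # pool.map yields results in the same order as the inputs.
        return [row for part in pool.map(fn, partitions) for row in part]


# Made-up sample data, split into two partitions.
partitions = [
    [{"price": 2.0, "qty": 3}, {"price": 1.5, "qty": 4}],
    [{"price": 10.0, "qty": 1}],
]

# The same logic runs unchanged on either "engine".
local_result = run_local(add_total, partitions)
threaded_result = run_threaded(add_total, partitions)
```

In a real setup the two runners would be replaced by whatever engine the abstraction layer targets (pandas, Dask, Spark); the sketch only shows why keeping the logic free of engine imports makes that swap cheap.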
| dayeye2006 wrote:
| What kind of parser does FugueSQL use? Does it use Apache
| Calcite?
| goodwanghan wrote:
| No, we use ANTLR; we have no dependency on Java.
| elephantum wrote:
| no
| mwexler wrote:
| Well, sort of. Fugue overall is a scaling engine like Ray.
| The specific link to yet another SQL access layer to a
| dataset doesn't really have an analog in Ray, but has some
| nice features.
|
| I love these SQL layers but they can obfuscate how they
| implement their transforms. So, they can speed up filter and
| join creation and coding... til something breaks and then you
| have to go atomic anyway.
| elephantum wrote:
| Fugue is a translation layer from SQL to underlying
| runtime: pandas, dask, spark.
|
| Each of the runtimes supported by Fugue can be compared
| to Ray, but Fugue is a tool of a different kind.
| goodwanghan wrote:
| That is very true. Thank you.
|
| Fugue SQL is one way, and it also has a functional API.
| Both can be translated into the underlying runtime.
| You can choose based on your preference and actual needs.
___________________________________________________________________
(page generated 2021-10-11 23:01 UTC)