[HN Gopher] Dataflow, a self-hosted Observable notebook editor
___________________________________________________________________
Author : tosh
Score : 121 points
Date : 2021-05-13 18:28 UTC (4 hours ago)
(HTM) web link (observablehq.com)
| lejohnq wrote:
| This is pretty awesome. Feels like a streamlit for the javascript
| world.
| FormFollowsFunc wrote:
| I've been looking for something like this for data vis
| exploration. Compared to Observable, accessing local data files
| is more convenient. Currently I use a Jupyter notebook along with
| Pandas and Matplotlib. I'm not a huge fan of Matplotlib, so I
| would prefer to use Plot or the Vega-Lite API, and Pandas could be
| replaced with Danfo.js or Arquero.
| whoevercares wrote:
| How does this relate to data flow, or is it just a brand name?
| RocketSyntax wrote:
| Help me understand what the page being rendered is doing. Is that
| like an interactive app you are serving for user input?
| qbasic_forever wrote:
| It's an Observable notebook: https://observablehq.com/
| Basically a notebook where you write JS code and see the
| results immediately rendered in the notebook. In this case it's
| being served locally instead of requiring you to use their
| service website. If you've ever used Jupyter or IPython this is
| very similar (code notebooks) but with some interesting changes
| in philosophy and a JavaScript implementation instead of
| Python.
|
| What might be tripping you up is that in this demo the
| observable notebook isn't showing the code cells, only the
| outputs. The code is in the editor on the left and the output
| on the right is the result of running the code as an observable
| notebook. In some ways it is like a simple interactive web app.
| Isthatablackgsd wrote:
| Is that a similar concept to Overleaf for LaTeX?
| chrisweekly wrote:
| OK! I can't put off creating an observablehq acct any longer.
|
| ... Done. Stoked to dive in this weekend!
| simonw wrote:
| This project looks fantastic.
|
| I adore Observable notebooks, but the one thing that makes me
| hesitate in using them for everything is that the editor
| component itself is closed-source and only available on
| https://observablehq.com/
|
| They're great open source ecosystem supporters - they released
| their runtime, their parser, their standard library and all sorts
| of other stuff through https://github.com/observablehq - but the
| editor itself is their proprietary sauce.
|
| I totally support their decision on this - it's what they're
| building their business around, and I want them to be successful.
| But as a user it does give me pause.
|
| This project from Alex Garcia looks like a fix for exactly that.
| Having more than one editor for their notebook format (and an
| open source option at that) resolves my hesitancy in leaning hard
| into their ecosystem.
|
| I don't even see it as a competitor to ObservableHQ - the hosted
| Observable editor has collaboration features that don't even make
| sense for a local running version.
|
| Plus, Dataflow has some great ideas of its own - in particular
| the live file attachments thing.
| edtechdev wrote:
| Yeah the lack of open source prevented me from committing to
| observable, too, so I look forward to trying dataflow out.
|
| Just in case this is of interest to others, some other open
| source browser-based computational notebook tools include:
|
| * Starboard https://starboard.gg/
| * And of course there's always Jupyter, but it requires a
|   server component
|
| And this isn't the same thing, more of a javascript playground
| (open source alternative to codepen and the like), but see also
| Slingcode: https://slingcode.net/
| nautilus12 wrote:
| I see all these notebook products and I honestly don't know how
| any of them plan to compete with AWS... nobody wants self-hosted
| anymore, everyone just wants to pay AWS or Databricks for it.
|
| Can other people chime in? Maybe I'm just working at the wrong
| place.
| qbasic_forever wrote:
| It's running on localhost here and I presume that's their
| intended use case for this feature. Localhost is critical for
| development--imagine if VS Code wouldn't work unless you were
| connected to GitHub.com. This is fixing that issue with
| Observable notebooks so now you can run and develop your
| notebook locally without depending directly on the internet or
| their cloud service.
| simonw wrote:
| https://observablehq.com/ is a cloud hosted platform already.
|
| This thing - Dataflow - is an open source run-on-your-own-machine
| alternative to the official Observable hosted solution,
| taking advantage of the fact that Observable itself is
| JavaScript code with some special sauce that's available as
| open source runtime/parser libraries.
| mistidoi wrote:
| As a total Observable/Bostock stan who works with HIPAA protected
| data, I love this.
| d--b wrote:
| I am also working on an alternative: https://www.jigdev.com
|
| It's the same idea except that cells are spread out on a 2d
| canvas with tabs similar to excel.
| keeganj wrote:
| I'm not a data scientist, but I've been interested in the idea of
| a "code notebook" ever since Jupyter hit it big. I write mostly
| in JS/TS for application logic, so this looks like it could be
| really useful.
|
| Related, does anyone have any recommendations for a (Postgres)
| SQL "notebook"? I don't really need any visualizations, more just
| a markdown-integrated doc that allows me to lay out the different
| queries I use to answer a question.
| amcaskill wrote:
| I am working on a SQL-in-markdown reporting tool called
| evidence.
|
| It feels like a markdown doc that runs SQL.
|
| https://evidence.dev/
| keeganj wrote:
| This is almost exactly what I was imagining. Just subscribed
| to updates, very interested to see what this becomes!
| qbasic_forever wrote:
| I like the ipython-sql magic in Jupyter:
| https://github.com/catherinedevlin/ipython-sql
| Depending on what you're doing you might be able to get away
| entirely with just using it and some basic queries, i.e. no
| Python glue code in the notebook at all. But worst case you might
| need a cell to open up the DB connection and make the magic aware
| of it, then you can execute clean and simple SQL queries in cells
| using the magic.
| robertlacok wrote:
| Deepnote has native Postgres cells :) you can mix them with
| Python too.
|
| Disclaimer - I work there :)
| gradys wrote:
| Maybe just a Python notebook with a Postgres client library and
| some helper functions to keep the amount of Python in the main
| body to a minimum?
| Hasnep wrote:
| Rmarkdown notebooks can contain SQL chunks, so you'd only need
| to use R to configure the connection. [1]
|
| [1] https://bookdown.org/yihui/rmarkdown/language-engines.html#s...
| keeganj wrote:
| I didn't know you could write SQL directly in Rmarkdown like
| this, very interesting. Thanks!
| pbowyer wrote:
| Same, when I've read the docs I've always got the
| impression that only R was supported.
| sixdimensional wrote:
| Apache Zeppelin is one open source option -
| https://zeppelin.apache.org.
| RocketSyntax wrote:
| Lots of jupyter magic `%` commands for that already
| https://www.datacamp.com/community/tutorials/sql-interface-w...
| natrys wrote:
| Emacs and Org-mode have great integration with multiple SQL
| implementations including Postgres (via org-babel). Org-mode
| tables are pretty neat, and you can have query results directly
| populated into tables. Read this blog post if you are
| interested:
|
| https://fluca1978.github.io/2021/01/18/PostgreSQLLiteratePro...
| tlarkworthy wrote:
| https://observablehq.com/@observablehq/databases
| simonw wrote:
| Weirdly my Django SQL Dashboard project may fit the bill a bit
| here: you can build up a "dashboard" (which is a tiny bit
| notebook-like if you squint at it the right way) with multiple
| SQL queries on it, and save that either as a bookmark or as a
| "saved dashboard" with a URL.
|
| https://django-sql-dashboard.datasette.io/
|
| In my own work I've been using it for the kind of things that I
| would normally use a Jupyter notebook for - gathering together
| research on problems I'm trying to solve.
| keeganj wrote:
| Interesting take, I'm not deep in the python ecosystem, but
| this looks like it's lightweight enough to function as a
| refreshable notebook. Will give this a try, thanks!
| javierluraschi wrote:
| For viz/DS/ML/AI with JS/TS, it's either observablehq or an IDE
| with custom extensions; this project looks relevant if you are
| already into observablehq.
|
| Shameless plug, we are building a few tools for JS to narrow
| down this gap as well:
| - https://hal9.ai (Drag&Drop / IDE)
| - https://marketplace.visualstudio.com/items?itemName=Hal9.hal...
|   (VSCode extension)
| - https://observablehq.com/@javierluraschi/running-nodejs-in-o...
|   (ObservableHQ extension)
|
| Would love to chat if you are interested in providing feedback,
| you can reach me at javier at hal9.ai. Cheers.
| Siira wrote:
| org-babel should fit the bill.
| shapiromatron wrote:
| re: sql notebook, this came up a few months ago and worked
| great when I played around with it:
| https://blog.jupyter.org/an-sql-solution-for-jupyter-ef4a00a....
| It's just a different kernel you can install to an existing
| jupyter instance.
| okennedy wrote:
| It's based on Spark rather than PostgreSQL directly, but I'm
| part of an effort to build a workflow system disguised as a
| notebook called Vizier [1]. SQL is a first-class primitive in
| Vizier, and the notebook plays nice with postgres (you can load
| from and unload to postgres using Spark's native data loader).
|
| [1] https://vizierdb.info
| thirtyseven wrote:
| I know "dataflow" is kind of a generic name, but the authors
| might want to consider that there is already a 7-year-old Google
| Cloud product for running data pipelines called Dataflow.
| taftster wrote:
| Came here to post the same comment. Exactly right. There are
| lots of projects that use the term "dataflow".
|
| To add to this, the name of this product is confusing given the
| context and use case shown. I assume "dataflow" to the author
| means the ability to watch data being rendered on a page?
|
| To "big data" folks (like myself), the term "dataflow" tends to
| represent the routing and processing of data streams along an
| information pipeline. Not anything to do with a visual
| representation of a dynamic notebook.
| marcinzm wrote:
| And a Cloudera project:
| https://www.cloudera.com/products/cdf.html
|
| And an Azure feature:
| https://docs.microsoft.com/en-us/azure/data-factory/control-...
|
| And a Spring feature:
| https://spring.io/projects/spring-cloud-dataflow
| rectang wrote:
| And an entire programming discipline.
|
| https://en.wikipedia.org/wiki/Dataflow_programming
___________________________________________________________________
(page generated 2021-05-13 23:00 UTC)