[HN Gopher] Show HN: WASM-powered codespaces for Python notebook...
       ___________________________________________________________________
        
       Show HN: WASM-powered codespaces for Python notebooks on GitHub
        
       Hi HN!  Last year, we shared marimo [1], an open-source reactive
       notebook for Python with support for execution through WebAssembly
       [2].  We wanted to share something new: you can now run marimo and
       Jupyter notebooks directly from GitHub in a Wasm-powered,
       codespace-like environment. What makes this powerful is that we
       mount the GitHub repository's contents as a filesystem in the
       notebook, making it really easy to share notebooks with data.  All
       you need to do is prepend 'marimo.app' to any Python notebook on
       GitHub. Some examples:
       - Jupyter Notebook:
         https://marimo.app/github.com/jakevdp/PythonDataScienceHandb...
       - marimo notebook:
         https://marimo.app/github.com/marimo-team/marimo/blob/07e8d1...
       Jupyter notebooks are automatically
       converted into marimo notebooks using basic static analysis and
       source code transformations. Our conversion logic assumes the
       notebook was meant to be run top-down, which is usually but not
       always true [3]. It can convert many notebooks, but there are still
       some edge cases.  We implemented the filesystem mount using our own
       FUSE-like adapter that links the GitHub repository's contents to
       the Python filesystem, leveraging Emscripten's filesystem API. The
       file tree is loaded on startup to avoid waterfall requests when
       reading many directories deep, but loading the file contents is
       lazy. For example, when you write Python that looks like

       ```python
       with open("./data/cars.csv") as f:
           print(f.read())

       # or

       import pandas as pd
       pd.read_csv("./data/cars.csv")
       ```

       behind the scenes, you make a request [4] to
       https://raw.githubusercontent.com/<org>/<repo>/main/data/car....
       Docs: https://docs.marimo.io/guides/publishing/playground/#open-no...

       [1] https://github.com/marimo-team/marimo
       [2] https://news.ycombinator.com/item?id=39552882
       [3] https://blog.jetbrains.com/datalore/2020/12/17/we-downloaded...
       [4] We technically proxy it through the playground
       https://marimo.app to fix CORS issues and GitHub rate-limiting.
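
The eager-tree, lazy-contents design described in the post can be sketched in plain Python. This is only an illustration, not marimo's adapter: `LazyRepoFS`, `fetch`, and the sample paths are hypothetical names, and the real implementation hooks into Emscripten's filesystem API rather than a Python class.

```python
from typing import Callable, Dict, Iterable, List


class LazyRepoFS:
    """Toy read-only filesystem: paths known up front, contents fetched on demand."""

    def __init__(self, paths: Iterable[str], fetch: Callable[[str], bytes]) -> None:
        # The full file tree is loaded once at startup, so walking
        # directories never triggers a chain of network requests.
        self._paths = set(paths)
        self._fetch = fetch  # hypothetical callback, e.g. one HTTP GET per file
        self._cache: Dict[str, bytes] = {}

    def listdir(self, directory: str) -> List[str]:
        """List immediate children of `directory` using only the preloaded tree."""
        prefix = directory.strip("/")
        prefix = prefix + "/" if prefix else ""
        children = {
            p[len(prefix):].split("/", 1)[0]
            for p in self._paths
            if p.startswith(prefix)
        }
        return sorted(children)

    def read(self, path: str) -> bytes:
        """Fetch file contents on first access, then serve them from the cache."""
        if path not in self._paths:
            raise FileNotFoundError(path)
        if path not in self._cache:
            self._cache[path] = self._fetch(path)
        return self._cache[path]


# Usage with a fake fetcher that records which files were actually requested.
requested: List[str] = []


def fake_fetch(path: str) -> bytes:
    requested.append(path)
    return f"contents of {path}".encode()


fs = LazyRepoFS(["README.md", "data/cars.csv", "data/raw/a.csv"], fake_fetch)
print(fs.listdir("data"))        # answered from the tree, no fetch
print(fs.read("data/cars.csv"))  # first read triggers exactly one fetch
```

Listing a directory touches only the in-memory tree, while the fetch callback runs once per file that is actually read.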
        
       Author : mscolnick
       Score  : 99 points
       Date   : 2025-01-14 17:46 UTC (5 hours ago)
        
 (HTM) web link (docs.marimo.io)
 (TXT) w3m dump (docs.marimo.io)
        
       | hzuo wrote:
       | Super cool to see a real use-case of WASM outside of just game
       | dev and nerding out.
        
         | PKop wrote:
         | Blazor is another example
        
         | pjmlp wrote:
         | We also have Flash, Java Applets, ActiveX and Silverlight back,
         | running on top of WebAssembly.
        
       | dhbradshaw wrote:
       | This is really cool -- going to show it off to my team. I love
       | the fact that you opened it up so that it will work with Jupyter
       | notebooks as well.
        
       | ge96 wrote:
       | wow a python interpreter is "only" 100MB not sure if that's
       | what's happening here
        
         | bagels wrote:
         | Is that too small or too large in your estimation?
        
           | ge96 wrote:
            | Wrt web pages supposedly being a couple of megabytes, it's a big
            | number, but at the same time it seems expected with these kinds
            | of applications (usually when I see WASM it's a 3D video
            | game)
        
         | mscolnick wrote:
          | It is much smaller than that: Pyodide is only 2.8 MB and the
          | Python stdlib is 2.3 MB when zipped
        
           | ge96 wrote:
           | Oh that's great
        
           | miohtama wrote:
            | There is $300k in bounties if you create an under-1MB
            | WebAssembly CPython distribution
            | 
            | https://www.reddit.com/r/Python/comments/1huxrs6/python_runn...
        
             | ge96 wrote:
              | now that's a Weissman score
        
       | HanClinto wrote:
       | I absolutely love that this can be hosted on Github Pages. Am I
       | correct in understanding that these notebooks will run
       | independently, and will not need to proxy through marimo.app (in
       | case the app goes down), or is that what the CORS thing is about
       | in note 4, and it will still need to go through this domain?
        
         | mscolnick wrote:
          | Yea, this can be hosted on GitHub Pages without any vendor
         | infra (no marimo.app)
         | 
         | These are two separate features:
         | 
          | 1) marimo.app + github.com/path/to/nb.ipynb does run on
          | marimo.app infra. This is what the Show HN was about.
          | 
          | 2) Separately, you can use the marimo CLI to export assets to
          | deploy to GitHub Pages: `marimo export html-wasm notebook.py -o
          | output_dir --mode run`, which can then be uploaded to GH
          | Pages. This does not find all the data in your repo, so you
          | would need to stick any data you want to access in a /public
          | folder for your site. More docs here:
         | https://docs.marimo.io/guides/exporting/?h=marimo+export+htm...
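
Feature 1 amounts to a URL rewrite. Below is a minimal sketch, assuming the only rule is the prefixing described in the post. `to_playground_url` is a hypothetical helper, not part of marimo, and the org, repo, and file names are placeholders.

```python
from urllib.parse import urlsplit


def to_playground_url(github_url: str) -> str:
    """Prefix a github.com notebook URL with the marimo.app host."""
    parts = urlsplit(github_url)
    if parts.netloc != "github.com":
        raise ValueError(f"expected a github.com URL, got {github_url!r}")
    # Result has the shape https://marimo.app/github.com/<org>/<repo>/blob/...
    return f"https://marimo.app/{parts.netloc}{parts.path}"


# Placeholder org/repo/file names, for illustration only.
print(to_playground_url("https://github.com/org/repo/blob/main/nb.ipynb"))
```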
        
       | westurner wrote:
       | > [ FUSE to GitHub FS ]
       | 
       | > _Notebooks created from GitHub links have the entire contents
        | of the repository mounted into the notebook's filesystem. This
       | lets you work with files using regular Python file I/O!_
       | 
       | Could BusyBox sh compiled to WASM (maybe on emscripten-forge)
       | work with files on this same filesystem?
       | 
       | "Opening a GitHub remote with vscode.dev requires GitHub login?
       | #237371" ... but it works with Marimo and JupyterLite:
       | https://github.com/microsoft/vscode/issues/237371
       | 
       | Does Marimo support local file system access?
       | 
       | jupyterlab-filesystem-access only works with Chrome?:
       | https://github.com/jupyterlab-contrib/jupyterlab-filesystem-...
       | 
       | vscode-marimo: https://github.com/marimo-team/vscode-marimo
       | 
       | "Normalize and make Content frontends and backends extensible
       | #315" https://github.com/jupyterlite/jupyterlite/issues/315
       | 
       | "ENH: Pluggable Cloud Storage provider API; git, jupyter/rtc"
       | https://github.com/jupyterlite/jupyterlite/issues/464
       | 
       | Jupyterlite has read only access to GitHub repos without login,
       | but vscode.dev does not.
       | 
       | Anyways, nbreproduce wraps repo2docker and there's also a
       | repo2jupyterlite.
       | 
       | nbreproduce builds a container to run an .ipynb with:
       | https://github.com/econ-ark/nbreproduce
       | 
       | container2wasm wraps vscode-container-wasm:
       | https://github.com/ktock/vscode-container-wasm
       | 
       | container2wasm: https://github.com/ktock/container2wasm
        
       | westurner wrote:
       | > CORS and GitHub
       | 
       | The Godot docs mention coi-serviceworker;
       | https://github.com/orgs/community/discussions/13309 :
       | 
        | gzuidhof/coi-serviceworker:
        | https://github.com/gzuidhof/coi-serviceworker :
        | 
        | > _Cross-origin isolation (COOP and COEP) through a service
        | worker for situations in which you can't control the headers
        | (e.g. GH pages)_
       | 
        | CF Pages' free unlimited bandwidth and gitops-style deploy might
        | solve for apps that require more than the 100GB soft cap on
        | free bandwidth GH has for open source projects.
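
For context, cross-origin isolation comes down to two response headers. When you do control the server (for example during local testing), they can be set directly. The sketch below uses Python's standard `http.server`; the names `IsolatedHandler` and `serve` and the port 8000 are arbitrary choices for this example, and it is a local-development sketch rather than anything taken from marimo or coi-serviceworker.

```python
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

# Header values that opt a page into cross-origin isolation.
ISOLATION_HEADERS = {
    "Cross-Origin-Opener-Policy": "same-origin",
    "Cross-Origin-Embedder-Policy": "require-corp",
}


class IsolatedHandler(SimpleHTTPRequestHandler):
    """Static file handler that adds COOP/COEP to every response."""

    def end_headers(self) -> None:
        for name, value in ISOLATION_HEADERS.items():
            self.send_header(name, value)
        super().end_headers()


def serve(port: int = 8000) -> None:
    """Serve the current directory on localhost with the isolation headers."""
    with ThreadingHTTPServer(("127.0.0.1", port), IsolatedHandler) as httpd:
        httpd.serve_forever()
```

Calling `serve()` from the directory containing the exported site starts the server. On hosts like GH Pages, where response headers cannot be configured, the service-worker approach quoted above is the workaround.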
        
         | mscolnick wrote:
         | Thanks for sharing these resources
        
       | data-ottawa wrote:
        | I love seeing projects like this. When Pyodide came out I was
        | excited, but it was a bit difficult to use; this looks and feels
        | fantastic.
       | 
       | I really like Observable as well, but I've found it difficult to
        | find robust and broad numerical libraries in JavaScript like what
       | Python has.
       | 
       | I would love for this type of tool to redefine how we do science.
       | It would be amazing if many scientific papers included both their
       | data and the code in an interactive environment with zero
       | installs and configuration. Plus when discussing a paper you
       | could "fork" it and explore different analysis options live which
       | for many fields would be totally feasible to do in the browser.
        
         | dmadisetti wrote:
         | I feel like pytomls and shared source are becoming standard,
         | but yes-
         | 
         | notebooks vs research code are sometimes very separate, very
          | difficult to directly reproduce. A big difficulty with
         | "working out of the box, shared in browser" is that weights,
         | training, inference, simulations- are all still very compute
         | intensive.
         | 
         | BUT the nice thing about a stateless notebook, is that you can
         | precompute values- and cache them. I've been really excited
         | about expanding marimo's caching system, and would love to get
          | to a point where sharing a notebook means being able to run
         | the research yourself without some big setup dance.
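
The precompute-and-cache idea can be illustrated with a small disk-backed memoizer. This is a generic sketch, not marimo's caching system: `disk_cache`, the temporary cache directory, and `expensive_simulation` are hypothetical names, and keying on a hash of the function name and its arguments is an assumption made for the example.

```python
import functools
import hashlib
import pickle
import tempfile
from pathlib import Path
from typing import Any, Callable, List


def disk_cache(cache_dir: Path) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Memoize a function's results as pickle files under `cache_dir`."""
    cache_dir.mkdir(parents=True, exist_ok=True)

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            # Key on the function name plus its (picklable) arguments.
            payload = pickle.dumps((func.__qualname__, args, sorted(kwargs.items())))
            path = cache_dir / (hashlib.sha256(payload).hexdigest() + ".pkl")
            if path.exists():
                return pickle.loads(path.read_bytes())  # cache hit: skip the compute
            result = func(*args, **kwargs)
            path.write_bytes(pickle.dumps(result))
            return result

        return wrapper

    return decorator


calls: List[int] = []  # records each time the expensive body actually runs


@disk_cache(Path(tempfile.mkdtemp()) / "cache")
def expensive_simulation(n: int) -> int:
    calls.append(n)
    return sum(i * i for i in range(n))


first = expensive_simulation(1000)
second = expensive_simulation(1000)  # served from disk, body not re-run
print(first == second, len(calls))
```

A shared notebook could ship such cached files alongside the code so readers skip the heavy computation. Pickle files should only be loaded from sources you trust.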
        
       | lordswork wrote:
       | The future is awesome. Thanks for building this!
        
       ___________________________________________________________________
       (page generated 2025-01-14 23:00 UTC)