hngopher.com

       [HN Gopher] Show HN: Marimo - an open-source reactive notebook f...
       ___________________________________________________________________
        
       Show HN: Marimo - an open-source reactive notebook for Python
        
       Hi HN! We're excited to share marimo, an open-source reactive
       notebook for Python [1]. marimo aims to solve well-known problems
       with traditional notebooks [2]: marimo notebooks are reproducible
       (no hidden state), git-friendly (stored as Python files),
       executable as Python scripts, and deployable as web apps.

       GitHub repo: https://github.com/marimo-team/marimo

       In marimo, a notebook's code, outputs, and program state are
       always consistent. Run a cell and marimo reacts by automatically
       running the cells that reference its declared variables. Delete a
       cell and marimo scrubs its variables from program memory,
       eliminating hidden state. Our reactive runtime is based on static
       analysis, so it's performant. If you're worried about accidentally
       triggering expensive computations, you can disable specific cells
       from auto-running.

       marimo comes with UI elements like sliders, a dataframe
       transformer, and interactive plots that are automatically
       synchronized with Python [3]. Interact with an element and the
       cells that use it are automatically re-run with its latest value.
       Reactivity makes these UI elements more useful and ergonomic than
       Jupyter's ipywidgets.

       Every marimo notebook can be run as a script from the command
       line, with cells executed in a topologically sorted order, or
       served as an interactive web app, using the marimo CLI.

       We're a team of just two developers. We chose to develop marimo
       because we believe that the Python community deserves a better
       programming environment to do research and communicate it;
       experiment with code and share it; and learn computational science
       and teach it. We've seen lots of research start in Jupyter
       notebooks (much of my own has), only to fail to reproduce; lots of
       promising prototypes built that were never made real; and lots of
       tutorials written that failed to engage students.

       marimo has been developed with the close input of scientists and
       engineers, and with inspiration from many tools, including
       Pluto.jl and streamlit. We open-sourced it recently because we
       feel it's ready for broader use. Please try it out (pip install
       marimo && marimo tutorial intro). We'd appreciate your feedback!

       [1] https://github.com/marimo-team/marimo
       [2] https://docs.marimo.io/faq.html#faq-problems
       [3] https://docs.marimo.io/api/inputs/index.html
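
The reactive behavior described in the post (run a cell, and the cells that reference its variables re-run) can be sketched with Python's standard `ast` and `graphlib` modules. This is a minimal illustration under simplifying assumptions, not marimo's actual implementation: every name is treated as a global, and the helper names (`defs_and_refs`, `dependency_graph`, `cells_to_rerun`) and the example cells are hypothetical.

```python
import ast
from graphlib import TopologicalSorter


def defs_and_refs(source: str) -> tuple[set[str], set[str]]:
    """Return (names assigned, names read but not assigned) for one cell.

    Simplification: every ast.Name is treated as a global variable.
    """
    defs, refs = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defs.add(node.id)
            else:
                refs.add(node.id)
    return defs, refs - defs


def dependency_graph(cells: dict[str, str]) -> dict[str, set[str]]:
    """Map each cell to the cells that define the variables it reads."""
    analysis = {name: defs_and_refs(src) for name, src in cells.items()}
    owner = {var: name for name, (defs, _) in analysis.items() for var in defs}
    return {
        name: {owner[var] for var in refs if var in owner}
        for name, (_, refs) in analysis.items()
    }


def cells_to_rerun(cells: dict[str, str], changed: str) -> list[str]:
    """Return the changed cell and its descendants in topological order."""
    graph = dependency_graph(cells)
    order = list(TopologicalSorter(graph).static_order())
    stale = {changed}
    for name in order:
        # A cell is stale if any cell it depends on is stale.
        if graph[name] & stale:
            stale.add(name)
    return [name for name in order if name in stale]


# Hypothetical notebook: cell name -> cell source.
cells = {
    "a": "x = 1",
    "b": "y = x + 1",
    "c": "z = y * 2",
    "d": "w = 10",
}
print(cells_to_rerun(cells, "a"))  # a, then b, then c; d is untouched
```

Because the graph is derived from the source text alone, no cell has to be executed to decide what needs re-running.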
        
       Author : akshayka
       Score  : 166 points
       Date   : 2024-01-12 18:33 UTC (4 hours ago)
        
       | warthog wrote:
       | Did not work a lot with Jupyter nbs but I think it would be good
       | for you to put more emphasis on Jupyter vs Marimo on your
       | website
        
         | alsodumb wrote:
         | Copying from a reddit answer by OP:
         | https://www.reddit.com/r/MachineLearning/comments/191rdwq/co...
         | 
         | marimo solves problems in reproducibility, maintainability,
         | interactivity, reusability, and shareability:
         | 
         | *Reproducibility* In Jupyter notebooks, the code you see
         | doesn't necessarily match the outputs on the page or the
         | program state. Some cases in which this can happen: (1) if you
         | delete a cell, its variables stay in memory, which other cells
         | may still reference (2) users can execute cells in arbitrary
         | order. This leads to widespread reproducibility issues. One
         | study analyzed 1 million Jupyter notebooks and found that 36%
         | of them didn't reproduce
         | (https://blog.jetbrains.com/datalore/2020/12/17/we-
         | downloaded...).
         | 
         | In contrast, marimo guarantees that your code, outputs, and
         | program state are all synchronized, making your notebooks more
         | reproducible by eliminating hidden state. marimo achieves this
         | by intelligently analyzing your code and understanding the
         | relationships between cells, and automatically re-running cells
         | as needed (sort of like a spreadsheet but better).
         | 
         | *Maintainability* marimo notebooks are stored as pure Python
         | programs (.py files). This lets you version them with git; in
         | contrast, Jupyter notebooks are stored as JSON and require
         | extra steps to sensibly version.
         | 
         | *Interactivity* marimo notebooks come with UI elements that are
         | automatically synchronized with Python (like sliders,
         | dropdowns) ... scrub a slider and all cells that reference it
         | are automatically re-run with the new value. This is very
         | difficult to get working in Jupyter notebooks.
         | 
         | *Reusability* marimo notebooks can be executed as Python
         | scripts from the command-line (since they're stored as .py
         | files). In contrast, this requires extra steps/effort to do for
         | Jupyter, such as copying and pasting the code out or using
         | external frameworks. In the future, we'll also let you import
         | symbols (functions, classes) defined in a marimo notebook into
         | other Python programs/notebooks, something you can't really do
         | with Jupyter.
         | 
         | *Shareability* Every marimo notebook can double as an
         | interactive web app, complete with UI elements, which you can
         | serve using our CLI. This isn't possible in Jupyter without
         | substantial extra effort.
         | 
         | You might also want to check out Joel Grus' talk on notebooks.
         | We solve many of the problems he highlights:
         | https://www.youtube.com/watch?v=7jiPeIFXb6U&t=1s
        
         | pvg wrote:
         | It's right in the linked FAQ
         | 
         | https://docs.marimo.io/faq.html#faq-jupyter
        
           | noahlt wrote:
           | It's there, but warthog is right, it should be a toplevel
           | section like "A reactive programming environment" -- yes
           | ideally people would read the description and understand the
           | differences themselves, or consult the FAQ, but the fact is
           | that most people will understand Marimo in relation to
           | Jupyter and so you might as well optimize that path.
        
       | bluish29 wrote:
       | That's one interesting project. As someone who relies heavily on
       | collaboration with people using Jupyter Notebook, the most
       | annoying points about reproducing their work are the environment
       | and the hidden state of Jupyter Notebooks.
       | 
       | This does directly address the second problem. It does so,
       | however, by sacrificing flexibility. I might need to change a cell
       | just to test a new thing (without affecting the other cells), but
       | that's a trade-off if you focus on reproducibility.
       | 
       | I know that requirements.txt is the standard solution to the
       | other problem. But generating and using it is annoying. The
       | command pip freeze will list all the packages in a bloated way
       | (there are better ways), but I always hoped to find a notebook
       | system that will integrate this information natively and have a
       | way to embed that into a notebook in a form that I can share with
       | other people. Unfortunately I can't see support for something
       | like this in any of the available solutions (at least to my
       | knowledge).
        
         | akshayka wrote:
         | Yes, the second half of reproducibility is for sure packages. A
         | solution for reproducible environments is on our roadmap
         | (https://marimo-team.notion.site/The-marimo-
         | roadmap-e5460b9f2...), but we haven't quite figured it out yet.
         | 
         | It's a bit challenging because Python has so many different
         | solutions for package management. If you have any ideas we'd
         | love to hear them.
        
           | bluish29 wrote:
           | The link redirect does not specify which point in the list
           | you are referring to but I guess it is "Install missing
           | packages from...". If so, then I really wonder if you mean
           | supporting something like '!pip install numpy' like Jupyter
           | or something else?
           | 
           | I don't think this is really a solution, not to mention that
           | this raises the question: does it support running shell
           | commands using '!' like Jupyter Notebook?
        
             | akshayka wrote:
             | Oh, sorry for not being more clear. That's not the one.
             | It's "Package management: make notebooks reproducible down
             | to the packages they use": https://marimo-
             | team.notion.site/840c475fd7ca4e3a8c6f20c86fce...
             | 
             | Does that align with what you're talking about?
             | 
             | That page has some scrawled brainstormed notes. But we
             | haven't spent time designing a solution yet.
        
               | bluish29 wrote:
               | Thanks. That is precisely what I was talking about in my
               | comment. It would solve the problem if we had something
               | like that integrated natively. I understand that between
               | pip, conda, mamba and all the others it would be a hard
               | problem to solve. But at least auto-generating
               | requirements.txt would be easier. But to be honest the
               | hard part is identifying packages and where they are
               | from, not what to do with the information. Good luck with
               | the development.
        
           | aidos wrote:
           | People always complain about pip and python packaging but
           | it's never been an issue for me. I create a
           | requirements.base.txt that has the versions of things I want
           | installed. I then:
           | 
           |     pip freeze -r requirements.base.txt > requirements.txt
           | 
           | Install is then simply:
           | 
           |     pip install -r requirements.txt
           | 
           | Updating / installing something new is a matter of adding to
           | the base file and then refreezing.
        
             | bluish29 wrote:
             | There are several problems with this approach, notably you
             | don't get information about platform-specific stuff. You
             | don't get information on how these packages are installed
             | (conda, mamba, etc.).
             | 
             | And it does not account for dependency version conflicts,
             | which make life very hard.
        
               | aidos wrote:
               | I don't understand the platform thing, is that something
               | to do with running on Windows? Why wouldn't you just pip
               | install? Why bring conda etc into the mix?
               | 
               | If you have conflicts then you have to reconcile those at
               | point of initial install - pip deals with that for you.
               | I've never had a situation in 15 years of Python packages
               | where there wasn't a working combination of versions.
               | 
               | These are genuine questions btw. I see these common
               | complaints and wonder how I've not ever had issues with
               | it.
        
               | bluish29 wrote:
               | I will try to summarize the complaints (mine at least) in
               | simple points:
               | 
               | 1- pip freeze will miss packages not installed by pip
               | (e.g. via Conda).
               | 
               | 2- It includes all packages, even those not used in the
               | project.
               | 
               | 3- It just dumps all packages, their dependencies and
               | sub-dependencies. Even without conflicts, if you happen
               | to change a package, then it is very hard to keep track
               | of dependencies and sub-dependencies that need to be
               | removed. At some point, your file will be a hot mess.
               | 
               | 4- If you install a platform-specific package version,
               | this information will not be tracked.
        
               | aidos wrote:
               | Ok. I think that's all handled by my workflow, but it
               | does involve taking responsibility for requirements
               | files.
               | 
               | If I want to install something, I pip install and then
               | add the explicit version to the base. I can then freeze
               | the current state to requirements to lock in all the sub
               | dependencies.
               | 
               | It's a bit manual (though you only need a couple of cli
               | commands) but it's simple and robust.
        
               | bluish29 wrote:
               | I don't think that manual handling of requirements.txt in
               | a collaborative environment is a robust process. It will
               | be a waste of time and resources to handle it like that.
               | And I don't know about your workflow but it is obviously
               | not standard and it does not address the first and fourth
               | points.
        
               | aidos wrote:
               | Haha. Ok. I think that's where we're just going to have
               | to agree to disagree.
        
       | SushiHippie wrote:
       | Looks cool!
       | 
       | Have you looked into WASM? Something like a jupyterlite [0]
       | alternative for marimo?
       | 
       | And are there plans to integrate linting and formatting with
       | ruff? [1]
       | 
       | [0] https://jupyterlite.readthedocs.io/en/stable/
       | 
       | [1] https://github.com/astral-sh/ruff (ruff format is almost 100%
       | compatible with black formatting)
        
         | akshayka wrote:
         | We started looking into WASM this week, and did some light
         | exploratory coding toward it. It's on our roadmap:
         | https://marimo-team.notion.site/The-marimo-roadmap-e5460b9f2...
         | 
         | A ruff integration is a great idea. I'll add it to the roadmap.
        
           | SushiHippie wrote:
           | Perfect, thank you!
        
           | SushiHippie wrote:
           | <2 cents>
           | 
           | I see some package management stuff on the roadmap.
           | 
           | Maybe you could take a look at the cargo cli, like pixi did
           | [0]. IMO it's a nice user experience.
           | 
           | [0] https://prefix.dev/
           | 
           | </2 cents>
        
             | akshayka wrote:
             | Thanks for the suggestion. We'll definitely take a look.
        
           | prabir wrote:
           | Looking forward to the WASM integration. Being able to use a
           | plain filesystem such as Nextcloud and run it there would be
           | great. I have been trying to get jupyterlite WASM into my
           | Nextcloud alternative that I have been working on, so I would
           | love to try this.
        
       | hedgehog wrote:
       | This looks quite nice and it might compose well with a cache
       | library like the one posted on HN recently (XetCache,
       | https://news.ycombinator.com/item?id=38696631).
        
         | noahlt wrote:
         | Yeah, having worked on alternative notebooks before, one of the
         | big implicit features of Jupyter notebooks is that long-running
         | cells (downloading data, training models) don't get spuriously
         | re-run.
         | 
         | Having an excellent cache might reduce spurious re-running of
         | cells, but I wonder if it would be sufficient.
        
           | akshayka wrote:
           | We've thought briefly about cell-level caching; or at least
           | it's a topic that's come up a couple times now with our
           | users. Perhaps we could add it as a configuration option, at
           | the granularity of individual cells. Our users have found
           | that `functools.cache` goes a long way.
           | 
           | We also let users disable cells (and their descendants),
           | which can be useful if you're iterating on a cell that's
           | close to the root of your notebook DAG:
           | https://docs.marimo.io/guides/reactivity.html#disabling-
           | cell...
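
As a small illustration of the `functools.cache` suggestion above: wrapping an expensive function means a re-run with the same arguments returns the stored result instead of recomputing it. The function `load_dataset` and its body are hypothetical stand-ins for a slow download or training step.

```python
import functools

calls = []  # records every time the function body actually runs


@functools.cache
def load_dataset(name: str) -> tuple[int, ...]:
    """Stand-in for an expensive step (hypothetical)."""
    calls.append(name)
    return tuple(range(5))


first = load_dataset("train")
second = load_dataset("train")  # served from the cache; the body does not run again
print(len(calls))
```

Two caveats follow from how `functools.cache` works: arguments must be hashable, and the cache lives on the function object in memory, so redefining the function (for example by re-running the code that defines it) starts with an empty cache.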
        
             | smacke wrote:
             | ipyflow has a %%memoize magic which looks quite similar to
             | %%xetmemo (just without specifying the inputs / outputs
             | explicitly):
             | https://github.com/ipyflow/ipyflow/?tab=readme-ov-
             | file#memoi...
             | 
             | Would be cool if we could come up with a standard that
             | works across notebooks / libraries!
        
       | dimatura wrote:
       | I already use jupytext to store notebooks as code but the
       | improved state management and notebook-as-app features are pretty
       | compelling and I'm trying it out.
       | 
       | Unfortunately, I'm quite used to very specific vim keybindings in
       | Jupyter (https://github.com/lambdalisue/jupyter-vim-binding) that
       | make it pretty hard to use anything else :/
        
       | peter_l_downs wrote:
       | Marimo are wonderful little pets. I used to have some and really
       | liked them. I should get some more. They never failed to start a
       | conversation when guests came over.
       | 
       | https://soltech.com/blogs/blog/how-to-care-for-your-marimo-m...
        
       | mondrian wrote:
       | Looks cool. This is kind of like streamlit, which (I think) tried
       | to escape the limitations of notebooks by giving you an API to
       | quickly make a shareable app with sliders/charts etc. (Yet it
       | retains some notebook concepts like 'cells').
       | 
       | Marimo kind of takes the reactive widgets of streamlit and brings
       | them back into a notebook-like UI, and provides a way to export
       | the notebooks into shareable apps.
        
         | akshayka wrote:
         | Thanks! One way we differ from streamlit is that
         | ML/data/experimentation work can start in marimo -- i.e., you
         | can use marimo for traditional notebooking work, without ever
         | making an app. But you can also use marimo to make shareable
         | apps as you've articulated.
        
       | bsdz wrote:
       | This is a great idea. I'd been planning to create something
       | similar where cells are topologically ordered based on their
       | dependency structure; although I was thinking perhaps to
       | integrate with Jupyter more, eg use their existing kernel web
       | sockets infrastructure. In my mind, one would be able to zoom out
       | and see a graph view where hovering over a node would show its
       | corresponding cell with content / output. Each node might be
       | coloured according to execution status. That said, I'm not a UI
       | expert and I never got around to it. So thanks for your efforts,
       | I'll definitely give it a spin.
        
         | akshayka wrote:
         | That sounds really cool! marimo has a dependency graph viewer
         | built-in, but we could definitely improve it. Coloring nodes by
         | execution status, and annotating cells with their variable
         | defs/refs, would be great quality-of-life improvements.
        
       | simonw wrote:
       | This is amazing. I'm a big user of both Jupyter notebooks and
       | Observable notebooks (https://observablehq.com/) and the thing I
       | miss most from Observable when I'm using Jupyter is cell
       | reactivity.
       | 
       | You've solved that incredibly well!
       | 
       | I also really like that the Marimo file format is just Python.
       | Here's an example saved file from playing around with the intro:
       | https://gist.github.com/simonw/e6e6e4b45d1bed9fc1482412743b8...
       | 
       | Nice that it's Apache 2 licensed too.
       | 
       | Wow, I just found the GitHub Copilot feature too!
        
         | mscolnick wrote:
         | Myles here (other core contributor) -
         | 
         | We are thrilled to see you have such a strong positive
         | reaction. It means a lot coming from you - I initially learned
         | web development using Django and landed my first contracting
         | gig with Django.
         | 
         | I drifted away from writing Python and towards Typescript - but
         | marimo has brought me back to writing Python.
        
           | arthurwu wrote:
           | let's go!! so excited to see this get deserved attention
        
       | zengid wrote:
       | Very cool! This is something Jack Rusher cries for in his talk
       | "Stop Writing Dead Programs"
       | https://www.youtube.com/watch?v=8Ab3ArE8W3s
        
       | elijahbenizzy wrote:
       | You've built observable but for python. Love it!
        
       | esafak wrote:
       | Could this be used with MDX or something to embed interactive
       | examples in documentation? That is an underserved use case.
        
         | mscolnick wrote:
         | It is not possible at the moment (we use iframes in our
         | documentation), but once we support WASM, it should be
         | possible.
        
       | rurban wrote:
       | I'll definitely try it out tomorrow! Could fix a lot of problems
       | with my current project.
        
       | Beefin wrote:
       | we use the jupyter-server kernel gateway api at https://nux.ai
       | would love to explore using marimo's API for code execution
        
       | smacke wrote:
       | I'm a big fan of Marimo (and of Akshay and Myles in particular);
       | it's great to finally see a viable competitor to Jupyter as it
       | can only mean good things for the ecosystem of scientific tooling
       | as a whole.
        
       | j0e1 wrote:
       | This is a welcome alternative to Jupyter Notebooks/lab- great
       | work! One thing that would be nice is an ability to see previews
       | on GitHub of the Marimo notebook (like Jupyter Notebook). I am
       | not sure if this is possible given you would have to run the code
       | to see the output.
        
       | carterschonwald wrote:
       | Awesome! I've been wanting this sort of thing for a long time.
       | But I've only been aware of the Julia tool pluto
        
       | krawczstef wrote:
       | how do you read the resulting python files? That's what I'm
       | struggling with -- but I guess the point is you don't read them,
       | you use marimo for that?
        
         | akshayka wrote:
         | Thanks for the question. Each cell is represented as a function
         | that maps its referenced variables to the variables it defines.
         | Cells are sorted in the order they appear on the notebook page.
         | 
         | If you run `marimo tutorial fileformat`, that'll open a
         | tutorial notebook that explains the fileformat in some detail.
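
The idea of a cell as a function from the variables it references to the variables it defines can be sketched in plain Python. This is not marimo's actual file format or API (the `marimo tutorial fileformat` notebook mentioned above is the authoritative description); the cell names and the tiny `run_cells` runner are hypothetical, and the runner assumes the cells are already listed in a valid dependency order.

```python
import inspect


def cell_load():
    data = [3, 1, 2]
    return {"data": data}


def cell_sort(data):
    ordered = sorted(data)
    return {"ordered": ordered}


def cell_report(ordered):
    summary = f"min={ordered[0]}, max={ordered[-1]}"
    return {"summary": summary}


def run_cells(cells):
    """Run cells in the listed order, passing each one the variables it names."""
    namespace = {}
    for cell in cells:
        # A cell's parameters are the variables it references.
        refs = inspect.signature(cell).parameters
        namespace.update(cell(**{name: namespace[name] for name in refs}))
    return namespace


state = run_cells([cell_load, cell_sort, cell_report])
print(state["summary"])  # min=1, max=3
```

Because each function's parameters and return value spell out its inputs and outputs, the dependencies between cells are readable straight from the file.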
        
       | aqader wrote:
       | this is really cool, can't wait to try it out for some ML
       | pipeline development. kudos myles and akshay!
        
       | Micoloth wrote:
       | Wow.. Really great work, _finally_ someone is doing it!
       | 
       | Since I've thought about this for a long time (I've actually even
       | made a very simplified version last year [1]), I want to
       | contribute a few thoughts:
       | 
       | - Cool that you have a VS Code extension, but I was a little
       | disappointed that it opens a full browser view instead of using
       | the existing, good Notebook interface of VS Code. (I get that you
       | want to show the whole frontend, but I'd love to be able to run
       | the reactive kernel within the full VS Code ecosystem. The
       | included GitHub Copilot is cool, but that's not all.)
       | 
       | - As other comments said, if you want to go for reproducibility,
       | the part about Package Management is very important. And it's
       | also mostly solved, with Poetry etc...
       | 
       | - If you want to go for easy deployment of the NB code to
       | Production, another very cool feature would be to extract (as a
       | script) all the code needed to produce a given cell of output!
       | This should be very easy since you already have the DAG.. It
       | actually even existed at some point in VSCode Python extension,
       | then they removed it
       | 
       | Again, great job
       | 
       | [1] https://github.com/micoloth/vscode-reactive-jupyter
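
The third suggestion above (extract, as a script, all the code needed to produce one cell's output) amounts to collecting a cell's ancestors in the dependency graph. A minimal sketch, assuming the graph is already available as a mapping from each cell to the cells it depends on and that the cells are stored in a valid execution order; the names `ancestors` and `extract_script` and the example notebook are hypothetical.

```python
def ancestors(graph: dict[str, set[str]], target: str) -> set[str]:
    """Return every cell that the target depends on, directly or indirectly."""
    seen = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for parent in graph.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def extract_script(cells: dict[str, str], graph: dict[str, set[str]], target: str) -> str:
    """Join the source of the target cell and its ancestors, in stored order."""
    needed = ancestors(graph, target) | {target}
    return "\n".join(src for name, src in cells.items() if name in needed)


# Hypothetical notebook: cell sources plus the cells each one depends on.
cells = {
    "load": "x = 1",
    "unrelated": "u = 0",
    "double": "y = x * 2",
    "show": "print(y)",
}
graph = {"load": set(), "unrelated": set(), "double": {"load"}, "show": {"double"}}

print(extract_script(cells, graph, "show"))
```

The `unrelated` cell is left out of the result because nothing on the path to `show` depends on it.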
        
       | exe34 wrote:
       | That's amazing! Can I edit it in another editor, save the file
       | and have it updated live in the browser notebook? Or does it have
       | to recompute everything?
        
       ___________________________________________________________________
       (page generated 2024-01-12 23:00 UTC)