[HN Gopher] Launch HN: Moonglow (YC S24) - Serverless Jupyter Notebooks
       ___________________________________________________________________
        
       Launch HN: Moonglow (YC S24) - Serverless Jupyter Notebooks
        
       Hi Hacker News! We're Leila and Trevor from Moonglow
       (https://moonglow.ai). We let you run local Jupyter notebooks on
       remote cloud GPUs. Here's a quick demo video:
       https://www.youtube.com/watch?v=Bf-xTsDT5FQ. If you want to try it
       out directly, there are instructions below.  With Moonglow, you can
       start and stop pre-configured remote cloud machines within VSCode;
       Moonglow makes those servers appear to VSCode as normal Jupyter
       kernels that you can connect your notebook to.  We built this
       because we learned from talking to data scientists and ML
       researchers that scaling up experiments is hard. Most researchers
       like to start in a Jupyter notebook, but they rapidly hit a wall
       when they need to scale up to more powerful compute resources. To
       do so, they need to spin up a remote machine and then also start a
       Jupyter server on it so they can access it from their laptop. To
       avoid wasting compute resources, they might end up setting this up
       and tearing it down multiple times a day.  When Trevor used to do
       ML research at Stanford, he faced this exact problem: often, he
       needed to move between cloud providers to find GPU availability.
       This meant he was constantly clicking through various cloud compute
       UIs, as well as copying both notebook files and data across
       different providers over and over again.  Our goal with Moonglow is
       to make it easy to transfer your dev environment from your local
       machine to your cloud GPU. If you've used Google Colab, you've seen
       how easy it is to switch from a CPU to a GPU - we want to bring
       that experience to VSCode and Cursor.  If you're curious, here's
       some background on how it works. You can model a local Jupyter
       setup as having three parts: a frontend (also known as a
       notebook), a server, and an underlying backend kernel. The frontend
       is where you enter code into your notebook, the kernel is what
       actually executes it, and the server in the middle is responsible
       for spinning up and restarting kernels. Moonglow is a rewrite of
       this middle server part: where an ordinary Jupyter server would
       just start and stop kernels, we've added extra orchestration that
       provisions a machine from your cloud, starts a kernel on it, then
       sets up a tunnel between you and that kernel.  In the demo video
       (https://www.youtube.com/watch?v=Bf-xTsDT5FQ), you can see Trevor
       demonstrate how he uses Moonglow to train a ResNet to 94% accuracy
       on the CIFAR-10 classification benchmark in 1m16s of wall clock
       time. (In fact, it only takes 5 seconds of H100 time; the rest of
       it is all setup.)  On privacy: we tunnel your code and notebook
       output through our servers. We don't store or log this data if you
       are bringing your own compute provider. However, we do monitor it
       if you are using compute provided by us, to make sure that what you
       are running doesn't break our compute vendor's terms of service.
       We currently aren't charging individuals for Moonglow. When we do,
       we plan to price individuals a reasonable amount per-seat, and we
       have a business plan for teams with more requirements.  Right now,
       we support Runpod and AWS. We'll add support for GCP and Azure
       soon, too. (If you'd like to use us to connect to your own AWS
       account, please email me at trevor@moonglow.ai.)  For today's
       launch on HN only, you can get a free API key at
       https://app.moonglow.ai/hn-launch. You don't need to sign in and
       you don't need to bring your own compute; we'll let you run it on
       servers we provide. This API key will give you enough credit to run
       Moonglow with an A40 for an hour.  If you're signed in, you won't
       be able to see the free credits page, but free credits will be
       added to your account automatically.  We're still very
       early, and there are a lot of features we'd still like to add, but
       we'd love to get your feedback on this. We look forward to your
       comments!
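The three-part model described above is concrete in Jupyter's connection
files: the server writes one per kernel, and the frontend uses it to reach
the kernel's ZeroMQ channels. A remote orchestration layer only needs to
produce the same structure pointing at a tunnel endpoint. A sketch (field
names follow Jupyter's connection-file format; `make_connection_info` is a
hypothetical helper, not Moonglow's code):

```python
import json

def make_connection_info(ip, base_port, key):
    """Build a Jupyter-style kernel connection file.

    A local Jupyter server writes one of these for each kernel it
    starts; an orchestration layer tunneling to a remote kernel can
    hand the frontend the same structure, with `ip` pointing at the
    local end of the tunnel instead of at localhost.
    """
    return {
        "transport": "tcp",
        "ip": ip,
        # The five ZeroMQ channels every Jupyter kernel exposes.
        "shell_port": base_port,
        "iopub_port": base_port + 1,
        "stdin_port": base_port + 2,
        "control_port": base_port + 3,
        "hb_port": base_port + 4,
        # HMAC key used to sign messages between frontend and kernel.
        "key": key,
        "signature_scheme": "hmac-sha256",
    }

info = make_connection_info("127.0.0.1", 51000, "secret-hmac-key")
print(json.dumps(info, indent=2))
```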
        
       Author : tmychow
       Score  : 71 points
       Date   : 2024-08-23 15:20 UTC (7 hours ago)
        
       | williamstein wrote:
       | How do you deal with the filesystem? Eg do you make the local
       | file system visible to the remote kernel somehow?
        
         | bblcla wrote:
         | We don't right now, but it's something a lot of people have
         | asked for, so we're rolling out a file sync feature next week.
         | (It will basically be a nice wrapper over rsync.)
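A thin wrapper over rsync of the kind described might be sketched like
this (flag choices are illustrative, not Moonglow's actual
implementation):

```python
import subprocess

def build_rsync_command(local_dir, remote_host, remote_dir, excludes=()):
    # -a preserves permissions/times, -z compresses, --delete mirrors
    # deletions so the remote stays an exact copy of the local tree.
    cmd = ["rsync", "-az", "--delete"]
    for pattern in excludes:
        cmd += ["--exclude", pattern]
    # Trailing slashes sync directory *contents* rather than the
    # directory itself.
    cmd += [f"{local_dir}/", f"{remote_host}:{remote_dir}/"]
    return cmd

def sync(local_dir, remote_host, remote_dir):
    # Run one sync pass; a real tool would watch the filesystem and
    # re-run this on change.
    subprocess.run(
        build_rsync_command(local_dir, remote_host, remote_dir,
                            excludes=(".git", "__pycache__")),
        check=True,
    )

print(build_rsync_command("notebooks", "gpu-box", "/workspace"))
```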
        
       | daft_pink wrote:
       | Will this work with zeds repl system that I believe uses Jupyter
       | kernel spec?
       | 
       | I really like their Jupyter repl format because it separates the
       | cells with python comments so it's much easier to deploy your
       | code when you are done versus a notebook.
        
         | bblcla wrote:
         | Good question! I'm not too familiar with Zed, but here's my
         | high-level guess from reading the website: we don't currently
         | integrate with Zed, but we probably could if it supports remote
         | kernels (the docs I found at https://zed.dev/docs/repl weren't
         | specific about it).
         | 
         | One nice thing about our VSCode extension is that it's not just
         | a remote kernel - our extension also lets you see what kernels
         | you have and other details, so we'd need to write something
         | like it for Zed. We probably wouldn't do this unless there's a
         | lot of demand.
         | 
         | By the way, VSCode also supports the # %% repl spec and
         | Moonglow does work with that (though we haven't optimized for
         | it as much as we've optimized for the notebooks).
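For readers unfamiliar with it, the `# %%` format is a plain Python file
whose cells are delimited by comment markers; VSCode renders each marker
as a runnable cell. A minimal illustrative splitter (not VSCode's own
logic):

```python
SCRIPT = """\
# %% [markdown]
# # Training run

# %%
import math
lr = 3e-4

# %%
print(math.log10(lr))
"""

def split_cells(source):
    # Each "# %%" line starts a new cell; text before the first
    # marker (if any) forms an implicit leading cell.
    cells, current = [], []
    for line in source.splitlines():
        if line.startswith("# %%"):
            if current:
                cells.append("\n".join(current))
            current = [line]
        else:
            current.append(line)
    if current:
        cells.append("\n".join(current))
    return cells

print(len(split_cells(SCRIPT)))  # 3
```

Because the file is still valid Python, it runs top to bottom with the
plain interpreter, which is what makes this format easy to deploy.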
        
           | daft_pink wrote:
           | My impression is that if it shows up by running the command
           | Jupyter kernelspec list then it will work in zed out of the
           | box. Does it show up on this list?
        
             | bblcla wrote:
              | I don't think so - that jupyter command only lists
              | kernelspecs registered locally. We register the Moonglow
              | remote servers with VSCode through the extension, and my
              | guess is we'd need to do something similar with Zed.
        
       | randomcatuser wrote:
       | Cool! Optimal UX!
       | 
       | What do you think of this compared with running a Jupyter server
       | on Modal? (I think Modal is _slightly_ harder, ie, you run a
       | terminal command, but curious!)
       | https://modal.com/docs/guide/notebooks
        
         | tmychow wrote:
         | It seems like there are two ways of using Jupyter with Modal.
         | One is adding Modal decorators to specific functions within a
         | notebook, and for that use case, I think Modal is fantastic (if
         | you are using GPUs on a function-by-function basis)!
         | 
         | On the other hand, if you are trying to run an entire notebook
         | on a remote machine by starting a Jupyter server with Modal,
         | then the workflow with Modal is not that different from other
         | clouds (e.g. you can start an EC2 instance and run a Jupyter
         | server there). For that, Moonglow still makes it easier by
         | letting you stay in your IDE and avoid juggling Jupyter server
         | URLs.
         | 
         | Also, you might need to use a specific cloud e.g. if you have
         | cloud credits, sensitive data that needs to stay on that cloud
         | or just expensive egress fees. One of Moonglow's strengths is
         | that you can do your work in that cloud, rather than having to
         | move stuff around.
        
       | CuriouslyC wrote:
       | Good idea and fun name (even if it doesn't tell me anything about
       | the product). I don't see your path to a billion dollar market
       | like YC usually expects to see though, do you have a plan there?
        
         | bblcla wrote:
         | Thanks!
         | 
         | One thing I've found while working in the ML space is that it
         | seems like ML researchers have to deal with a lot of systems
          | cruft. I think that in the limit, ML researchers basically
          | only care about having a few things set up well:
         | 
         | - secrets and environment management
         | 
         | - making sure their dependencies are installed
         | 
         | - efficient access to their data
         | 
         | - quick access to their code
         | 
         | - using expensive compute efficiently
         | 
         | But to get all this set up for their research they need to wade
         | through a ton of documentation about git, bash, docker
         | containers, mountpoints, availability zones, cluster management
         | and other low-level systems topics.
         | 
         | I think there's space for something like Replit or Vercel for
         | ML researchers, and Moonglow is a (very early!) attempt at
         | creating something like it.
        
       | aresant wrote:
       | It's inevitable that access to the actual GPU compute winds up as
       | an API layer vs captive behind proprietary systems and interfaces
       | as the market matures
       | 
       | This is brilliant and "obvious" in a good way along those lines,
       | congrats on the launch!
        
         | tmychow wrote:
         | Thanks! When I was doing ML research, every moment that my GPU
         | setup was top of mind was a point of frustration, so hopefully
         | we're moving the dial a bit towards making compute an
         | abstraction you don't have to worry about.
        
       | sidcool wrote:
       | Congrats on launching. How does it compare to Google colab?
        
         | bblcla wrote:
         | Thanks!
         | 
         | The big difference is that Google Colab runs in your web
         | browser, whereas Moonglow lets you connect to compute in the
         | VSCode/Cursor notebook interface. We've found a lot of people
         | really like the code-completion in VSCode/Cursor and want to be
         | able to access it while writing notebook code.
         | 
         | Colab only lets you connect to compute provided by Google. For
         | instance, even Colab Pro doesn't offer H100s, whereas you can
         | get that pretty easily on Runpod.
        
           | williamstein wrote:
           | > Colab only lets you connect to compute provided by Google.
           | 
           | That is no longer true - you can use remote kernels on your
           | own compute via colab:
           | https://research.google.com/colaboratory/local-runtimes.html
           | 
           | There is also the same feature in CoCalc, including using the
           | official colab Docker image:
           | https://doc.cocalc.com/compute_server.html#onprem
           | 
           | Cocalc also supports 1 click use of vscode.
           | 
           | (The above might not work with runpod, since their execution
           | environment is locked down. However it works with other
           | clouds like Lambda, Hyperstack, etc.)
        
             | bblcla wrote:
             | Ah, yeah, I misspoke, sorry. I was aware of that feature,
             | but everyone I've talked to said it's so annoying to use
             | they basically never use it, so I didn't think it was worth
             | mentioning.
             | 
              | The big reason it's annoying is that (I believe) Colab
              | still only lets you connect to runtimes running on _your_
              | computer - which is why, at the end of that article, they
              | suggest using SSH port forwarding if you want to connect
              | to a remote cluster. I know at least one company has
              | written a hacky wrapper that researchers can use to
              | connect to their own cluster through Colab, but it's not
              | ideal.
             | 
             | I think Moonglow's target audience is slightly different
             | than Colab's though because of the tight VSCode/Cursor
             | integration - many people we've talked to said they really
             | value the code-complete, which you can't get in any web
             | frontend!
        
               | elashri wrote:
               | > I think Moonglow's target audience is slightly
               | different than Colab's though because of the tight
               | VSCode/Cursor integration - many people we've talked to
               | said they really value the code-complete, which you can't
               | get in any web frontend!
               | 
               | At the risk of repeating the famous Dropbox comment
               | 
                | I like the idea, and ease of use is your selling
                | point. But I don't know if that is actually reason
                | enough. People who are that entrenched in the VSCode
                | ecosystem wouldn't find it a problem to deploy a
                | dockerized Nvidia GPU container and connect to their
                | own compute instance via remote/tunnel plugins on
                | VSCode, which one can argue makes more sense.
               | 
               | Congratulations on the launch and good luck with the
               | product.
        
               | tmychow wrote:
               | Thanks! I think the "deploy and connect" workflow is
               | itself not super painful, but even if you're invested in
               | VSCode, doing that again and again every day is pretty
               | annoying (and it certainly was for me when I used to do
               | ML), so hopefully the ease of use is valuable for people.
        
               | dovholuknf wrote:
               | Interesting idea. I'm not very well-versed in training
               | models or LLMs or even Jupyter Notebooks, but the comment
               | about port forwarding SSH caught my eye since I work on a
               | free, open source zero-trust overlay network (OpenZiti).
               | I tried to find some information about moonglow under the
               | hood / how it worked but didn't succeed.
               | 
               | If you're interested, you might find embedding OpenZiti
               | into Moonglow a pretty compelling alternative to port
               | forwarding and it might open even crazier ideas once your
                | connectivity is embedded into the app. You can find the
               | cheapest compute for people and just connect them to that
               | cheapest compute using your extension... Might be
               | interesting? Anyway, I'd be happy to discuss some time if
               | that sounds neat... Until then, good luck with your
               | launch!
        
               | williamstein wrote:
               | Is it possible to use OpenZiti with Runpod? Their
               | execution environment is very locked down, which might
               | make ssh the only option.
        
               | dovholuknf wrote:
               | I don't actually know. I'll go poke with Runpod for a few
               | and see :)
        
               | dovholuknf wrote:
               | I poked at it a bit but there was no free trial period. I
               | know a bunch of people are using OpenZiti and zrok for
               | Jupyter notebooks in general... Here's a blog I saw not
               | long back that might help but I wasn't able to
               | prove/test/try it... (sorry)
               | 
               | https://www.pogs.cafe/software/tunneling-sagemaker-kaggle
        
               | qrkourier wrote:
                | At a glance, RunPod's serverless and pod options
               | would probably work well with OpenZiti. I didn't explore
               | their vLLM option.
               | 
               | Using OpenZiti w/ Serverless probably means integrating
               | an OpenZiti SDK with your serverless application. That
               | way, it'll connect to the OpenZiti network every time it
               | spawns.
               | 
               | The SDK option works anywhere you can deploy your
               | application because it doesn't need any sidecar, agent,
               | proxy, etc, so it's definitely the most flexible and I
               | can give you some examples if you mention the language or
               | framework you're using.
               | 
               | The pod option says "container based" so it'll take some
               | investigation to find out if an OpenZiti sidecar or other
               | tunneling proxy is an option. Would you be looking to
               | publish something running in RunPod (the server is in
               | RunPod), or access something elsewhere from a RunPod pod
               | (the client is in RunPod), or both?
        
               | bblcla wrote:
               | Cool! We actually don't do port forwarding over SSH, we
               | do it over an ngrok-like solution that we
               | forked/modified. I looked at a few options while we were
               | designing this, including Tailscale and ngrok, but none
               | of them exactly suited our needs, and the pricing would
               | have been prohibitive for something that's a pretty core
               | part of our product.
               | 
               | OpenZiti looks really cool though - I'll take a look!
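For context, the core of any such tunnel - SSH forwarding, ngrok, or a
fork like the one described - is a bidirectional byte relay between a
local port and the remote kernel. A toy single-connection sketch in
Python (real tunnels add TLS, auth, multiplexing, and reconnection):

```python
import socket
import threading

def pipe(src, dst):
    # Copy bytes one way until the source side closes.
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    finally:
        dst.close()

def forward(local_port, remote_host, remote_port):
    # Accept one local connection and relay it to the remote
    # endpoint in both directions.
    listener = socket.socket()
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", local_port))
    listener.listen(1)
    client, _ = listener.accept()
    upstream = socket.create_connection((remote_host, remote_port))
    threading.Thread(target=pipe, args=(client, upstream),
                     daemon=True).start()
    pipe(upstream, client)
```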
        
       | AnotherGoodName wrote:
       | We can literally transpile and run everything client side these
       | days. You can run a recompiled quake 3 inside your browser. Why
       | is a hosted notebook anything other than a static html+js+css
       | site that runs behind a cdn (effectively free to host)?
       | 
       | I suspect that's a matter of time right?
        
         | AnotherGoodName wrote:
         | In fact a quick google: https://github.com/jtpio/jupyterlite
         | 
         | Yay I really can have serverless notebooks! Not just an easy to
         | manage server environment but literally a static html file that
         | can be passed around and runs the full notebook environment.
         | It's weird it was ever done any other way.
        
           | switchbak wrote:
           | Does that run with your GPU in the way this service offers?
        
             | tomjakubowski wrote:
             | No, it doesn't. But unlike the marketing buzzword
             | "serverless", it is honest about not requiring a server.
             | 
             | Pyodide GPU support is a ways away, but it is theoretically
             | possible once WebGPU is stable.
             | https://github.com/pyodide/pyodide/issues/1911
        
         | dinobones wrote:
          | I too share the dream of transpiling everything to
          | WebAssembly so everything can run everywhere, but we're
          | still pretty far off.
         | 
         | For a lot of ML/AI workloads and tasks, Python is just a
         | binding for underlying C/C++.
         | 
         | It's already a nightmare to try to reproduce any ML/AI paper,
         | pip breaks 3 times, incompatible peer deps, some obscure
         | library emits an obscure CLANG error that means I need to brew
         | install some libwhatever, etc...
         | 
         | I don't think the WebAssembly toolchain is quite ready for plug
         | and play "pip install" time yet. I hope it eventually will be
         | though.
        
           | OvbiousError wrote:
            | The performance of browser wasm is also nowhere near that
            | of the same code compiled into a native binary.
        
         | bblcla wrote:
         | A lot of the people we've talked to who get the most value out
         | of remote compute are doing really intensive stuff - they need
         | server-level resources far beyond what you can find on a
         | consumer laptop!
         | 
         | Hopefully someday you'll have 8 H100s on your Macbook, but I
         | think we're still a long way away from that.
        
       | mistrial9 wrote:
       | "Don't fork the Notebook format" --Fernando
        
       | xra_11 wrote:
       | Congrats on the launch!
        
         | tmychow wrote:
         | Thank you!
        
       | whinvik wrote:
       | Feels a bit like what Databricks already does with Dbx etc.
        
         | bblcla wrote:
         | I'm not super familiar with dbx (though its docs at
         | https://docs.databricks.com/en/archive/dev-tools/dbx/dbx.htm...
         | suggest it's deprecated).
         | 
         | However, looking at its replacement here
         | (https://docs.databricks.com/en/dev-tools/bundles/index.html) -
         | I think we're trying to solve the same problems at different
         | levels. My guess is Databricks is the right solution for big
         | teams that need well-defined staging/prod/dev environment.
         | We're targeting smaller teams that might be doing more of their
         | own devops or are still at the 'using a bash script to run
         | notebooks remotely' stage.
        
       | yanniszark wrote:
       | Great work! Was wondering if you deal with transferring the
       | python environment remotely. Usually a large part of the
       | difficulty is dealing with dependencies.
        
         | bblcla wrote:
         | We make sure the remote containers have
         | CUDA/Pytorch/Numpy/Matplotlib set up if you're using a GPU-
         | based machine. It's actually far easier for me to run ML-based
         | code through Moonglow now than on my Macbook - it's really nice
         | to start with a clean environment every time, instead of having
         | to deal with dependency hell.
         | 
         | We don't yet transfer the python environment on the self-serve
         | options, though for customers on AWS we'll help them create and
         | maintain images with the packages they need.
         | 
         | I do have some ideas for making it easy to transfer
         | environments over - it would probably involve letting people
         | specify a requirements.txt and some apt dependencies and then
         | automatically creating/deploying containers around that. Your
         | idea of actually just detecting what's installed locally is
         | pretty neat too, though.
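The requirements.txt idea above might boil down to rendering a container
spec, roughly like this (purely a sketch, not Moonglow's actual
implementation; the base image is an assumption, and CUDA runtime images
may need Python/pip installed separately):

```python
def render_dockerfile(requirements_path, apt_packages=()):
    """Generate a Dockerfile that layers apt and pip dependencies
    on top of a CUDA base image. Purely illustrative."""
    lines = ["FROM nvidia/cuda:12.1.0-runtime-ubuntu22.04"]
    if apt_packages:
        # System-level dependencies go in first so pip packages that
        # need them can build.
        lines.append(
            "RUN apt-get update && apt-get install -y "
            + " ".join(apt_packages)
        )
    lines += [
        f"COPY {requirements_path} /tmp/requirements.txt",
        "RUN pip install -r /tmp/requirements.txt",
    ]
    return "\n".join(lines)

print(render_dockerfile("requirements.txt",
                        apt_packages=("git", "ffmpeg")))
```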
        
       | ayakang31415 wrote:
        | I really like this concept. I SSH to my university HPC to submit
        | Python scripts for ML-related work (sbatch script.py), and
       | sometimes I edit a script with VIM. Now I can use Jupyter Lab on
       | HPC with port forwarding, but it is not as convenient as just
       | running Jupyter lab locally. Does your software have some sort of
       | command line features within Jupyter Notebook that can be run on
       | HPC?
        
         | jerpint wrote:
         | I recommend VScode for that use case (other editors probably
         | support this too) where you can ssh to your cluster, access all
         | your files in the tree, and run Jupyter notebooks from VScode
         | directly using HPC resources, very nice workflow
        
           | samstave wrote:
            | I almost exclusively use VSCode to access remote hosts now.
            | VSCode _is_ my ssh client.
           | 
            | I haven't had a personal jupyter need like this thread is
           | about, yet - but I am a curious mind in search of tools to
           | help me curious harder.
           | 
           | I need to find a project that lets me leverage moonglow.
           | 
           | I hope they got the name MoonGlow from Ultima:
           | 
           | > _Moonglow_
           | 
           | >> _Moonglow is a city dedicated to magic and mystical arts,
            | found on the island of Verity. The principal area of the city
           | is fenced and gated, surrounding a maze from which
           | teleporters connect to the more outlying shops and
           | facilities. One of these, the Encyclopedia Magicka, contains
           | a magical pentagram, which proves to be a further teleporter.
            | Saying, or more discreetly whispering, the password 'Recdu'
           | while standing on this will transport you to the Lost lands,
           | to a second pentagram in the town of Papua. The city's
           | tinkers have no dedicated shop, instead they can be found in
           | the city's bank._
           | 
           | Seems appropriate for this project...
           | 
           | even has a theme song:
           | https://www.youtube.com/watch?v=XMRlwVmetcc
        
           | bblcla wrote:
           | Yeah, I agree! We looked into integrating Moonglow with
           | academic clusters, because many of my ML PhD friends
           | complained about using them. We unfortunately haven't found a
           | good generalized solution, so I think VSCode's remote SSH +
           | manual server management is probably the best option for now.
        
         | joouha wrote:
          | You could try using euporie [1] to run Jupyter notebooks in the
         | terminal
         | 
         | [1] https://github.com/joouha/euporie
        
       | nobarpgp wrote:
       | Congratulations on today's launch. Going to BYOC and test out the
       | API on an A100.
       | 
       | Curious, are the SSH keys stored on Moonglow's internal servers?
        
       ___________________________________________________________________
       (page generated 2024-08-23 23:00 UTC)