[HN Gopher] Reverse Engineering Google Colab
___________________________________________________________________
Reverse Engineering Google Colab
Author : arjvik
Score : 98 points
Date : 2022-06-23 16:03 UTC (6 hours ago)
(HTM) web link (dagshub.com)
(TXT) w3m dump (dagshub.com)
| a-dub wrote:
| https://github.com/singhsidhukuldeep/Google-Colab-Shell
|
| pops a terminal inline in the colab notebook on the backing vm.
| super useful if you get tired of having to shell execute all the
| time via the cell interface.
| datageddon wrote:
| There's also colab-ssh [1] that sets up an SSH tunnel (through
| cloudflared) and allows you to connect from your ssh client in
| your own terminal.
|
| [1] https://github.com/WassimBenzarti/colab-ssh
| BrianHenryIE wrote:
| There's an active effort to (again) implement Swift on Colab:
|
| https://github.com/philipturner/swift-colab/
| [deleted]
| cperry wrote:
| Impressive work.
|
| Just came here to note that we read all of our in-product
| feedback submissions as well as GitHub issues:
| https://github.com/googlecolab/colabtools/issues
|
| If you've got feature requests or encounter bugs we appreciate
| you filing!
| DiogenesKynikos wrote:
| Question: Why does Google not allow children to use Colab?
|
| I can imagine plenty of teenagers interested in programming
| would like to tinker on Colab. However, Google restricts the
| service to people 18 and above.
| cperry wrote:
| Where are you seeing 18+ restrictions? I went through a lot
| last year to get us approved for 13+ so we'd be good at least
| down to middle school ish.
| nalzok wrote:
| Do you have a plan to expose some high-level API endpoints? I
| have been dreaming about something like
| `run.colab.research.google.com/<notebook_url>?runtime=gpu`
| which executes a Colab notebook without human interference.
| This can be extremely helpful in CI/CD environments when you
| have a lot of notebooks to test, e.g. for
| https://github.com/probml/pyprobml.
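|
| A rough local stand-in for the CI use case above, not a Colab API: the
| `run.colab.research.google.com` endpoint is only a wish in this comment,
| and nothing below talks to Colab or a GPU runtime. The sketch assumes an
| nbformat-4 style `.ipynb` (JSON with a `cells` list), runs the code cells
| in order with plain `exec`, and skips `!`/`%` lines that only IPython
| understands. `run_notebook_cells` and `run_notebook_file` are hypothetical
| helper names.

```python
import json


def run_notebook_cells(notebook: dict) -> dict:
    """Execute every code cell of an nbformat-4 style notebook dict, in order.

    Returns the shared namespace so a CI job can assert on the results.
    """
    namespace: dict = {}
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # markdown/raw cells have nothing to run
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be a string or a list of lines
            source = "".join(source)
        # Drop IPython-only lines (shell escapes and magics); plain exec() cannot run them.
        runnable = "\n".join(
            line
            for line in source.splitlines()
            if not line.lstrip().startswith(("!", "%"))
        )
        try:
            exec(compile(runnable, f"<cell {index}>", "exec"), namespace)
        except Exception as exc:
            # Surface which cell broke so the CI log is actionable.
            raise RuntimeError(f"cell {index} failed: {exc!r}") from exc
    return namespace


def run_notebook_file(path: str) -> dict:
    """Load a .ipynb file from disk and run its code cells."""
    with open(path, encoding="utf-8") as handle:
        return run_notebook_cells(json.load(handle))
```

| In practice a tool such as nbclient or papermill would run the notebook
| against a real kernel; this only shows the shape of a headless run.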
| elashri wrote:
| Colab by design is made to be interactive. They even
| introduced CAPTCHA to make sure you don't train long models
| and go do something else.
| cperry wrote:
| No plans at this time; we try to prioritize interactive
| compute features. But this would be really cool to do! Maybe
| in the future.
| sillysaurusx wrote:
| It's about a bazillion times easier to reverse engineer colab if
| you just SSH into it. You can set up a reverse proxy. I used
| ngrok back in the day, but maybe they blocked it.
|
| The most interesting thing was a custom binary that mounts your
| Google drive as a folder. I was able to copy it off colab and use
| it on my own Linux boxes, which was handy in an "oh neat, lookie
| there" kind of way. I assume it'll break whenever they update
| their api, but you'd still be able to just grab the new binary
| from a random colab instance.
|
| There's also a custom script they run to set up everything, using
| Node. It spawns a bunch of stuff that I've forgotten. (It was
| 2019 when I was poking around, and a pandemic has a nice way of
| wiping one's memory of ye olden hacking days. Still a bit sad I
| never got to go to the tensorflow conference.)
|
| Anyway just ssh in and ls -la / and you'll see one or two
| interesting folders. You can rsync them down to your box and
| examine at your leisure.
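|
| The "ls -la / and look for interesting folders" step can also be done
| from a notebook cell. A minimal sketch, assuming that "interesting" means
| "not one of the usual top-level Linux directories"; the
| `COMMON_ROOT_ENTRIES` set and the `unusual_root_entries` helper are
| illustrative assumptions, not anything Colab ships.

```python
import os

# Assumption: top-level names found on a typical Debian/Ubuntu root filesystem.
COMMON_ROOT_ENTRIES = frozenset({
    "bin", "boot", "dev", "etc", "home", "lib", "lib32", "lib64", "libx32",
    "media", "mnt", "opt", "proc", "root", "run", "sbin", "srv", "sys",
    "tmp", "usr", "var",
})


def unusual_root_entries(root: str = "/") -> list[str]:
    """Return sorted names directly under `root` that are not in COMMON_ROOT_ENTRIES.

    Directories get a trailing "/" so they stand out from plain files.
    """
    with os.scandir(root) as entries:
        return sorted(
            entry.name + ("/" if entry.is_dir(follow_symlinks=False) else "")
            for entry in entries
            if entry.name not in COMMON_ROOT_ENTRIES
        )
```

| Calling `unusual_root_entries()` on an instance lists whatever sits next
| to the standard directories; those are the candidates to copy down and
| inspect.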
| sva_ wrote:
| It should be noted that it is against their rules and you might
| get worse instances if they somehow detect it
|
| https://research.google.com/colaboratory/faq.html#limitation...
| teruakohatu wrote:
| Doesn't Pro allow SSH?
| sva_ wrote:
| I looked at the license agreement, and it says under "5.
| Restrictions"
|
| > circumvent, reverse-engineer, modify, disable, or
| otherwise tamper with any security technology that Google
| uses to protect the Paid Service or encourage or help
| anyone else to do so;
|
| > access the Paid Service other than by means authorized by
| Google; or
|
| I'm not sure what exactly they mean by "means authorized by
| Google".
|
| https://colab.research.google.com/pro/terms/v1
| sillysaurusx wrote:
| I think they're just trying to fight abuse. You can do
| everything from colab that you can do from ssh anyway. It's
| just faster to enter commands.
|
| Good catch though. I didn't know that.
|
| When I originally figured out how to ssh in, I kept it a
| secret figuring that it'd be a matter of time till they
| clamped down. Guess it took a few years, or I just missed it.
| Bunch of us in the ML scene used to do it regularly, since
| it's way easier to monitor a training run via tmux.
| RyEgswuCsn wrote:
| I think they do shut you out if you try to spin up any process
| through "unauthorised" means. There have been many projects that
| offer automated setup of SSH/VNC/VSCode on a colab
| instance, and my experience has been that colab somehow
| manages to shut off the connections soon after I
| start them.
| dinvlad wrote:
| I would imagine their threat model pretty much assumes anyone
| can do anything on that host :-)
| minimaxir wrote:
| > However, it's incredibly difficult to harness the compute power
| of Colab for anything beyond Jupyter notebooks. For Machine
| Learning engineers that want to productionize their models and
| bring them out of the notebook stage, this is a particularly
| relevant issue; notebooks, while perfect for exploration, don't
| play well with more advanced MLOps tools that codify the training
| process into a formal pipeline.
|
| That isn't what Colab is intended for. Google has better and more
| productive tools for companies who can foot the bill, which is
| getting cheaper over time.
|
| AI Notebooks behave the same in practice as Google Colab with
| one-click on/off for model testing + JupyterLab. If you want to
| minimize costs via spot instances, you can deploy a Compute
| Engine instance with the Deep Learning VM image, which also includes a
| running JupyterLab on launch if you need to use that workflow, and
| also saves time by including your framework of choice. A spot VM
| with a T4 GPU is about $0.18/hour.
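|
| For a sense of scale, the hourly figure above turns into a per-run
| estimate with simple multiplication. A minimal sketch, assuming the
| quoted $0.18/hour rate holds for the whole run and ignoring disk and
| network charges; `spot_cost_usd` is a hypothetical helper.

```python
# Rate quoted in the comment above; real spot prices vary by region and over time.
T4_SPOT_USD_PER_HOUR = 0.18


def spot_cost_usd(hours: float, rate: float = T4_SPOT_USD_PER_HOUR) -> float:
    """Estimated GPU VM cost in USD for a run of `hours`, rounded to cents."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return round(hours * rate, 2)
```

| At that rate a 24-hour training run comes to about $4.32.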
___________________________________________________________________
(page generated 2022-06-23 23:01 UTC)