[HN Gopher] Show HN: Use your own clusters to train, track, depl...
       ___________________________________________________________________
        
       Show HN: Use your own clusters to train, track, deploy, and monitor
       ML models
        
       Author : Jugurtha
       Score  : 11 points
       Date   : 2021-10-06 19:53 UTC (3 hours ago)
        
 (HTM) web link (iko.ai)
 (TXT) w3m dump (iko.ai)
        
       | Jugurtha wrote:
       | Hi,
       | 
       | iko.ai offers real-time collaborative notebooks to train, track,
       | deploy, and monitor models.
       | 
        | We first built it as an internal platform to make our ML
        | consulting easier and avoid burnout. It enables you to deliver
        | data products faster by reducing technical debt in machine
        | learning projects. It lowers barriers to entry and gives you
        | leverage to do things that typically require a team of experts.
       | 
        | No-setup notebooks
       | 
        | You can start a fresh notebook server from several Docker
        | images that just work. You don't need to troubleshoot
        | environments, deal with NVIDIA drivers, or lose hours or days
        | fixing your laptop. People without strong systems skills become
        | autonomous instead of relying on others to fix their system.
       | 
       | Real-time collaboration
       | 
        | You can share a notebook with other users and edit it
        | collectively, seeing each other's cursors and selections. This
        | is ideal for pair programming, troubleshooting, and tutoring,
        | and a must when working remotely.
        | 
        | Scheduled long-running notebooks
        | 
        | Regular notebooks lose computation results when there is a
        | disconnection, which often happens during long-running training
        | jobs. You can schedule your notebooks in a fire-and-forget
        | fashion, right from the notebook interface, without
        | interrupting your flow, and watch their output stream from
        | other devices.
       | 
       | AppBooks
       | 
        | You can run, or have others run, your notebooks with different
        | values without changing your code. You can click "Publish" and
        | get a parametrized version without editing cell metadata or
        | adding tags. One use case is having a subject-matter expert
        | tweak domain-specific parameters to train a model without being
        | overwhelmed by the code or the notebook interface. It becomes a
        | form on top of the notebook, exposing only the parameters you
        | choose.
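        | A sketch of the idea (illustrative only, not iko.ai's actual
        | implementation): the published form's values override the
        | defaults already defined in the notebook, and nothing else
        | changes.

```python
# Hypothetical sketch of AppBook-style parameterization: the notebook's
# default values are overridden by whatever the user enters in the form.
DEFAULTS = {"learning_rate": 0.1, "n_trees": 100, "region": "EU"}

def run_notebook(overrides):
    """Merge form values over the notebook's defaults, then run."""
    params = {**DEFAULTS, **overrides}
    # ...the notebook's cells would execute with `params` here...
    return params

params = run_notebook({"n_trees": 300})
```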
       | 
       | Bring your own Kubernetes clusters
       | 
        | You can use your own existing Kubernetes clusters from Google,
        | Amazon, Microsoft, or DigitalOcean on iko.ai, which will use
        | them for your notebook servers, your scheduled notebooks, and
        | other workloads. You don't have to worry about metered billing
        | in yet another service. You can control access right from your
        | cloud provider's console, and grant your own customers or users
        | access to your clusters. We also automatically shut down
        | inactive notebooks that have no computation running, to save on
        | cloud spend.
       | 
       | Bring your own S3
       | 
        | You do not need to upload data to iko.ai to start working: you
        | can add an external S3 bucket and access it as if it were a
        | filesystem, as if your files were on disk. This keeps
        | S3-specific code (boto3, tensorflow_io) out of your notebooks
        | and reduces friction. It also ensures people work on the same
        | data, and avoids common errors from local copies that were
        | changed by accident.
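        | The payoff in code shape looks roughly like this (bucket name
        | and mount point below are hypothetical, not iko.ai's actual
        | paths):

```python
# With the bucket mounted as a filesystem, the same loader works for
# local files and S3 objects alike; no boto3 calls in the notebook.
import csv
import io

def load_rows(source):
    """Read CSV rows from a path or any file-like object."""
    if hasattr(source, "read"):
        return list(csv.reader(source))
    with open(source, newline="") as f:
        return list(csv.reader(f))

# Without a mount, you would need S3-specific code, e.g. boto3:
#   body = boto3.client("s3").get_object(Bucket="b", Key="train.csv")["Body"]
#   rows = load_rows(io.StringIO(body.read().decode()))
# With the mount, it is just a path (hypothetical mount point):
#   rows = load_rows("/mnt/my-bucket/train.csv")
rows = load_rows(io.StringIO("a,b\n1,2\n"))
```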
       | 
       | Automatic Experiment Tracking
       | 
        | You can't improve what you can't measure. Model training is an
        | iterative process involving varying approaches, parameters, and
        | hyperparameters, which yield models with different metrics. You
        | can't keep all of this in your head or rely on ad-hoc notes.
        | Your experiments are automatically tracked on iko.ai;
        | everything is saved. This also makes collaboration easier,
        | because team members can compare results and choose the best
        | approach.
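        | Conceptually, tracking boils down to recording each run's
        | parameters and metrics so runs can be compared later. A toy
        | sketch of that idea (not iko.ai's actual API):

```python
# Toy experiment tracker: each run records its parameters and resulting
# metrics, so the best configuration can be recovered later instead of
# being kept in someone's head or in ad-hoc notes.
runs = []

def track_run(params, metrics):
    """Record one training run's configuration and results."""
    runs.append({"params": params, "metrics": metrics})

track_run({"lr": 0.1, "depth": 3}, {"accuracy": 0.81})
track_run({"lr": 0.01, "depth": 5}, {"accuracy": 0.87})

# Anyone on the team can now ask: which run produced the best model?
best = max(runs, key=lambda r: r["metrics"]["accuracy"])
```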
       | 
       | One-click model deployment
       | 
       | You can deploy your model by clicking a button, instead of
       | relying on a colleague to do it for you. You get a REST endpoint
       | to invoke your model with requests, which makes it easy for
       | developers to use these models without knowing anything about
        | machine learning. You also get a convenience page to upload a
        | CSV or enter JSON values and get predictions from that model.
        | This is often needed by non-technical users who want a
        | graphical interface on top of the model. The page also shows
        | how that model was produced and which experiment produced it,
        | so you know which parameters and metrics were involved.
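        | Invoking such an endpoint might look like the following; the
        | URL and JSON payload shape are assumptions for illustration,
        | not iko.ai's documented schema:

```python
# Build a JSON POST request to a (hypothetical) model endpoint; sending
# it with urllib.request.urlopen(req) would return the predictions.
import json
import urllib.request

def build_invoke_request(endpoint, features):
    """Wrap feature rows in a JSON body addressed to the endpoint."""
    return urllib.request.Request(
        endpoint,
        data=json.dumps({"inputs": features}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_invoke_request(
    "https://example.invalid/models/my-model/invoke",  # placeholder URL
    [[5.1, 3.5, 1.4, 0.2]],
)
```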
       | 
       | One-click model packaging
       | 
       | You don't need to worry about sending pickles and weights or
       | using ssh and scp. You can click a button and iko.ai will package
       | your model in a Docker image and push it to a registry. You can
        | then take your model anywhere. If you have a client for whom
        | you do on-premises work, you can just docker pull your image
        | there, start the container, and expose the model to other
        | internal systems.
       | 
       | Model monitoring
       | 
        | For each model you deploy, you get a live dashboard that shows
        | how it is behaving: how many requests, how many errors, etc.
        | This lets you notice as soon as something is wrong and fix it.
        | Right now, we're adding data logging so you can see data
        | distributions and outliers. We're also adding alerts triggered
        | when something changes beyond a threshold, and exposing model
        | grading, so you can grade your model when it gets something
        | wrong or right and visualize its decaying performance.
        | Automatic retraining is on the short-term roadmap.
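        | The core of such a dashboard is simple aggregation. A minimal
        | sketch (illustrative only, not iko.ai's code; the model name is
        | made up):

```python
# Count requests and errors per deployed model: the raw numbers behind
# a "how is my model behaving" dashboard.
from collections import Counter

stats = Counter()

def record_call(model_id, ok):
    """Tally one model invocation, flagging failures."""
    stats[(model_id, "requests")] += 1
    if not ok:
        stats[(model_id, "errors")] += 1

record_call("churn-v2", ok=True)
record_call("churn-v2", ok=True)
record_call("churn-v2", ok=False)

error_rate = stats[("churn-v2", "errors")] / stats[("churn-v2", "requests")]
```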
       | 
       | Integrations with Streamlit, Voila, and Superset
       | 
        | You can create dashboards and applications right from the
        | notebook interface, as opposed to having someone provision a
        | virtual machine on GCP, create the environment, push the code,
        | start a web server, add authentication, and remember the IP
        | address for a demo. You can also create Superset dashboards to
        | work on your data.
       | 
       | APIs everywhere
       | 
        | You can use most of these features through an HTTP request
        | instead of the web interface. This is really important for
        | instrumentation and integrations. We're also adding webhooks
        | soon, so iko.ai can emit and receive events to and from
        | external systems. One application is Slack alerts, for example,
        | or automatic retraining based on events you choose.
       | 
       | Previous thread: https://news.ycombinator.com/item?id=28373127
        
       ___________________________________________________________________
       (page generated 2021-10-06 23:01 UTC)