[HN Gopher] Launch HN: Runops (YC W21) - A better cloud shell fo...
___________________________________________________________________
Launch HN: Runops (YC W21) - A better cloud shell for production
apps
Hello HN! I'm Andrios, from Runops.io - we're building a proxy for
commands you run in the terminal that records them in Git, adds code
reviews in Slack, and removes sensitive data from results. It's like
the Cloud Shells from GCP/AWS, but with more features and using your
local zsh/bash terminal.

You run an AWS CLI command in the terminal and it goes to Runops
instead of AWS. Runops adds the command to Git and gets peer reviews
(when required) in Slack before sending it to AWS. After it runs, we
deliver the results back in the terminal, but with all sensitive data
masked. It works for AWS, Kubernetes, databases, and others.

I was leading the Infra team at a Fintech
(pismo.io/en), and we wanted to give autonomy to all developers in
production. But we couldn't give them direct access due to
compliance requirements. The solution was to have a small number of
people (my team) with "full access" to production systems.
Engineers would ask us when they needed to run one-off scripts in
production. Our goal was to deliver automations so that other teams
wouldn't need to ask us to do things. We would build a way for them
to do it with compliance, security, and reliability.

It didn't work. We were spending 80% of the time processing the queue
of requests, and 20% building automations. The backlog was always
increasing, and the team was burning out. Engineers were not happy,
as their requests took a long time to process and clients were angry
at them.

But some nice automations came out of that. For instance: we needed
to review ad-hoc prod database reads to avoid bad queries. So we
built a Jenkins pipeline that ran SQL queries from Git after code
review using Flyway. Any engineer could run queries in prod, leaving
a trace of who did it, the reviews, when it happened, and why, for
every query.

When talking to friends at similar companies, I saw the problem was
even worse. Some of them weren't trying to automate; they already had
dedicated people for running these scripts, i.e., an ops team. I knew
there was a better way, so I set out to build it.

I quit this job mid last year, with
about 8 months' worth of savings to make this work before I'd need
to find a job again. It was tough in the beginning, as I'm an
engineer and had to learn sales, marketing and product management
on the job, but after getting the first few customers things started
improving.

The goal for Runops is to let any engineer run anything in production
as if they had full access, automating as much as possible of
security and compliance. When human interaction is needed, we make it
synchronous using Slack. Now, instead of having a single team as a
bottleneck, you can have everyone do things in production.

Centralizing teams with most of the access to AWS, Kubernetes, and
databases is bad. It makes for slow Change Management processes using
Jira or other tools with manual executions at the end. Runops lets
you add quick reviews from experts (Infra, DBA, security, etc.), and
automates executions.

The primary interface is a CLI, where you run scripts that range from
SQL queries to kubectl exec and AWS CLI commands. We don't create new
abstractions: you use the same commands and docs available, and we
just proxy them. A nice benefit is replacing VPNs and the 10 client
tools/credentials you would need today. We also support templates
for custom actions in a bunch of languages.

We built it using GitHub Actions for executing commands. We store
configurations and credentials as Actions Secrets and they get
injected when a command requires them. It's nice because we can run
anything that goes in a Docker container in <15 seconds. We have
plans to improve it beyond Actions by creating a real-time proxy.
That will enable a REPL-like experience.

Runops doesn't have a web interface. This is on purpose: we don't
want to be one more tool engineers have to learn. Most interactions
happen with our CLI or Slack. We have a simple admin UI in Retool.

We do everything using Lisp. The CLI uses ClojureScript; the REST API
uses Clojure. It's great to have the same language everywhere, and
Lisp is also a fantastic advantage.

Today we have big Fintechs using Runops. They use it to let
developers run commands inside Kubernetes pods, like Rails Runner and
Elixir IEx, SQL queries, and DynamoDB queries, and to make internal
API calls in private networks using cURL.

One of the best parts of building this has been seeing developers
doing more production work. Regulated companies that never considered
giving this level of autonomy to all developers are changing their
minds. It's great to see a tool impacting the culture, increasing
trust.

We're really happy we get to show this to you all, thank you for
reading about it! Please let us know your thoughts and questions.
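
The flow described in the post (record the command, wait for a review
when one is required, execute, mask the results) can be sketched
roughly as follows. This is a minimal illustration in Python, not
Runops's implementation: the function names, the regex patterns, and
the in-memory list standing in for Git are all assumptions, and the
post does not say how sensitive data is actually detected.

```python
import re
from typing import Callable, List

# Hypothetical patterns; the real detection rules are not described in the post.
PATTERNS = [
    re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),  # email addresses
    re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b"),  # 16-digit card-like numbers
]


def mask_sensitive(text: str, placeholder: str = "****") -> str:
    """Replace every match of the patterns above with a placeholder."""
    for pattern in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


def run_via_proxy(
    command: str,
    execute: Callable[[str], str],
    audit_log: List[str],
    needs_review: bool = False,
    approved: bool = False,
) -> str:
    """Record the command, enforce review if required, execute, mask the output."""
    audit_log.append(command)  # stands in for committing the command to Git
    if needs_review and not approved:
        raise PermissionError("command requires an approved review before it runs")
    return mask_sensitive(execute(command))
```

Here `execute` is whatever actually runs the command against the
target system. The caller only ever sees the masked output, and the
command is recorded even when the review gate rejects it.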
Author : andriosr
Score : 55 points
Date : 2021-03-08 13:26 UTC (9 hours ago)
| cpressland wrote:
| We're an Azure/M365 house, but some of the things this tool
| explicitly solves were mentioned as areas of improvement for us
| during PCI assessment recently. I'll be keeping an eye on this.
| Great work so far!
| andriosr wrote:
| Glad to hear we could help in the future, feel free to reach
| out any time. We would love to hear more about your use cases
| and the alternatives you guys have in mind to improve the PCI
| assessment results. We support Azure :)
| thisisxavier wrote:
| How did you acquire your first customers?
| andriosr wrote:
| It was a combination of multiple things. The first customer
| came from the newsletter I run called SRE Teams
| (https://sreteams.substack.com). Others came from intros from
| my network and from reaching out to people I thought we could
| help. When I was running the DevOps team at Pismo we used to
| organize meetups and knowledge sharing sessions with other
| companies having similar problems; this also helped.
| 0xbadcafebee wrote:
| What you've basically created is automated change control, but
| lacking some change control features. You might want to add a set
| of features specific to managing change control, because
| otherwise I'll have to build a change management system around
| this.
|
| > You run an AWS CLI command in the terminal and it goes to
| Runops instead of AWS
|
| Most enterprises are wary of handing over control to a vendor,
| especially if it's the underpinning of all operations in the
| company. I suggest a self-hosted/Enterprise release. After a few
| years of trying to make it work, the Enterprise will gladly pony
| up more money for a hosted cloud solution, but the self-hosted
| will get you in more doors.
|
| > We do everything using Lisp. The CLI uses Clojurescript; the
| REST API uses Clojure.
|
| Do you expect regular people to be able to contribute to or
| modify this? Do you find a lot of Lisp/Clojure devs out there for
| when you need to expand?
|
| > Painless audit trails: No need for complex ETL to connect
| trails from Cloud Trail, Database Audit Logs, Kubernetes audit,
| etc.
|
| You still have to audit those things. If a hacker gets into your
| infrastructure, you have to know what they did.
| andriosr wrote:
| I agree with many of these points, here are some thoughts on
| how we deal with them:
|
| > Most enterprises are wary of handing over control to a vendor
|
| Great point, we do have the Enterprise version, which is self-
| hosted.
|
| > Do you find a lot of Lisp/Clojure devs out there for when you
| need to expand?
|
| We won't hire engineers based on the language they know, but
| instead on general engineering skills, and they can learn
| Clojure here (it already worked for the first one :)
|
| > You still have to audit those things
|
| Yes we do, but mostly to trigger alerts if anything happens
| there and to show that the accesses are either from Runops or
| the applications during audits. This is way lighter than
| relying on these as the source of truth for trails.
|
| I'm curious about the Change Management features you think are
| missing. We do have review workflows and other CM-related
| features I didn't mention here; this demo shows some of them:
| https://see.runops.io/videos/demo
| tyingq wrote:
| I'm curious if it's visually obvious that commands are running in
| a non-dev environment. That would save people from the scenario
| where they walk away for a coffee, return, then accidentally start
| typing into the wrong terminal window.
| andriosr wrote:
| I've done that and can relate to the problem! It's common for
| Kubernetes, where you never know which cluster kubectl is
| pointing to. The Target (what we call the place where you are
| running things) is one of the options in the CLI. So you have to
| at least provide the Target and the script to run a command. This
| way you always know where you are running things. It's
| something like this:
|
| runops tasks create --target mysql-demo --script 'select * from dundermifflin.customers;'
| tyingq wrote:
| Ah, great, thanks. Some of the wording made it sound like
| perhaps it was hooked transparently. This appears very clear.
| ystad wrote:
| The information on https://runops.io/ is light; it does not have
| examples, workflows, etc.
|
| What is the setup like (is it cloud hosted, or self-hosted in
| one's own cloud)? Is your code open source? How does authn/authz
| work if I want to use this?
| andriosr wrote:
| Yes, we have a lot of work to do on our landing page to better
| explain these points. It's early days, but we will get there!
| Here are some details on them:
|
| It's cloud hosted, and we do support self-hosting for
| enterprises. The code is not open-source. We support Okta,
| Google, and other OAuth providers for Authentication. For
| Authorization we have the concept of Targets, which are
| abstractions of your cloud resources to users/developers. Say
| you have a MySQL database, you can create a read-only Target
| and let everyone use it, and create a second Target for the
| same database with write access. In the second Target you
| require reviews from tech leads, or let selected groups run
| queries.
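
The Target idea described above (one read-only Target open to
everyone, and a second Target on the same database that requires a
review or is limited to selected groups) could be modeled roughly like
this. It is only a sketch in Python under assumed names: the `Target`
fields and the `may_run` helper are hypothetical and are not Runops's
actual data model or API.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Target:
    """Hypothetical model of a Target: a named way to reach one resource."""
    name: str
    resource: str
    read_only: bool
    requires_review: bool = False
    allowed_groups: FrozenSet[str] = frozenset()  # empty means any user may use it


def may_run(target: Target, user_groups: Iterable[str], review_approved: bool = False) -> bool:
    """Return True if a user in user_groups may run a command on the target now."""
    if target.allowed_groups and not (target.allowed_groups & set(user_groups)):
        return False  # Target is restricted to groups the user is not in
    if target.requires_review and not review_approved:
        return False  # a reviewer still has to approve the command
    return True


# Two Targets for the same database, as in the comment above.
read_target = Target("mysql-demo-read", "mysql-demo", read_only=True)
write_target = Target("mysql-demo-write", "mysql-demo", read_only=False, requires_review=True)
```

With these definitions, `may_run(read_target, ["dev"])` is true for
anyone, while `may_run(write_target, ["dev"])` stays false until the
call is made with `review_approved=True`.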
| ystad wrote:
| Thanks!
|
| How does your service compare to services such as Teleport?
|
| https://github.com/gravitational/teleport
| andriosr wrote:
| Teleport is a fantastic tool. The main differences are: 1)
| Runops doesn't require you to have tools (kubectl, psql,
| etc.) installed locally and doesn't download temporary
| credentials to access resources; commands execute in the
| cloud. 2) Runops has synchronous review workflows at the
| command/intent level, again as opposed to getting an open
| session for a period of time. 3) We automatically remove
| sensitive data from the results of every command. 4) Runops
| uses Git as the source of truth for the audit trails.
| 1vuio0pswjnm7 wrote:
| Could one refer to this as a so-called "API Gateway"?
| jeremyis wrote:
| I haven't personally been on an infra team but I've seen Infra /
| Dev tools teams being overwhelmed with requests. This seems like
| a really helpful and elegant solution!
| andriosr wrote:
| Curiously, I started in the dev team and migrated to infra in an
| attempt to fix things :)
| candiddevmike wrote:
| So hosted PAM, but I don't see any compliance certifications on
| your website? How do you have fintech customers today using it
| (or really, anyone using it)? Why would anyone trust you guys to
| proxy access to their environments?
| andriosr wrote:
| Yes, I like your definition. We don't have certifications yet,
| but the team has done the biggest ones before, and we are keeping
| everything ready to get them. We should start the processes in
| the next couple of months. That being said, not all
| certifications require all software you use to also have the
| certifications. I understand this is critical for PCI, where
| anything with access to the data is also in scope, but for SOC2
| this is not the case. Most of our customers today are fintechs.
| We are very transparent with our customers about our
| architecture and how we do things; that is where the trust
| comes from. We use the best solutions available to deal with
| things like storing credentials and sensitive data. That being
| said, you can always opt for the self-hosted enterprise version.
___________________________________________________________________
(page generated 2021-03-08 23:01 UTC)