[HN Gopher] Show HN: Datree (YC W20): Prevent K8s misconfigurati...
       ___________________________________________________________________
        
       Show HN: Datree (YC W20): Prevent K8s misconfigurations from
       reaching production
        
       Hi HN, this is Shimon and Eyar of Datree (https://www.datree.io/).
       When I was an Engineering Manager of Infrastructure at ironSource
       (NASDAQ:IS) for 400 developers, a developer made a mistake that
       let a misconfiguration reach production, which caused major
       problems for the company's infrastructure.

       Mistakes happen all the time - you learn from them and hope to
       never make them again. But how can you prevent a production issue
       from recurring, or, for a bigger challenge, how can you prevent
       the next one from the get-go?

       In our case, we tried sending emails to our devs, writing wikis,
       and hosting meetups and live sessions to educate our developers,
       but I felt that it just wasn't driving the message home. How can
       developers be expected to remember to configure a liveness probe
       or to put a memory limit in place for their Kubernetes workload
       when there are so many things that a dev must remember? Infra
       just isn't their primary focus.

       Today, organizations want to delegate infra-as-code
       responsibilities to developers, but face a dilemma -- even a small
       misconfiguration can cause major production issues. Some companies
       lock up infra changes and require ops teams to review all changes,
       which frustrates both sides. Developers want to ship features
       without waiting for infra. And infra teams don't want to "babysit"
       developers by reviewing config files all day long, essentially
       acting as human debuggers for misconfigurations.

       That's why I teamed up with Eyar to found Datree. Our mission is
       to help engineering teams prevent Kubernetes misconfigurations
       from reaching production. We believe that providing guardrails to
       developers protects their infra changes and frees up DevOps teams
       to focus on what matters most.

       Datree provides a CLI tool (https://github.com/datreeio/datree)
       that runs automated policy checks against your Kubernetes
       manifests and Helm charts, identifies any misconfigurations
       within, and suggests how to fix them. The tool comes with dozens
       of preset, best-practice rules covering the most common mistakes
       that could affect your production. In addition, you can write
       custom rules for your policy.

       Our built-in rules are based on hundreds of Kubernetes
       post-mortems and cover checks such as resource limits/requests
       (MEM/CPU), liveness and readiness probes, labels on resources,
       Kubernetes schema validation, API version deprecation, and more.

       Datree comes with a centralized policy dashboard enabling the
       infra team to dynamically configure rules that run on dev
       computers during the development phase, as well as within the
       CI/CD process. This central control point propagates policy checks
       automatically to all developers/machines in your company.

       We initially launched Datree as a general purpose policy engine
       (see our YC Launch https://news.ycombinator.com/item?id=22536228)
       in which you could configure all sorts of rules, but the market
       drove our focus toward infrastructure-as-code and, more
       specifically, Kubernetes, one of the most painful points of
       friction between developers and infrastructure teams.

       When we adjusted to a Kubernetes-focused product, we pivoted from
       our top-down, sales-driven model to a bottom-up, adoption-driven
       model focused on the user. Our new dev tool is self-serve and
       open-source. Hundreds of companies are using it to prevent
       Kubernetes misconfigurations and, in turn, are helping the tool
       improve by opening issues and submitting pull requests on GitHub.
       Our product is well suited for self-evaluation and immediate value
       delivery. No demo calls -- just 2 quick steps to try the product
       yourself!

       TechWorld with Nana did a deep technical review of our product,
       which can be viewed at
       https://www.youtube.com/watch?v=hgUfH9Ab258.

       We look forward to hearing your feedback and answering any
       questions you may have. Thank you :)
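
To make the kind of check described in the post concrete, here is a minimal sketch of two of the rules it mentions (a memory limit and a liveness probe) applied to a Deployment-style manifest. It is an illustration only, not Datree's implementation or rule format: the function name `check_deployment`, the finding messages, and the sample manifest are all hypothetical, and the manifest is given as a plain Python dict to avoid assuming any particular YAML parser.

```python
# Hypothetical sketch of two policy checks; not Datree's code or rule format.

def check_deployment(manifest):
    """Return a list of findings for one Deployment-style manifest (a dict)."""
    findings = []
    # Containers live under spec.template.spec.containers in a Deployment.
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        limits = container.get("resources", {}).get("limits", {})
        if "memory" not in limits:
            findings.append(f"{name}: no memory limit set")
        if "livenessProbe" not in container:
            findings.append(f"{name}: no liveness probe configured")
    return findings


# Made-up example manifest: "web" is missing both settings, "sidecar" has both.
manifest = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {"name": "web", "image": "example/web:1.0"},
                    {
                        "name": "sidecar",
                        "image": "example/sidecar:1.0",
                        "resources": {"limits": {"memory": "128Mi"}},
                        "livenessProbe": {"httpGet": {"path": "/healthz", "port": 8080}},
                    },
                ]
            }
        }
    },
}

for finding in check_deployment(manifest):
    print(finding)
```

A CI job could fail the build whenever the returned list is non-empty, which is the general idea behind running such checks before a manifest reaches production.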
        
       Author : shimont
       Score  : 133 points
       Date   : 2021-10-19 15:04 UTC (7 hours ago)
        
       | verchol wrote:
       | K8s manifests are YAML files that translate into tens of K8s
       | actions that change the production environment. Preventing
       | misconfigurations by integrating the datree CLI into the CI/CD
       | cycle can save hours of production unreliability. For me it's a
       | must-have phase in the K8s release flow.
        
       | orweis wrote:
       | K8s can really be a pain especially for larger teams. This is a
       | breath of fresh air!
        
         | [deleted]
        
       | exdsq wrote:
       | I can't believe how complex a tool has to be for several startups
       | to get funding purely to try to stop it doing bad things. Kudos to
       | Datree for making the most of this, but it feels like something's
       | wrong with k8s for this to be such a thing.
        
         | nonameiguess wrote:
         | The problem is right here in the blurb. It's not that k8s
         | itself is complex. Ops is complex. Companies used to have
         | entire dedicated IT departments full of sysadmins, storage
         | engineers, network engineers, and security engineers with
         | decades of experience configuring, deploying, maintaining, and
         | monitoring servers, networks, hypervisors, and data centers.
         | 
         | Newer companies just push this onto their application
         | developers, expecting them to figure this stuff out on top of
         | being developers, "full-stack" now meaning you need to
         | understand everything down to filesystems, overlay networks,
         | container runtimes. This is not a reasonable expectation.
         | Nobody can be an expert in everything.
         | 
         | Of course, I'm not sure full automation can really replace
         | human expertise, either.
        
           | romanlab wrote:
           | It's true that a developer can't be an expert in every aspect
           | of the stack, but that's where services like Datree or many of
           | AWS' services, for example, come in: they bring the domain
           | expertise and require the developer to only be familiar with
           | the subject. The experts moved to be domain experts, working
           | for the companies that develop the tools.
           | 
           | You don't really need a resident storage expert in every
           | company, since most companies have similar needs.
        
             | exdsq wrote:
             | But suddenly you need to keep up with a ton of different
             | tools and if they break you either dig into the topic or
             | you're f**ed. This is such a productivity killer it's
             | crazy.
        
           | [deleted]
        
         | shimont wrote:
         | I know! I think that the fact that developers are dealing more
         | and more with infra is very empowering but on the other hand
         | brings new challenges. It is no longer Dev vs. Ops, but now Devs
         | also need to learn infra best practices, so tools like ours
         | help them :) thank you for your Kudos! <3
        
         | geofft wrote:
         | IMO Kubernetes is of the level of complexity of a programming
         | language or an OS. It's just at a larger scope, and we don't
         | have a lot of things in that space, so we don't have well-
         | defined concepts like "language" or "OS" to encompass them.
         | 
         | There are basically entire industries dedicated to stopping
         | programming languages from doing bad things (static analysis
         | vendors, auditing consultancies, formal verification tool
         | vendors, etc.) and to stopping OSes from doing bad things
         | (application-focused monitoring tools, security-focused
         | intrusion detection tools, policy enforcement / device
         | management vendors, etc.), and we don't really say "Wow,
         | something is wrong with C++" or "Wow, something is wrong with
         | Linux." We understand that they are high-power tools and you do
         | want additional tools to focus that power.
         | 
         | (Well, to be fair, _I_ say something is wrong with C++, but my
         | preferred solution to that is _even more complicated_
         | programming languages :) )
        
           | romanlab wrote:
           | That may be true but programming languages are there to give
           | you the power to develop any idea into a working software
           | solution. I think K8s differs because I see it as something
           | that simplifies infra and abstracts infra vendor specific
           | concepts. The complexity in K8s doesn't add power, just
           | confusion ;)
        
             | geofft wrote:
             | Kubernetes _does_ simplify and abstract - just like C++
             | simplifies and abstracts compared to writing assembly. :)
             | In both cases, the scope of what you can do expands
             | significantly, which means that the systems you build with
             | the higher-level tool are significantly more complex, which
             | means that you see much more confusing C++ programs and
             | Kubernetes deployments than assembly programs and
             | five-lovingly-handcrafted-servers deployments.... but that's
             | because you can successfully start with a more complex idea
             | and make it happen.
        
           | TeMPOraL wrote:
           | Don't we? :). I say[0] something is wrong with programming
           | itself, on a more fundamental level than the evolutionary
           | history of C++.
           | 
           | I can't give a coherent and detailed analysis of it yet[1] -
           | but I have this growing feeling that we're drowning in
           | accidental complexity all across the board, at every
           | abstraction layer. Like an inverse iceberg - where we see
           | this whole, humongous mountain of tooling required to build
           | and maintain software systems, but you can't shake the
           | impression that we should be able to do the job with just the
           | bit that's sticking above the waterline.
           | 
           | Speaking of k8s being "of the level of complexity of a
           | programming language or an OS", I bet there's some formal way
           | to show some isomorphism here - them being different
           | incarnations of the same abstract structure. It's another
           | kind of feeling I get when jumping up and down the software
           | stack[2]. Maybe one day we'll figure it all out.
           | 
           | --
           | 
           | [0] - https://news.ycombinator.com/item?id=28568053
           | 
           | [1] - But I am collecting observations and trying to mull the
           | problem over in my subconscious mind.
           | 
           | [2] - Like e.g. code is data is code; your config parser is
           | an interpreter of a programming language. Often enough, it
           | grows to look like a typical PL, then gets replaced by
           | one[3]. If your config happens to describe infrastructure, at
           | some point you might realize you're writing "function calls"
           | for business logic that are implemented in terms of spinning
           | clusters up and down. Or e.g. the realization that DCOM is
           | essentially microservices, so for some two decades or more,
           | every Windows installation had something similar to k8s deep
           | inside its bowels.
           | 
           | [3] - https://mikehadlow.blogspot.com/2012/05/configuration-
           | comple...
        
         | nijave wrote:
         | Not much different than Windows, Linux, etc (operating systems)
         | 
         | K8s basically implements a distributed OS that schedules across
         | machines instead of CPU cores
         | 
         | Not too long ago, AWS had a significant outage due to
         | misconfigured ulimits (Linux OS)
        
         | wvh wrote:
         | Have you ever tried to set up HA environments before Kubernetes
         | and friends, let alone anything that can do failover and some
         | form of auto-scaling? Kubernetes might have some unwarranted
         | complexity, but let's not pretend it was a walk in the park
         | setting environments up with ad-hoc shell scripts, HAProxy, IP
         | failover, manual package updates and more fun stuff. Tools like
         | Kubernetes and their general abstractions are vastly better
         | than anything I could come up with in a time before even Puppet
         | and Ansible existed. You have to take into consideration all
         | that orchestrators like Kubernetes are trying to replace before
         | you can meaningfully criticise their complexity.
         | 
         | Sadly enough it's all YAML, but at least it's all YAML and not
         | random mostly undocumented arcane configuration formats...
        
       | tesken wrote:
       | Nice. I tried playing with it but I have multiple file types in
       | my repo, not just Kubernetes. Can I still run those in Datree or
       | will it fail?
        
         | eyarz wrote:
         | You're right, it will fail. For this reason, we added the flag
         | `--only-k8s-files` that you can use to skip files w/o the keys
         | `apiVersion` and `kind`. This way, you won't see exit code 1
         | when scanning non-k8s files with Datree.
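
The filtering idea described above (treat a file as a Kubernetes manifest only if it has the keys `apiVersion` and `kind`) can be sketched roughly as follows. This is an assumption-laden illustration, not how Datree implements `--only-k8s-files`: the helper names `top_level_keys` and `looks_like_k8s` are hypothetical, and the check is a line-based heuristic for top-level keys rather than a real YAML parse.

```python
import re

# Matches an unindented "key:" at the start of a line (a top-level YAML key).
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z][\w-]*)\s*:")


def top_level_keys(document):
    """Collect top-level keys of one YAML document using a line heuristic."""
    keys = set()
    for line in document.splitlines():
        match = TOP_LEVEL_KEY.match(line)
        if match:
            keys.add(match.group(1))
    return keys


def looks_like_k8s(text):
    """True if any '---'-separated document has both apiVersion and kind."""
    documents = re.split(r"^---[ \t]*$", text, flags=re.MULTILINE)
    return any({"apiVersion", "kind"} <= top_level_keys(doc) for doc in documents)


# Made-up sample inputs: one Kubernetes manifest, one unrelated CI config.
k8s_text = "apiVersion: v1\nkind: ConfigMap\nmetadata:\n  name: demo\n"
ci_text = "version: 2.1\njobs:\n  build:\n    docker: []\n"

print(looks_like_k8s(k8s_text))
print(looks_like_k8s(ci_text))
```

A scanner built this way would skip any file for which `looks_like_k8s` returns false instead of reporting it as a failure.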
        
       | NimrodKramer wrote:
       | Can I use Travis CI with Datree?
        
       | shar1z wrote:
       | Nice! Open source + Kubernetes + DevOps - doesn't get better than
       | that. Great work Datree team.
        
       | sbkg0002 wrote:
       | Is it me, or is the dashboard only for paying customers?
       | 
       | What does this add compared to Polaris by Fairwinds?
        
         | shimont wrote:
         | The dashboard is also offered as part of our freemium offering
         | :) we offer 1000 policy checks per month for free. Including
         | the dashboard.
         | 
         | In terms of what we offer compared to Polaris: We offer
         | predefined policies that come out of the box, along with the
         | ability to write custom rules for your policy yourself.
         | 
         | Take us for a spin and let me know what you think! thank you
        
           | bbrennan wrote:
           | FYI - Polaris open source has both these things :)
           | 
           | (Disclosure - I'm a maintainer)
        
       | bobbiechen wrote:
       | Looks like this is a policy-as-code tool in the vein of Terraform
       | Cloud's Sentinel Policies, or more generally Open Policy Agent,
       | but specifically targeted at k8s use cases.
       | 
       | From the custom rules overview https://hub.datree.io/custom-
       | rules-overview , though it is docs WIP, I noticed these are
       | defined as YAML/JSON somehow. That's a contrast to HashiCorp's
       | Sentinel https://docs.hashicorp.com/sentinel/concepts/language
       | and OPA's Rego
       | https://www.openpolicyagent.org/docs/latest/policy-language/ . Is
       | this an intentional design decision?
        
       | kunal_kushwaha wrote:
       | Been using Datree for a while now and it's become a habit for me.
       | It's so simple yet powerful, super easy to get started with. No
       | one likes production clusters to fail, and using Datree is a step
       | in the right direction to prevent that by detecting
       | misconfigurations in manifest files. Keep up the great work!
       | Looking forward to future announcements.
        
         | [deleted]
        
       | almogbaku wrote:
       | Sounds fascinating.
       | 
       | Tools nowadays are so cumbersome that it's easy to misconfigure
       | them.
       | 
       | Thanks for sharing
        
       | nawgz wrote:
       | Why is this whole thread dead? Just because of the blatant
       | appearance of astroturf or what?
        
         | dang wrote:
         | I killed all the booster comments. We tell YC startups not to
         | do that. All startups, of course, but especially YC startups.
         | 
         | Sometimes it happens inadvertently (e.g. users find out about
         | the thread and rush in to 'help'), but obviously we want the
         | discussion here to be substantive.
        
           | shimont wrote:
           | Hey, some of our friends were overeager to help hehe :)
           | 
           | I look forward to hearing your feedback! Thank you
        
           | nawgz wrote:
           | Thanks, makes sense, appreciate the reply!
        
       | _orcaman_ wrote:
       | Datree is awesome!
        
         | [deleted]
        
       | [deleted]
        
       | iland wrote:
       | Wish this tool was in existence two years ago, would have saved
       | me many 3am wakeup calls.
        
         | eyarz wrote:
         | If you don't wake up at 3am any more, I guess you stopped using
         | K8s ;)
        
       | lovely_koala wrote:
       | How can I integrate datree into my CircleCI pipeline?
        
         | eyarz wrote:
         | Datree is a CLI tool, so it can be integrated with any CI
         | system. Check out our instructions for integrating with CircleCI:
         | https://hub.datree.io/ci-circleci
        
       ___________________________________________________________________
       (page generated 2021-10-19 23:01 UTC)