[HN Gopher] Cloud Infrastructure as SQL
       ___________________________________________________________________
        
       Cloud Infrastructure as SQL
        
       Author : pombo
       Score  : 80 points
       Date   : 2021-09-16 16:47 UTC (6 hours ago)
        
 (HTM) web link (www.iasql.com)
 (TXT) w3m dump (www.iasql.com)
        
       | emersonrsantos wrote:
       | DROP DATABASE ooops
        
       | Thaxll wrote:
       | Just why... why would anyone use / learn SQL to manage
       | infrastructure? Also, FYI, people managing infra are usually not
       | the ones doing SQL.
        
       | [deleted]
        
       | ryanisnan wrote:
       | I can see SELECTs being useful here, but yeesh
       | INSERT/UPDATE/DELETEs would scare the heck out of me.
       | 
       | Also not clear on how/why this would be a SaaS product - this
       | seems like it's a library you would download, throw some keys at,
       | and thanks.
        
         | dnautics wrote:
         | > yeesh INSERT/UPDATE/DELETEs would scare the heck out of me
         | 
         | Out of curiosity, why would they?
        
           | cryptonector wrote:
           | I guess mass administration -> one mistake can destroy
           | everything.
        
       | rmetzler wrote:
       | I wonder how this works. Maybe it's similar to osquery with
       | virtual tables on top of sqlite? But then, how would you market
       | this as SaaS?
        
       | yevpats wrote:
       | Interested in how it's implemented under the hood. Does it use
       | cloudquery (https://github.com/cloudquery/cloudquery) or
       | steampipe (https://github.com/turbot/steampipe) under the hood, or
       | does it implement everything from scratch?
       | 
       | Disclaimer: I'm the founder of CloudQuery.
       | 
       | I get why you would want to do "select * from infra", but I'm not
       | sure I understand why you would want to do "insert * into infra"
       | rather than use something like terraform. Interested in hearing
       | the use case.
        
         | bink wrote:
         | It does sound like one of those "because it's cool" features
         | rather than "because anyone should ever actually use this".
        
           | neonlex wrote:
           | Which part of what?
        
         | whoomp12342 wrote:
         | Personally, I love terraform. I don't like statefiles though.
         | It's annoying to have them in a vault system.
        
       | econti wrote:
       | Being able to check in a SQL file to our repo to manage all of
       | our infra sounds like a dream. Added myself to the early access.
        
       | rubiquity wrote:
       | I see a lot of criticisms for not wanting to use SQL to do writes
       | and I think that is misguided. The current state of your
       | infrastructure is absolutely state and SQL is a great language
       | for working with state. While Terraform and all these other
       | "declarative" infrastructure tools are better than what came
       | before them, you're ultimately playing Relation Stitcher by
       | needing to connect the various pieces together. There is nothing
       | declarative about Terraform and others. Infrastructure is
       | absolutely stateful and relational so why not use SQL and
       | relations to manage it?
       | 
       | There are mentions of other tools that address the read side, and
       | that's useful for obvious reasons, but you've punted on the hard
       | problem which is the writes. The key to getting writes right will
       | be constraints and triggers. Constraints can absolutely help
       | operators to not cause outages by creating guard rails around
       | certain state mutations. Triggers are important because unlike
       | data that never sees an update, infrastructure is living and
       | being able to consume those changes is important.
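       | 
       | A minimal sketch of what such guard rails could look like, using
       | Python's built-in sqlite3 module. The `instance` table, its
       | columns, and the trigger names are hypothetical and are not
       | IaSQL's schema. A CHECK constraint rejects an invalid state, one
       | trigger blocks deleting a protected row, and another records
       | every change so that a consumer could react to it.
       | 
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE instance (
    name      TEXT PRIMARY KEY,
    replicas  INTEGER NOT NULL CHECK (replicas >= 1),  -- never scale to zero
    protected INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE change_log (name TEXT, old_replicas INTEGER, new_replicas INTEGER);

-- Guard rail: refuse to delete anything marked as protected.
CREATE TRIGGER no_delete_protected BEFORE DELETE ON instance
WHEN OLD.protected = 1
BEGIN
    SELECT RAISE(ABORT, 'instance is protected');
END;

-- Change feed: record every replica change for downstream consumers.
CREATE TRIGGER log_replicas AFTER UPDATE OF replicas ON instance
BEGIN
    INSERT INTO change_log VALUES (OLD.name, OLD.replicas, NEW.replicas);
END;

INSERT INTO instance VALUES ('api', 3, 1);
""")


def try_sql(statement):
    """Run one statement; return None on success or the error message."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(statement)
        return None
    except sqlite3.Error as exc:
        return str(exc)


scale_to_zero = try_sql("UPDATE instance SET replicas = 0 WHERE name = 'api'")
delete_protected = try_sql("DELETE FROM instance WHERE name = 'api'")
scale_up = try_sql("UPDATE instance SET replicas = 5 WHERE name = 'api'")
changes = conn.execute("SELECT * FROM change_log").fetchall()
```
       | 
       | The first two statements are rejected by the database itself,
       | while the valid scale-up goes through and leaves one row in
       | `change_log`.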
       | 
       | I might just have confirmation bias because I have this idea
       | written down and think it should exist. Regardless, good luck!
        
         | orf wrote:
         | Infrastructure is made up of many parts. A single S3 bucket may
         | have multiple resources (a key, a policy, a notification
         | queue). How is having several "insert into" statements for each
         | of these any different from "resource" blocks in terraform?
         | 
         | If anything it would be much worse, because you either write
         | some ungodly huge sql statement to create multiple resources or
         | you lose the ability to know what resources depend on each
         | other and form a graph of dependencies.
         | 
         | This results in much slower plans, as you don't have a dag and
         | you need to potentially refresh state much more often, or
         | something that looks like terraform but with way way more
         | boilerplate and irrelevant syntax.
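         | 
         | To make the dependency-graph point concrete, here is a small
         | sketch using Python's standard graphlib module. The resource
         | names are hypothetical stand-ins for the bucket example above;
         | this is not how terraform or IaSQL build their plans, only an
         | illustration of ordering resources by their dependencies.
         | 
```python
from graphlib import TopologicalSorter

# Hypothetical resources for one bucket; each maps to the resources it
# depends on, which must exist before it can be created.
depends_on = {
    "kms_key": set(),
    "queue": set(),
    "bucket": {"kms_key"},
    "bucket_policy": {"bucket"},
    "bucket_notification": {"bucket", "queue"},
}

# static_order() yields every node after all of its dependencies.
creation_order = list(TopologicalSorter(depends_on).static_order())
position = {name: i for i, name in enumerate(creation_order)}
```
         | 
         | Without the `depends_on` edges there is nothing to sort, which
         | is the information a flat list of insert statements would lose.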
        
           | rubiquity wrote:
           | We need to be creative and make types relevant to the
           | resources being modeled. If the infrastructure database is
           | just a bunch of string identifier fields then it isn't very
           | helpful. You have to find good abstractions and model the
           | resources in types that bring meaning to the data.
        
             | orf wrote:
             | Sure, but that's got nothing to do with SQL and could be
             | modelled in terraform. Or better yet, indirectly using
             | terraform via IAC providers in languages like Typescript.
             | 
             | Show me a proper example of creating an actual s3 bucket
             | that you'd use in production. KMS key, inventory
             | configuration, resource policy, lifecycle policy, logging
             | enabled. Created via SQL.
             | 
             | Now show me how you'd take this and make it a reusable
             | module so we can have a standard template for buckets that
             | everyone can use.
        
               | rubiquity wrote:
               | You're focusing too much on the initial creation rather
               | than ongoing maintenance and evolution. SQL and
               | relations are much better suited to handle evolution by
               | enforcing constraints than a graph of stitched together
               | pseudo JSON.
        
               | orf wrote:
               | No, I'm not. There isn't a difference between creating
               | and ongoing maintenance- it's the same thing. You
               | describe your state, something reconciles that. How you
               | describe your state and the dependencies between
               | resources is absolutely key, and on the face of it it
               | looks like this interface is totally inadequate.
               | 
               | So, again, show me even a brief sketch of how you would
               | describe what I said above with this model, and you'll
               | see it quickly falls apart.
        
               | rubiquity wrote:
               | You're coming off a bit worked up, but I'll humor you
               | anyway.
               | 
               | I'm not going to type out a bunch of SQL as an example.
               | SQL vs HCL isn't the point and they basically break even
               | on expressiveness. After you've typed out your pseudo-
               | JSON, what exactly are the existing tools saving you?
               | From having to use some wrapper around the cloud API?
               | That's the easy part.
               | 
               | By overly focusing on SQL you're missing the forest for
               | the trees. The point is relations and RDBMS features such
               | as constraints, triggers, stored procedures. Such a
               | platform would be always online rather than just a
               | tfstate file waiting for humans to munge it. It's also
               | time to stop thinking about the cloud as literal
               | resources like current tools do and start moving to more
               | abstract concepts (more on that later).
               | 
               | > No, I'm not. There isn't a difference between creating
               | and ongoing maintenance- it's the same thing.
               | 
               | I run very critical infrastructure for a living and there
               | absolutely is a difference. Creating new resources is
               | easy -- they aren't being used and the world doesn't have
               | any expectations for their performance or reliability.
               | The bacon is made in evolving existing infrastructure
               | without impacting user experience negatively. Terraform
               | and other such generators give you very little in guard
               | rails or help and silly outages happen all of the time
               | because of it.
               | 
               | Database engineers have been creating sophisticated
               | execution engines for decades. The creator of SQLite
               | cites treating every SQL query as its own program to be
               | run by a byte code VM as a key design decision. Writing
               | Terraform or what have you is like programming in an AST
               | (sorry Lispers). To date query execution engines figure
               | out how to manage on-disk structures. There is no reason
               | they couldn't be creating smart plans for infrastructure
               | changes at a much higher abstraction level than "glue
               | this KMS key to this bucket."
        
         | tmp_anon_22 wrote:
         | Infrastructure is quantum state. AWS APIs lie to you. Status
         | pages lie to you. If I had a nickel every time the solution to
         | a failing HTTP call was "just send it again"...
         | 
         | Use whatever tool you like: SQL, Rust, carrier-pigeon - but it
         | won't be a panacea for cloud infrastructure.
        
           | dnautics wrote:
           | Everything is "quantum state". You do a database fetch and in
           | the time that the result travels over the wire to the FE
           | another node could have made a mutation.
           | 
           | The nice thing is that adapters to sql engines often have
           | exactly the sort of protections you want baked in, e.g.
           | "check to make sure the updated at timestamps match before
           | committing an update"
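           | 
           | A hand-rolled sketch of that check (optimistic locking) with
           | Python's sqlite3 module. The `instance` table and the
           | `update_size` helper are hypothetical; an adapter would
           | normally generate the equivalent WHERE clause for you.
           | 
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instance (name TEXT PRIMARY KEY, size TEXT, updated_at INTEGER)"
)
conn.execute("INSERT INTO instance VALUES ('api', 't2.micro', 100)")
conn.commit()


def update_size(name, new_size, seen_updated_at, now):
    """Apply the update only if the row is unchanged since it was read."""
    with conn:
        cur = conn.execute(
            "UPDATE instance SET size = ?, updated_at = ? "
            "WHERE name = ? AND updated_at = ?",
            (new_size, now, name, seen_updated_at),
        )
    return cur.rowcount == 1  # 0 rows means someone else got there first


# Two writers both read the row when updated_at was 100.
first = update_size("api", "t2.small", seen_updated_at=100, now=101)
second = update_size("api", "t2.large", seen_updated_at=100, now=102)
final_size = conn.execute(
    "SELECT size FROM instance WHERE name = 'api'"
).fetchone()[0]
```
           | 
           | The second writer's stale timestamp no longer matches, so its
           | update touches no rows and it can re-read and retry.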
        
           | rubiquity wrote:
           | Everything in the real world is quantum state but that
           | doesn't stop every SaaS application out there from using an
           | RDBMS as their system of record. This is why we have
           | reconciliation processes. Terraform state files are just an
           | ad-hoc storage format without any of the features that SQL
           | and RDBMS have had for decades (I don't mean to pick on
           | terraform so much, it's just the one I'm most familiar with).
        
             | stingraycharles wrote:
             | Like someone else says, SELECTs make sense,
             | INSERT/UPDATE/DELETE to manage infrastructure state, rather
             | than using "proper" infrastructure as code, sounds like a
             | path to hell to me.
        
               | eloff wrote:
               | Why specifically is that worse than config files?
        
         | dnautics wrote:
         | It kind of already exists... Libvirt is basically a database
         | under the covers. If anything I'm surprised the implementation
         | isn't "AWS backend for libvirt" > "sql frontend for libvirt"
         | 
         | In a past life I was managing virtual machines in Elixir and
         | not too long after writing a libvirt adapter I realized that
         | what I should be doing is frontending libvirt with Ecto
         | (Elixir's database tool).
        
           | rubiquity wrote:
           | That's interesting! I never thought of libvirt that way.
        
       | stingraycharles wrote:
       | What problem does this solve, as opposed to a git repository?
       | 
       | To me, declarative infrastructure management is incredibly
       | important in order to reason about a deployment. Even though
       | under the hood, yes, everything is stateful, it's something I
       | want abstracted away, not as a primary way of interacting with my
       | infrastructure.
       | 
       | I guess what I'm asking is what the canonical use case / target
       | user of this is.
        
         | bink wrote:
         | I think most large companies have written similar tools to deal
         | with various use cases.
         | 
         | As part of a security team I often have the need to query for
         | what instance has been assigned what IP, what team owns what
         | AWS account, which security groups have port X open. You can do
         | all of this using API queries but it's tedious, slow, and you
         | can run the risk of hitting API rate limits. Most of this
         | information is not easily queryable via Terraform and git.
         | 
         | At my last place we had a custom designed tool that would
         | regularly fetch information from the AWS API and our CM servers
         | and store it in a database. At my current place we have a tool
         | that can query from CM but doesn't integrate with the AWS API
         | so we're still doing things manually there. Having this
         | database available via SQL is a tremendous help.
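         | 
         | As a rough illustration of that kind of query, here is a
         | sketch with Python's sqlite3 module. The table layout, the IDs,
         | and the `instances_with_open_port` helper are all made up for
         | the example; a real inventory would be filled from the AWS API.
         | 
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE instance (id TEXT PRIMARY KEY, ip TEXT, security_group TEXT);
CREATE TABLE security_group_rule (security_group TEXT, port INTEGER, cidr TEXT);

INSERT INTO instance VALUES
    ('i-web', '10.0.0.5', 'sg-web'),
    ('i-db',  '10.0.0.9', 'sg-db');
INSERT INTO security_group_rule VALUES
    ('sg-web', 443,  '0.0.0.0/0'),
    ('sg-web', 22,   '0.0.0.0/0'),
    ('sg-db',  5432, '10.0.0.0/16');
""")


def instances_with_open_port(port):
    """Return (id, ip) pairs whose security group opens `port` to any address."""
    rows = conn.execute(
        """
        SELECT i.id, i.ip
        FROM instance AS i
        JOIN security_group_rule AS r ON r.security_group = i.security_group
        WHERE r.port = ? AND r.cidr = '0.0.0.0/0'
        ORDER BY i.id
        """,
        (port,),
    )
    return rows.fetchall()


exposed_ssh = instances_with_open_port(22)
```
         | 
         | One local join answers the question without touching the cloud
         | API, so rate limits only matter when the tables are refreshed.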
         | 
         | Now, the _write_ side of this I do not have a use case for.
        
         | [deleted]
        
       | kwertyoowiyop wrote:
       | Why not just a text file? Is this the main reason?
       | 
       | > Unlike IaC, IaSQL makes the relations between pieces of your
       | infrastructure first-class citizens, enforcing type safety on the
       | data and changes to it.
       | 
       | It seems easier to solve that problem via a source-control hook
       | to check a text file, than move over to SQL. But maybe this
       | proposal gets you other things too?
        
       | LaserToy wrote:
       | We discussed writing a plugin for Trino to support Kubernetes
       | operations. Was more like a fun thought experiment. Looks like
       | someone built it...
        
       | whoomp12342 wrote:
       | Isn't the big advantage of infrastructure as code the fact that
       | you can version control it? And isn't it notoriously difficult to
       | version control SQL? Maybe I am missing something.
        
       | nathanwallace wrote:
       | Steampipe (https://steampipe.io) is an open source CLI to query
       | cloud infrastructure using SQL (e.g. AWS, GitHub, Slack, k8s,
       | etc). It also has an HCL language to define security benchmarks
       | and controls (e.g. AWS CIS, etc).
       | 
       | We are a Postgres FDW under the hood with Go-based plugins, so
       | write would be possible, but we've chosen to focus on read only
       | so far. Definitely interested to see how you approach create,
       | update and delete in the SQL model!
       | 
       | Notes: Not related to iasql. I'm a lead on the Steampipe project.
        
         | Aaronstotle wrote:
         | As someone who has to do lots of compliance activities, this is
         | a fantastic tool, thanks!
        
           | nathanwallace wrote:
           | Great to hear :-) Please check out our open source mods (e.g.
           | AWS Compliance, GitHub Sherlock, DigitalOcean Thrifty) - we'd
           | love your feedback & help! https://hub.steampipe.io/mods
        
             | Aaronstotle wrote:
             | Will do! Thanks again
        
         | heinrichhartman wrote:
         | Neat. Does that use steampipe services/API under the hood, or
         | does it consume the AWS/GitHub APIs/... directly?
        
           | nathanwallace wrote:
           | It talks directly to the cloud provider APIs. Plugins are
           | written in Go (similar to Terraform providers).
           | 
           | OSS plugins - https://github.com/topics/steampipe-plugin
           | 
           | Docs - https://hub.steampipe.io/plugins
        
             | heinrichhartman wrote:
             | WOOT! That's awesome.
        
       | bob1029 wrote:
       | This is really interesting to me. I absolutely love SQL and the
       | power it has for modeling complexity.
       | 
       | I also don't think this whole thing has to be absolutely pure
       | either... I would have no problems seeing user-defined functions
       | that have side-effects in this kind of scope. You can certainly
       | wrap a declarative + gateway pattern around domain registration
       | or other global one-time deals, but perhaps exposing that side-
       | effect to the SQL user would encourage more robust application
       | development. Exception handling is something that is very hard to
       | perfectly abstract away in a declarative sense.
        
       | theplague42 wrote:
       | When will an ORM be available?
       | 
       | Still not sure whether this is serious or not, but it's not
       | really infrastructure as SQL, it's infrastructure as database
       | records which is stateful and defeats the point.
        
         | zomglings wrote:
         | Is this sarcasm? I have a broken sarcasm detector.
        
       | brodouevencode wrote:
       | Is the signup form broken?
        
         | pombo wrote:
         | No, just double checked. Why did you think so?
        
           | brodouevencode wrote:
           | Hit the sign up button at the top, scrolls me to the bottom
           | but there's nothing there. Using Brave on Mac, even tried
           | with blockers disabled.
        
             | gigatexal wrote:
             | yeah the button scrolls you to the bottom but there's no
             | way to sign up.
        
               | pombo wrote:
               | Ah, it was a bug on some screens. We think we just fixed
               | it. Were you guys on mobile? Can you check again?
        
       | Dizcorded wrote:
       | This has absolutely no relevance to me but that SQL snippet just
       | bugs the hell out of me. I can't fathom why you would do a
       | subquery in the from section and then not utilize a join. You're
       | just bringing 2 datasets and letting them sit side by side, why
       | not just move the subquery to the select section? The execution
       | plan should come out the same, so I guess this is just a
       | preference thing?
        
         | theplague42 wrote:
         | It's using an implied cross join so each image is deployed onto
         | a t2.micro
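         | 
         | The snippet in question is not reproduced in this thread, so
         | the following is only an illustration of the notation, with
         | hypothetical `image` and `instance_type` tables, run through
         | Python's sqlite3 module.
         | 
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE image (name TEXT);
CREATE TABLE instance_type (name TEXT);
INSERT INTO image VALUES ('ubuntu'), ('debian');
INSERT INTO instance_type VALUES ('t2.micro'), ('t2.large');
""")

# Listing two sources separated by a comma, with no join condition, is an
# implicit cross join: every image row is paired with the one subquery row.
implicit = conn.execute("""
    SELECT image.name, it.name
    FROM image, (SELECT name FROM instance_type WHERE name = 't2.micro') AS it
    ORDER BY image.name
""").fetchall()

# The same query written with the explicit keyword.
explicit = conn.execute("""
    SELECT image.name, it.name
    FROM image CROSS JOIN (SELECT name FROM instance_type WHERE name = 't2.micro') AS it
    ORDER BY image.name
""").fetchall()
```
         | 
         | Both forms return the same rows: one (image, 't2.micro') pair
         | per image.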
        
           | Dizcorded wrote:
           | Ahh, gotcha. I appreciate the response there as I wasn't
           | aware of that notation and even then I can't think of any
           | time I've used a cross join. Not sure which syntax I would
           | use personally.
        
             | disgruntledphd2 wrote:
             | > I've used a cross join
             | 
             | They're good for getting rates on small datasets. Think
             | (select grouper, count(1) from data group by grouper) cross
             | join (select count(1) from data). I think I've mostly used
             | them in interviews, tbh.
        
       | exabrial wrote:
       | If it doesn't support window functions or CTEs I'm out.
        
       ___________________________________________________________________
       (page generated 2021-09-16 23:02 UTC)