[HN Gopher] Select * from cloud
___________________________________________________________________
Select * from cloud
Author : kiyanwang
Score : 524 points
Date : 2022-09-30 13:08 UTC (9 hours ago)
(HTM) web link (steampipe.io)
(TXT) w3m dump (steampipe.io)
| intrasight wrote:
| How does this relate to or differ from Multicorn? I see that in
| most of the "Type" columns in the "specific web wrappers" section
| of the Postgres FDW page
|
| https://wiki.postgresql.org/wiki/Foreign_data_wrappers
| nathanwallace wrote:
| Postgres FDWs are extensions installed in Postgres. Multicorn
| provides an easy way to implement Postgres FDW extensions in
| Python, leading to many FDW implementations. Steampipe runs as
| a Postgres FDW extension with a pluggable interface to easily
| build and manage data sources. Multicorn-based FDWs might be
| installed as part of your existing Postgres server, while
| Steampipe is optimized to run more standalone / alongside your
| existing Postgres server.
| almenon wrote:
| Does this have autocomplete in the terminal?
| nathanwallace wrote:
| "steampipe query" has auto-complete for SQL queries in the
| interactive terminal - https://steampipe.io/docs/query/query-
| shell#autocomplete
|
| "steampipe service" is a standard Postgres endpoint, so works
| with any BI tool or SQL client and their autocompletion
| capabilities - https://steampipe.io/docs/query/third-party
|
| "steampipe completion" provides terminal completion for the
| Steampipe CLI -
| https://steampipe.io/docs/reference/cli/completion
| serhatcan wrote:
| Great to see steampipe here :)
|
| Founder of Resmo here!
|
| Select * from cloud & SaaS + a fully managed SaaS platform with a
| free offering. Make sure to check us out!
| swalsh wrote:
| Since starting my career, the tech stack I use has changed
| numerous times. Technologies have come, gone, evolved. But except
| for a brief period in 2010 where we thought we could kill it, SQL
| is the one technology that has stayed around. It might be the
| only TECHNICAL knowledge I learned in the first year of my career
| that is still relavent. You're probably not making the wrong
| decision if you decide to build on top of SQL.
| that_guy_iain wrote:
| Honestly, mines has mostly stayed the same. I've played with
| various techs but I mostly work the same underlying tech. Some
| items get added but very rarely does something get replaced or
| removed. The tech gets improved and new features added .
|
| The last time I can remember something getting replaced is
| vargrant for docker. And honestly, I think that's a mistake for
| many companies. Before that it was git for svn.
|
| The only way I can see tech coming and going is you're always
| an early adoptor and need to replace things that failed. If you
| build on stable tech, that stuff stays around for decades even
| if it was terrible in the first place. A good example is PHP.
| It was terrible at first still around and improving year on
| year.
| arjvik wrote:
| Why do you think Vagrant->Docker was a mistake?
| that_guy_iain wrote:
| Most companies have a large amount of devs on macbooks.
| Docker on the macbook has historically been painful. Mainly
| due to the filesystem. Whereas, vagrant doesn't really have
| that problem. I've been using Linux for so long that I
| haven't really experienced the pain myself. But I have
| gotten to enjoy the smugness of them all complaininga about
| their performance while mines is super snappy.
|
| Then if you look at the toolchain surounding vagrant such
| as packer. It looks really nice. Docker doesn't seem so
| nice. In fact, I still use packer to build my docker
| container images because I like the setup nicer.
|
| With Docker it's all about the production env. I see the
| benefits for prod but for dev when you're literally not
| using the same docker setup as you would in prod it makes
| no real sense to me.
| 0xbadcafebee wrote:
| Vagrant still allows people to have deviating personal
| environments, which destroys productivity when people
| have to constantly diagnose what changes they have in
| their local env don't match other things. Vagrant+Docker
| is a decent compromise. Best is cloud-native development.
| If everyone [on a single team] SSH's into one box in the
| cloud with one set of dependencies, everyone gets the
| same environment every time, and updates go out to
| everyone immediately without fail. K8s has similar
| projects to let you develop in a pod on the actual
| cluster, eliminating difference between dev<->prod.
| robertlagrant wrote:
| Docker was painful on Mac, but quite a while ago. I don't
| have any problems these days.
| that_guy_iain wrote:
| The last complaints I heard were about a year ago. Which
| point those devs left.
| marcrosoft wrote:
| Can you elaborate on why vagrant to docker is a mistake? I
| find the docker workflow infinitely better and faster to
| iterate for making dev environments.
| pphysch wrote:
| If your deployment target is a VM/bare-metal rather than
| container, Vagrant makes a lot of sense for development.
|
| E.G. my dev Vagrantfile fits in a small screen, and
| provisioning is handled by a single Ansible playbook that
| works for all environments (using conditionals). So there's
| no need to maintain a separate dev container, and a lot of
| subtle headaches are avoided in the deployment process
|
| Setting up the dev environment consists of running `vagrant
| up` (assuming Vagrant/Ansible are installed). That's it.
|
| You get all the benefits of IaC (fast ~reproducible dev)
| without shoehorning containers in.
| alfalfasprout wrote:
| It also comes with a lot of downsides and significantly
| increased complexity in many cases. Depends what you're
| doing.
| strbean wrote:
| > except for a brief period in 2010 where we thought we could
| kill it
|
| Unfortunately, we never actually tried to kill SQL. We tried to
| kill relational databases, with the death of SQL as a side
| effect. This proved to be a stupid idea - non-relational data
| stores have value of course, but so do relational databases.
|
| SQL, as you point out, is 48 years old. It shows. The
| ergonomics are awful. It uses natural-language inspired syntax
| that has fallen completely out of favor (with good reason
| imho). The ordering of fields and tables precludes the use of
| autocomplete. It uses orthogonal syntax for identical
| constructions, i.e. INSERT ... (col1, col2,
| ...) VALUES (val1, val2, ...)
|
| vs. UPDATE ... col1=val1, col2=val2, ...
|
| And worst of all, it comes with such a woefully inadequate
| standard that half of the universally supported features are
| expressed in a completely different fashion for each vendor.
|
| It is very common to argue that its longevity is due to its
| excellence as a language. I think it's longevity is due to its
| broad target audience, which includes plenty of people in less
| technical roles who are high on the org chart. Tech workers are
| happy to jump on new technology that improves ergonomics (or
| shiny new toys that are strictly more interesting because they
| are new). Executives who's primary way of interacting with the
| tech stack is querying a read only replica are much less open
| to learning a new language.
| sanderjd wrote:
| I think your last sentence gets at the crux of it. I think
| SQL is truly great for that use case of analytical queries
| over a read only database / data warehouse / parquet /
| whatever (ie. "OLAP" use cases). I think it's less great for
| the "OLTP" use cases. It still works fine, which is why it
| sticks around, but other approaches are more convenient for
| doing lots of smaller inserts, updates, and lookups and fewer
| table scans and aggregations.
| mfashby wrote:
| No mention of PRQL yet, it's an effort to tackle some of
| these shortcomings
|
| https://prql-lang.org/
| denysonique wrote:
| > It uses orthogonal syntax for identical constructions
|
| There is an alternative INSERT syntax matching the UPDATE SET
| one: INSERT INTO table SET col1='val1',
| col2='val2';
| xwolfi wrote:
| I disagree, the people high on the org chart are clamouring
| for cloud-based low structure stores on microsoft azure cloud
| maintained in India, while we warn them that we can migrate
| at a giant cost for a net loss in efficiency.
|
| SQL and dbs are fine, they work, we could discuss beautifying
| sql for sure, but frankly we shouldnt solving data
| structuring problems that were already solved 50 years ago:
| we have real business to support instead.
| lelanthran wrote:
| > It uses orthogonal syntax for identical constructions, i.e.
|
| Regarding your complaint ... those aren't identical
| constructions.
|
| Insert has to have the syntax it does because it lets you
| insert multiple rows all with different values in a single
| statement: INSERT ... (C1, C2) VALUES (v1,
| v2), (v3, v4), (v5, v6)
|
| Update doesn't allow that. The syntax for each of the above
| makes sense _for the specific intention_ of each statement.
| RussianCow wrote:
| They could have consolidated the syntax for both, though.
| No reason you couldn't have this instead:
| INSERT ... VALUES (C1=v1, C2=v2), (C1=v3, C2=v4), (C1=v5,
| C2=v6)
|
| This would actually be more powerful because it would allow
| you to arbitrarily omit columns on a per-row basis, which
| you can't do with SQL today.
| layer8 wrote:
| It would maybe be more powerful, but also less efficient.
| If you want to insert different sets of columns, you just
| use multiple INSERT statements. The point of the current
| syntax is to be able to efficiently bulk-insert many rows
| using the same set of columns. That set of columns needs
| to be validated and mapped to the internal offsets (or
| whatever) only once for inserting an arbitrarily large
| number of rows, instead of having to repeat that
| validation and preprocessing for each row.
| smallnamespace wrote:
| You can omit columns on a per-row basis with the DEFAULT
| keyword, it's just more cumbersome if your rows are very
| sparse. Your proposed syntax of: INSERT
| ... VALUES (C1=v1, C2=v2), (C3=v3, C4=v4), (C5=v5, C6=v6)
|
| in standard SQL is INSERT ... (C1, C2,
| ... C6) VALUES (v1, v2, DEFAULT, DEFAULT, DEFAULT,
| DEFAULT), (DEFAULT, DEFAULT, v3, v4, DEFAULT, DEFAULT),
| ...
| aidos wrote:
| And further, insert ... (select ...)
| duxup wrote:
| Hey man if SQL is so great why are people always trying to
| kill it??!?!?
|
| /s
| jayd16 wrote:
| I also feel like part of the staying power of SQL is because
| it's the lingua franca of the database. I wonder if it's time
| to build and popularize some lower level language that could
| be targeted by ORMs and others to speak to DB engines. If we
| had a more flexible assembly language we might get more
| freedom at the programmer language level.
| scarface74 wrote:
| C# already does that, it converts LINQ to expression trees
| that is then converted by providers to their target query
| language. I've used it on one project to pass around the
| same query expressions to SQL server and Mongo - you pass a
| type of IQueryable<T> which represents an expression tree.
|
| For instance
|
| var seniors = query(from u in Users where u.Age > 65 &&
| u.Sex == "male" select u)
|
| IEnumerable<Users> query(IQuery<Users>)
| johannes1234321 wrote:
| MySQL has something in that direction, where some
| operations can be expressed as expression trees by using
| Protobuf where a client can put some arbitrary query
| language on top:
| https://dev.mysql.com/doc/internals/en/x-protocol-
| messages-m... it is still a bit limited, though (like no
| join ...)
| mastersummoner wrote:
| When I was first learning to code, I once told someone that
| working with SQL felt like trying to wrestle with a
| mainframe, and I stand by that assertion.
| omegalulw wrote:
| You are selling SQL way short. SQL has stuck around because,
| compared to its complexity, it lets you do incredibly
| powerful things.
| strbean wrote:
| Look at the analogy to general purpose programming
| languages.
|
| _Every single one_ , without exception, lets you do
| incredibly powerful things compared to its complexity.
|
| This hasn't stopped the progression of general purpose
| programming languages, and I think the vast majority of
| developers would agree that there have been huge
| improvements in the last 50 years.
| duxup wrote:
| It's also great that much of the time if I come up with a
| crappy query... SQL knows how to optimize it for me. Easy
| to use, and fixes my stupids.
| wyager wrote:
| I have plenty of beef with SQL, but your insert versus update
| example doesn't make sense to me. That's a perfectly
| reasonable difference; insert is always a dense operation
| while update is usually a sparse operation.
| [deleted]
| netule wrote:
| It's also the simplest use case of INSERT imaginable and
| fails to be applicable with a slightly more complex one:
| INSERT INTO table1 (col_1, col_2, col3) SELECT
| COUNT(col_6), col_4, col_5 FROM table2 INNER JOIN...
| dllthomas wrote:
| I think in principle we could allow
| INSERT INTO table1 col_1 =
| COUNT(table2.col_6), col_2 = table2.col_4,
| col3 = table2.col_5 FROM table2 INNER JOIN ...
| codegeek wrote:
| To my primitive brain, NoSQL always sounded like "I am going
| to makeup data model and integrity on the fly and let the
| user figure out how to handle it". No Thanks. But may be I
| don't get it. GraphQL honestly gives me the same vibes. Happy
| to be corrected.
| catlifeonmars wrote:
| Just to be pedantic you can use SQL to query NoSQL
| databases, just look at the PartiQL ExecuteStatement API
| for DynamoDB. I assume your question is about why anyone
| would use a NoSQL databases instead of a relational
| database. The short answer is scale. If you have well
| define access patterns that rely only on a primary index,
| you can effectively scale to billions of QPS. If you want
| to make relational queries, and/or can't know your access
| patterns in advance, use a relational database.
| blagie wrote:
| "Replacements" for SQL have been around as long as SQL.
| Remember the OODBMS rage of the eighties and nineties?
|
| https://en.wikipedia.org/wiki/Object_database
|
| I also wanted to kill old, obsolete, crummy SQL when I
| first learned it as a teenager. It reminded me of FORTRAN
| or COBAL or something.
|
| It took quite a lot of theory and maturity to understand
| why RDBMS is the right paradigm. SQL is imperfect, but it's
| good enough to not be worth replacing.
|
| Whenever I've seen benchmarks of postgresql managing JSON
| versus mongodb, postgresql won (at least comparing apples-
| to-apples; postgresql is much more conservative by default,
| in terms of reliability and data consistency, than
| mongodb). There's literally no reason to going with a lot
| of NoSQL databases over using a SQL database as a key-value
| store.
| flockonus wrote:
| > SQL is imperfect, but it's good enough to not be worth
| replacing.
|
| Great way to put it.
|
| > There's literally no reason to going with a lot of
| NoSQL databases over using a SQL database as a key-value
| store.
|
| To be devil's advocate, the arguments I hear range more
| on ease of ops/scalability for noSQL, apparently some do
| it better than SQL implementations, and some drivers can
| make trivial operations easier as well. That said, I'd
| pick PostgreSQL for pretty much any new project I get to
| work on.
| strbean wrote:
| > I also wanted to kill old, obsolete, crummy SQL when I
| first learned it as a teenager. It reminded me of FORTRAN
| or COBAL or something.
|
| > It took quite a lot of theory and maturity to
| understand why RDBMS is the right paradigm.
|
| You and the guy you are replying to are doing it again,
| conflating SQL and relational databases. :(
| xapata wrote:
| Sometimes you just want to store data and worry about
| validation later. I'm in that situation now, with different
| tasks requiring different kinds of validation for the same
| data. Sometimes I want all the postal codes that users have
| entered, sometimes I want only the valid postal codes.
| tlarkworthy wrote:
| NoSQL is like simple assignment to storage and one layer of
| obvious indexing. Its lower level, and therefore much
| faster and predictable because there is no query planner
| guessing how to assemble things.
|
| Tuning SQL on the other hand is a guessing game with the
| query planner. You can express _a lot more_ but 1% of those
| expressions could down your DB.
|
| I like both, but if you want performance NoSQL is the
| winner.
| the_af wrote:
| > _if you want performance NoSQL is the winner._
|
| Performance for _which use cases_?
|
| It matters to clarify this because if you get rid of
| every use case SQL+RDBMS optimize for, NoSQL is obviously
| going to be the winner. A car with no safety measures and
| without most consumer-oriented features is probably going
| to be "faster" than one that has them.
| RussianCow wrote:
| > GraphQL honestly gives me the same vibes.
|
| Not that this is particularly relevant to the discussion,
| but GraphQL is actually much closer to SQL than most NoSQL
| solutions because it's statically typed and the schema is
| statically defined. The GraphQL schema acts as the contract
| of what the API can serve to the client, much like a schema
| in a relational database.
| levkk wrote:
| SQL expresses relational algebra [0] which is the language
| defining data relationships and manipulation. It hasn't
| changed in 50 years for the same reason why B-Trees and
| sorting algorithms haven't: it's foundational to computer
| science.
|
| Database-specific "SQL extensions" are in my experience just
| administration commands, e.g. `VACUUM` or `CREATE EXTENSION`
| in Postgres. They help operate the DB, but have little to do
| with the actual data manipulation.
|
| Killing SQL is like trying to kill computer science: you
| better come up with something better than a synonym.
|
| [0] https://en.wikipedia.org/wiki/Relational_algebra
| dllthomas wrote:
| Nothing about relational algebra says you have to put your
| projection before you specify what relations are involved.
| SQL is a bad fit for modern tooling in ways that are a
| property of the language and are unrelated to it expressing
| relational algebra.
|
| The notion that nothing has changed about B-Trees or
| sorting algorithms doesn't, I think, actually hold up...
| but this question is more like changing the API for a
| standard-ish B-Tree library than about modifying how the
| B-Tree itself works.
| scarface74 wrote:
| And that's why Microsoft's LINQ puts the entities (table
| names) first and then autocomplete just works.
| https://learn.microsoft.com/en-
| us/dotnet/csharp/linq/perform...
|
| As an aside: LINQ is the only "ORM" that is truly worth
| using. It is integrated into the language and you can use
| the same expressions and query expressions are first
| class types that can be passed around as expression trees
| and are then only translated either as code if you're
| using in memory lists, database specific SQL, MongoQuery
| etc based on the provider you pass the query to.
| blacklight wrote:
| Db-specific extensions unfortunately aren't only admin
| commands. They include the whole PL/SQL thing, with
| concepts like triggers and stored procedures that should
| definitely be fundamental in SQL, but that nobody bothered
| to standardize, and eventually everybody implemented as
| they liked.
| Nitramp wrote:
| SQL does express relational algebra, but OP is arguing it
| expresses it poorly, in a cumbersome way. There's nothing
| foundational about how exactly SQL chooses to express
| relational algebra, you could imagine many different
| syntaxes and specific semantics.
|
| I concur, SQL is a liability in many ways.
| bryanrasmussen wrote:
| >Killing SQL is like trying to kill computer science
|
| SQL is a language that expresses relational algebra,
| killing it is like killing any other language i.e. nearly
| impossible because there is still code sitting around
| written in it, thus needing competent programmers to
| maintain, competent programmers who will be asked to write
| more code and will choose to do it in SQL if they can get
| away with it, rinse and repeat and so forth.
|
| Once a language has a certain installed base it may be
| eternal - what size that installed base is I don't know.
| samatman wrote:
| The Third Manifesto[0] is 27 years old at this point. If it
| were possible to take seriously the proposition that SQL
| and relational algebra over databases were equivalent
| before its publication, it hasn't been since.
|
| [0]: https://en.wikipedia.org/wiki/The_Third_Manifesto
|
| I've been trying to get my hands on a corpus of Dataphor
| code, because Tutorial D is unsatisfactory for different
| reasons than SQL. What I've been able to glean of Dataphor
| is quite promising, but it isn't much.
| matwood wrote:
| > It might be the only TECHNICAL knowledge I learned in the
| first year of my career that is still relavent.
|
| Agree. IIRC, I wrote my first SQL is 1996 and wrote some today.
| While not perfect, it's amazing at what it does. I suggest
| reading Joe Celko to really up your SQL skills.
| api wrote:
| That's not random. SQL is basically math. The syntax is a
| little old school and arcane (but not terrible) but the
| principles behind it are set theory and logic.
| tiffanyh wrote:
| Declarative vs Imperative Language.
|
| A huge benefit of SQL is that it's a declarative language (you
| describe _what_ the output should be); whereas with imperative
| languages like C /Python/Ruby you have to describe _how_ to
| generate the output with specific instructions and procedures.
| jooz wrote:
| I consider SQL imperative. You say exactly how to do what you
| want, not what do you want.
|
| If we take as example a table with a single row:
|
| In imperative (SQL):
|
| INSERT INTO TABLE1 (name, wage) VALUES ('John','50k');
|
| Then one day John gets a raise:
|
| UPDATE TABLE TABLE1 SET wage = '60k' where name = 'John'
|
| In declarative (pseudo code in yaml):
|
| - table: TABLE1 - name: John wage: 50k
|
| Then one day John gets a raise:
|
| - table: TABLE1 - name: John
| wage: 60k
| zmmmmm wrote:
| It's sort of a weird hybrid monster really, some times it's
| the worst of both worlds.
|
| You state what you want, yes - but you are forced to
| articulate it as a specific set of table navigations /
| logistics through the relational model, as if you were
| writing the implementation. Then, however, the database
| then may choose ignore those and do something else to
| resolve the data you asked for if it wants, if it can prove
| the outcome is equivalent.
|
| For example I want all the ice creams bought by John. Even
| though the schema knows the foreign key relationship
| between "user" and "purchase" I have to tell it back to the
| database engine in my query. But even after doing that
| there's no requirement the database will actually implement
| the steps I was forced so ungraciously to specify. It may
| "optimise" them away and do something else.
| tiffanyh wrote:
| You're describing to the RDBMS _what_ you wanted added to
| the database. The SQL /RDBMS takes care of how that's done.
|
| In C, you'd have to program _how_ the data should be stored
| (data structure) and written to disk.
| jayd16 wrote:
| There's plenty of improvement to be had while still keeping
| it declarative.
| 411111111111111 wrote:
| You might notice that the only way to create a declarative
| language is to use an Imperative language.
|
| Declarative languages are basically a DSL, which (hopefully)
| translate the desired steps into efficient instructions.
| Nonetheless, your cpu will execute imperative code at the
| end.
|
| SQL is an example of a very well established and generally
| well done declarative language, but that doesn't mean that
| declarative languages are inherently better.
| acmiyaguchi wrote:
| There's nothing stopping you from writing a compiler or
| interpreter in a non-imperative language like prolog.
|
| Also performing the equivalent of a table join in a non-
| declarative query language is going to be a burdensome
| task. It's certainly beneficial to let a query planner
| figure out the details for you, instead of iterating over
| all the rows of your data.
| 411111111111111 wrote:
| Which compiles it to bytecode, which is Imperatively
| executed on your cpu.
|
| You didn't think that one through, did ya?
|
| I'm not even sure where your outage comes from. DSLs
| aren't inherently bad either.
| pjmlp wrote:
| Design and analysis of hardware for high-performance
| Prolog
|
| https://www.sciencedirect.com/science/article/pii/S074310
| 669...
| samatman wrote:
| Furthermore, the only way to implement an imperative
| language is to use a declarative language.
|
| Verilog and VDSL, you see, are declarative. They have to
| be.
|
| It's not a better or worse thing. It's a domain thing.
| 0xbadcafebee wrote:
| People keep twisting themselves into knots trying to fit
| things into these labels, but the net effect is unrelated
| to the description.
|
| Using a "declarative language": my_bucket =
| aws_s3_bucket(aws_region, bucket_name)
|
| Using an "imperative language": if
| aws_s3_bucket.exists(aws_region, bucket_name):
| my_bucket = aws_s3_bucket.update(aws_region, bucket_name)
| else: my_bucket = aws_s3_bucket.create(aws_region,
| bucket_name)
|
| I prefer the latter. In every DSL I've ever used, people
| end up needing to handle weird edge cases, and it's
| extremely hard to do that with a "declarative-only" DSL, so
| they end up adding imperative-ness to the DSL. Give me a
| regular "imperative" programming language and lots of
| convenience functions that do black magic behind the
| scenes, and I'll do regular programming when the black
| magic falls short.
|
| This is basically why AWS CDK / Terraform CDK / Pulumi
| exist.
| mirekrusin wrote:
| ...and you have nice imperative race condition right
| there in your simple example.
| 0xbadcafebee wrote:
| From a simplistic point of view, yes, but from a
| practical standpoint, no. IaC assumes a global lock on
| any resources being managed. As soon as that expectation
| is violated, everything starts breaking, regardless of
| how good the code is.
| skwirl wrote:
| It means you can let the database engine determine the best
| way to get you what you are after. If you write imperative
| code, you are telling the DB exactly what to do and you
| don't really give it much ability to optimize your queries.
| Before SQL most DB engine queries were imperative and it
| was terrible... which is why we don't do it anymore.
| freedomben wrote:
| I think you're reading into something that isn't there. GP
| never said anything about declarative being _better_ than
| imperative, merely that they are different approaches and
| SQL is declarative.
|
| But even if they had, your argument that the CPU ends up
| running imperative code makes it better seems silly. The
| CPU ends up "running" machine code and a compiler has to
| turn the vast majority of imperative code into a different
| form. Does that make machine code better than assembly?
| better than C?
|
| I don't think so. They are simply different levels of
| abstraction, each with their own pros and cons. Neither are
| "better" than the other. Each are "better" at some tasks
| and worse at others.
|
| GP said there was a "benefit" with SQL being declarative,
| because when you want to abstractly request some
| arbitrarily structured data, it's beneficial in most cases
| to not have to know exactly how to find and retrieve that
| data. It's certainly not "better" if you need to ensure
| some specific bit format on the hard drive. But it's
| "better" if you want to succinctly express a query that is
| broadly reusable and understandable even to people who have
| no idea how the database works on the inside.
| 411111111111111 wrote:
| You're reading something into my comment I didn't want to
| say too
|
| I just felt the need to point out that declarative
| languages are essentially always a DSL, wherever this DSL
| is actually performant and should be used depends on it's
| implementation.
|
| Generally speaking, SQL is very well implemented so using
| any of the well established databases is probably a good
| choice. Nonetheless, few declarative languages come even
| close to SQLs efficient implementation so they're very
| rarely the answer.
| irrational wrote:
| I've been using HTML since the mid 1990s. That is probably the
| oldest technology that I still use on a regular basis. SQL
| would be a close second (or maybe third - I probably started
| learning sql around 2001 and I can't remember when JavaScript
| came out. Surely before that.)
| jrm4 wrote:
| As above has noted, Both HTML and SQL are effectively
| _declarative,_ hence their staying power.
| irrational wrote:
| And JavaScripts staying power is backwards compatibility
| for billions of web pages, don't want to introduce another
| language to the browser because it would cost a lot of
| money for developers to build the engine and would increase
| the security attack surface, and all the browsers would
| have to add a new engine around the same time so developers
| would be willing to use it.
| youainti wrote:
| And yet webassembly seems to be going somewhere.
| irrational wrote:
| It isn't replacing JavaScript. At the very least you
| still need JS to access the DOM. And, for backwards
| compatibility, I don't see JS going away during our
| lifetimes. And I don't see another language being added
| to browsers in place of JS (not including webassembly,
| which is not a replacement for JS).
| rubyist5eva wrote:
| I feel my abilities as a developer really took off when I took
| the time to really learn SQL and take advantage of the features
| that my database offers. For example, learning to use window
| functions felt like that galaxy brain meme where you become one
| with the universe.
|
| Being able to take 50 lines of ruby code (manual (anti-)joins
| and aggregates, result partitioning, etc...) and replace it
| with a couple lines of sql that is much faster and less buggy
| is a life changing experience.
|
| The only other time I had such a dramatic shift in the way that
| I look at building applications is when I learned Scala
| (functional programming).
| munificent wrote:
| I'm sure this will ruffle some feathres, but the technical
| skill that has been the most persistently valuable across the
| past 25 years of my career is object-oriented programming and
| design. It's not the only paradigm I use, but being able to
| organize large programs in terms of OOP has been incredibly
| valuable and hugely productive for me across several languages
| and domains.
| pmarreck wrote:
| Tell me you've never worked on large FP codebases (as a point
| of comparison) without telling me you've never worked on
| large FP codebases ;)
| munificent wrote:
| That's true, I haven't.
|
| Weird how many large OOP codebases there are to find
| oneself working on and how few large FP ones. :) Maybe
| that's just coincidental, or maybe not.
| avgcorrection wrote:
| Just a slow-to-die 90's fad in the end.
| kwhitefoot wrote:
| OOP is considerably older than '90s. Turbo Pascal 5.5 had
| objects in 1989 and was far from the first.
|
| Anyway structs plus FP is no less object oriented. You just
| don't have to write the methods in the same code module.
| See NIM Method Call Syntax for example: https://nim-
| lang.org/docs/tut2.html#object-oriented-programm...
|
| Yes, I know NIM is not purely functional. Use F# instead to
| get a bit closer to purity, it also does OOP quite nicely
| as does pretty much any functional language that has
| closures.
|
| OOP is a tool just as Git is a tool.
|
| See also: https://medium.com/extreme-programming/oop-vs-
| fp-182475457a0...
| avgcorrection wrote:
| > OOP is considerably older than '90s.
|
| Fad means popular (in a deragetory way). 90's was its
| popularity heyday.
|
| > See NIM Method Call Syntax for example: https://nim-
| lang.org/docs/tut2.html#object-oriented-programm...
|
| So oop is when you `obj.methodName(args)` instead of
| `methodName(obj, args)`? Well if that's all it is about
| then I can get behind it...
|
| > OOP is a tool just as Git is a tool.
|
| An iron maiden is just a tool. Don't be hatin' it's just
| a tool what has it done to you?
| kortex wrote:
| On the contrary, I think that's a self-fulfilling perception.
| If assets are already organized as objects, then it'll
| continue to work well with oop. But if you take the approach
| from the get-go to use data and FP approaches, things can be
| quite nice. Examples:
|
| - interface/trait based programming / structural subtyping
| (widely used in go/rust, increasingly in TS and python)
|
| - terraform
|
| - react
|
| - aws lambda / cloud functions
|
| - flow-based data processing (the whole of deep learning,
| spark/hadoop)
|
| - and of course, anything declarative DSL based (SQL, jq and
| friends)
|
| So I would counter that the more valuable skill is "how do I
| solve problems in terms of applying and composing transforms
| to data"
|
| To clarify, since everyone has their own definition of OOP,
| and of the four pillars, Abstraction, Polymorphism, aren't at
| all unique to OOP, and Encapsulation is just Abstraction: the
| defining features of OOP are inheritance and poking-and-
| prodding-state into opaque objects. Inheritance is subsumed
| by interfaces / structural subtyping, and poking at state is
| contrasted with reactor patterns, event sourcing, persistent
| data structures, etc.
|
| Oop really shines at the middlin-low level, in languages
| without a lifetime (state for things like IO resourcese, at
| the GUI widget level, and the (micro)service level, which is
| more like the original smalltalk sort of objects, in which
| case inheritance isn't a think.
| munificent wrote:
| _> On the contrary, I think that 's a self-fulfilling
| perception._
|
| So is using SQL. My point, as well as the parent author's
| point, is that I've found a technology that has been a
| useful source of value for my entire career. I make no
| claims that there aren't other useful sources of value, but
| I've never ever for a second regretted the time I've spent
| getting better at writing object-oriented code.
|
| I'm also quite comfortable writing functional and dataflow-
| oriented code, but I've never found those to be as exciting
| as some people seem do.
| Scarbutt wrote:
| React is implemented using OOP, which I think is great for
| libraries that are defining and defending boundaries.
| continuational wrote:
| Hooks is not really OOP. The original OOP interface was
| abandoned because it lead to a very complicated lifecycle
| that was hard to reason about.
| Scarbutt wrote:
| I'm referring to the implementation of React, not its
| API.
| yamtaddle wrote:
| Can confirm, unless it's changed radically since I last
| looked at it, yes, everydamnthing in React ends up as an
| object representing a component, and I don't just mean in
| the way that JS functions are technically also objects.
| pjmlp wrote:
| Except the little detail that JavaScript is an OOP
| language in the linage of SELF, where even basic types
| have methods.
| mirekrusin wrote:
| No, it's not.
| yamtaddle wrote:
| It is, in large part, right down at what you might call
| the heart of the whole thing. Though I haven't read the
| React source code since a bit after Hooks were added so I
| guess maybe that's changed--though I doubt it.
| Jweb_Guru wrote:
| Everyone has a different experience! Object oriented
| programming and design have had approximately no impact on my
| own career, while SQL knowledge has been indispensable. I
| don't think they're really in competition.
| Scarbutt wrote:
| I can see how can that be the case if one never works with
| products that rely heavily on having a database.
| munificent wrote:
| Yes, my career is possibly a bit unusual in that I've
| mostly not worked on heavily database-driven stuff. Mostly
| client applications, games, tools, and programming
| languages.
| quickthrower2 wrote:
| The problem I see here is building SQL on top of the cloud,
| which is like building SQL on top of a database where the
| tables are always changing - not just adding columns but the
| meaning of the columns and rows, or even the existence of the
| table.
|
| I think this is a good idea, but the cloud doesn't seem to care
| too much about backwards compatibility. If you use
| MySQL/PostgreSQL/SQL Server, you can be fairly sure in 20 years
| time your SQL will still work :-).
|
| The need for this library is more an indictment on "the cloud"
| than anything. I am mostly an Azure user so AWS/Google might be
| better but man it changes it's UI and CLI interfaces alot - way
| too much.
| sleiben wrote:
| Reminds me of Git.
| shepherdjerred wrote:
| Git is surprisingly only 17 years old while SQL is 48 years.
| Not that you're wrong, it's just interesting how
| comparatively newer it is
| marcosdumay wrote:
| Also, git had major competition for almost half of that
| time where it was not clear at all what software people
| would settle on (or even if they they would actually settle
| down).
|
| SQL was nearly undisputed its entire life. It quickly
| killed every previous architecture, and everything that
| came after it made a point on being compatible.
| TremendousJudge wrote:
| >It quickly killed every previous architecture
|
| This feat was much easier to achieve when the industry
| was much, much smaller, and a single company mostly owned
| the entire business sector (who were the people that
| needed databases in the first place).
| marcosdumay wrote:
| Well, a single company with a huge unexplainable help
| from the US government.
|
| But SQL stayed unopposed due to its qualities, not
| because of its iffy start.
| swalsh wrote:
| Git is a bit newer, and there have been many attempts to
| solve the same problem it solves, but it solves that specific
| problem VERY well... it'll be a long time before it's
| dethroned in my opinion.
| irrational wrote:
| Eh, I've seen various things like Git come and go. I spent
| the majority of my career in the days before git existed and
| we used things like SVN. I wouldn't be surprised to see Git
| supplanted. I would be surprised to see SQL supplanted.
| TremendousJudge wrote:
| SQL is at least 30 years older than Git. Git has yet to prove
| its longevity in the same way.
| nightski wrote:
| That statement will be true forever. It doesn't tell you
| much.
| TremendousJudge wrote:
| Let me restate in relative terms then. SQL is over twice
| as old as Git. When Git was originally released, SQL had
| been around for longer than Git has been today. Git has
| had a pretty good track record, but it's not comparable
| yet in terms of age or even in terms of mass adoption --
| plenty of orgs don't use Git to manage their code.
| nathanwallace wrote:
| Excited to see Steampipe shared here - thanks kiyanwang! I'm a
| lead on the project, so sharing some quick info below and happy
| to answer any questions.
|
| Steampipe is open source [1] and uses Postgres foreign data
| wrappers under the hood [2]. We have 84+ plugins to SQL query
| AWS, GitHub, Slack, HN, etc [3]. Mods (written in HCL) provide
| dashboards as code and automated security & compliance benchmarks
| [3]. We'd love your help & feedback!
|
| 1 - https://github.com/turbot/steampipe 2 -
| https://steampipe.io/docs/develop/overview 3 -
| https://hub.steampipe.io/
| petercooper wrote:
| It's a very cool project!
|
| It might just be a coincidence, but an hour before this HN
| post, I discovered it way back in our queue of things to review
| for https://golangweekly.com/ and featured it in today's issue.
| Hopefully kiyanwang is one of our readers :-D
| nathanwallace wrote:
| Awesome - thanks for the shout out in golangweekly!
| petercooper wrote:
| No worries, it's my job :-) I'm just so intrigued to see
| (well, guess, in this case) how the path of word of mouth
| works because most of the time it's a complete mystery!
| VectorLock wrote:
| Are people usually setting up centralized shared instances of
| Steampipe or is it more of a "run on everyone's laptop"
| deployment preferred?
| rmetzler wrote:
| My guess is, that it makes sense to get regular reports, e.g.
| weekly. But you also want to experiment and develop queries.
| So probably both. Not sure if there is something like a
| notebook for steampipe.
| nathanwallace wrote:
| Because it's just Postgres, connecting Jupyter notebooks to
| Steampipe via SQL works really well :-)
| nathanwallace wrote:
| We see users running Steampipe on their desktop, in pipelines
| (e.g. to scan Terraform files) and also as a central service.
| See "service mode" in the CLI for a Postgres endpoint and a
| dashboard web server [1] or try docker / kubernetes [2]. We
| also have a cloud hosted option with dashboard hosting,
| multi-user support, etc [3].
|
| 1 - https://steampipe.io/docs/managing/service 2 -
| https://steampipe.io/docs/managing/containers 3 -
| https://steampipe.io/docs/cloud/overview
| klysm wrote:
| Love to see Postgres FDW used. It's a really powerful and imo
| not utilized as much as it could be.
| ezekg wrote:
| This is awesome. Makes me want to write a plugin for my SaaS,
| just because it looks fun.
| nathanwallace wrote:
| I'm biased, but writing plugins is really fun - and then you
| can create mods / dashboards for your users as well! Please
| give it a go - we have a guide [1], 84+ open source plugins
| to copy [2], many authors in our Slack community [3] and
| we're happy to get your plugin added to the Steampipe Hub
| [4].
|
| 1 - https://steampipe.io/docs/develop/writing-plugins 2 -
| https://github.com/topics/steampipe-plugin 3 -
| https://steampipe.io/community/join 4 -
| https://hub.steampipe.io/plugins
| breck wrote:
| This is amazing. Can't believe I hadn't seen it before. Nice
| job
| cube2222 wrote:
| Hey! Steampipe looks great and I think the architecture you
| chose is very smart. Some feedback from myself:
|
| I've tried setting up steampipe with metabase to do some
| dashboarding. However, I've found that it mostly exposes
| "configuration" parameters, so to say. I couldn't find dynamic
| info, like S3 bucket size or autoscaling group instance count.
|
| Have I done something backwards or not noticed a table, or is
| that a design decision of some sort? That was half a year ago,
| so things might've changed since then, too.
| nathanwallace wrote:
| Steampipe mostly exposes underlying APIs or data through SQL.
| So, "select * from aws_s3_bucket" will return bucket names,
| tags, policies, etc all together. But, bucket size is not a
| direct API so doesn't have a direct table.
|
| In some high value cases we've added tables to simplify /
| abstract / normalize data - for example AWS IAM policies -
| https://steampipe.io/blog/normalizing-aws-iam-policies-
| for-a...
|
| This example query returns the number of instances attached
| to an autoscaling group - https://hub.steampipe.io/plugins/tu
| rbot/aws/tables/aws_ec2_a...
|
| BTW, we recently published a Metabase integration guide -
| https://steampipe.io/docs/cloud/integrations/metabase
| ComodoHacker wrote:
| What about joins, are they supported? Can't see them in the
| examples.
|
| The real power of SQL is locked until you can join different
| data sources.
| nathanwallace wrote:
| Yes - joins work across tables and schemas! This (toy)
| example connects IAM user records with Slack user data:
| select u.name, s.id, s.display_name from aws_iam_user
| as u, slack_user as s where u.name = s.email
| judell wrote:
| Here is one of our favorite examples:
| https://steampipe.io/blog/use-shodan-to-test-aws-public-ip
| chousuke wrote:
| Steampipe is pretty much "just" PostgreSQL with a foreign
| data wrapper and some fancy extras on top. The data is in
| tables from the database's perspective, so pretty much
| everything you can do with PostgreSQL, you can do with
| steampipe, including creating your own tables, views,
| functions and whatnot.
| davesque wrote:
| Hey Nathan. Can you comment on some of the known performance
| pitfalls of steampipe? I'm not super familiar with the Postgres
| foreign data wrappers API. I assume steampipe inherits many of
| its technical limitations from this API.
|
| Having done some work in this space, I'm aware that it's no
| small thing to compile a high-level SQL statement that
| describes some analytics task to be executed on a dataset into
| a low-level program that will efficiently perform that task.
| This is especially true in the context of big data and
| distributed analytics. Also true if you're trying to blend
| different data sources that themselves might not have efficient
| query engines.
|
| Would love to use this tool, but just curious about some of the
| details of its implementation.
| nathanwallace wrote:
| The premise of virtual tables (Postgres FDW) is to not store
| the data but instead query it from the original source. So,
| the primary performance challenge is API throttling.
| Steampipe optimizes these API calls with smart caching,
| massive parallelization through goroutines and calling the
| minimum set of APIs to hydrate the exact columns requested.
| For example, "select instance_id from aws_ec2_instance" will
| do 2 API calls get 100 instances in 2 pages, while "select
| instance_id, tags from aws_ec2_instance" would do 2 calls
| (instance paging) + 100 tag API calls (one per instance).
| We've also implemented support for qualifiers (i.e. where
| clauses) so API calls can be reduced even further - e.g. get
| 1 EC2 instance without pagination etc.
|
| The Postgres planner is not really optimized for foreign
| tables, but we can give it hints to indicate optimal paths.
| We've gradually ironed out many cases here in our FDW
| implementation particularly for joins etc.
|
| If you can tolerate my Aussie accent, I explain many of the
| details in this Yow Data talk -
| https://www.youtube.com/watch?v=2BNzIU5SFaw
| davesque wrote:
| Awesome, thanks so much for that! I guess then it sounds
| like, even if bandwidth were not an issue for a datasource,
| there would still be the physical limitations of the
| machine on which steampipe would do any later processing
| (such as joins between datasources). In other words,
| there's no sense in which steampipe distributes this work
| across multiple processes or machine. Is that correct?
|
| Sorry if a dumb question. I could be thinking entirely in
| terms of the wrong paradigm here since my work in this
| space was primarily concerned with distributed computing
| and big data.
| nathanwallace wrote:
| Steampipe computes the SQL part in a single Postgres
| instance (not distributed) - we've not found this to be a
| significant limit (so far). A difference in this case is
| that the query is really combining and organizing results
| that may been processed at the source. In 2022 most
| services are API first and offer great searching and
| filtering, so we can use that to do much of the query
| processing in many cases.
| davesque wrote:
| Great, thanks again! Yeah, a lot of the things steampipe
| is doing are right up my alley. Fun to learn about!
| Watching your talk now and enjoying it.
| aaronharnly wrote:
| Congratulations! This looks incredibly powerful and I'm excited
| to check it out.
|
| Although this is pitched primarily as a "live" query tool, it
| feels like we could get the most value out of combining this
| with our existing warehouse, ELT, and BI toolchain.
|
| Do you see people trying to do this, and any advice on
| approaches? For example, do folks perform joins in the BI
| layer? (Notably, Looker doesn't do that well.) Or do people
| just do bulk queries to basically treat your plugins as a
| Fivetran/Stitch competitor?
| nathanwallace wrote:
| While it's extensible, the primary use case for Steampipe so
| far is to query DevOps data (e.g. AWS, GitHub, Kubernetes,
| Terraform files, etc) - so often they are working within the
| Steampipe DB itself and combining with CSV data etc.
|
| But, because it's just Postgres, it can be integrated into
| many different data infrastructure strategies. Many users
| query Steampipe from standard BI tools, others use it to
| extract data into S3, it has also been connected with
| BigQuery - https://briansuk.medium.com/connecting-steampipe-
| with-google...
|
| As opposed to a lake or a warehouse, we think of it as a
| "Data Rainbow" - structured, ephemeral queries on live API
| data. Because it doesn't have to import the data it works
| uniquely well for small, wide data and joining large data
| sets (e.g. search, querying logs). I spoke about this in
| detail at the Yow Data conference -
| https://www.youtube.com/watch?v=2BNzIU5SFaw
| ryanmarsh wrote:
| Is this IaSQL reborn or a completely new product?
| nathanwallace wrote:
| Completely separate projects, with different goals. IaSQL is
| focused on infrastructure as code (like Terraform) in SQL.
| Steampipe is an "ops as code" platform to query, visualize and
| report on all your DevOps data (infra, SaaS, IaC, logs, etc).
| Steampipe's first version was released in Jan 2021.
| diceduckmonk wrote:
| We built a UI-centric cross-cloud search engine, too. <edit: link
| redacted as other commenter perceived this to be an ad>
|
| The first VC we pitched to felt like this was too niche of a
| problem. They wanted us to come back with a different, grander,
| pitch, so we'll see. In the current fundraising climate, it's
| been difficult to gather data points on whether we are on the
| right track (this post makes us feel like we're not crazy). We
| reached out to investors outside America, where we're based, and
| we're quickly realizing the VCs aren't as tech-savvy as we
| expected. After going thru the YC application process this cycle,
| we've have much greater appreciation for YC. They understand tech
| and startups, both. For one, we're instructed to define a toy /
| small problem, "don't talk, do" as opposed to pretty slides and
| ideas.
|
| Best of luck to Steampipe and whoever else is working on this
| problem.
| payphonefiend wrote:
| Please make your site mobile friendly.
| diceduckmonk wrote:
| Apologies. This was an experimental landing page haphazardly
| thrown together after the investors we're pitching to wanted
| a two-slide problem/solution. The rest of the app is actually
| made mobile-first.
| haggy wrote:
| Not a fan of what feels like an advertisement hijacking the OPs
| post. Most would probably prefer you just make a separate HN
| post.
| diceduckmonk wrote:
| I'm sorry it was perceived as such. I should have been more
| mindful. I genuinely intended for the substantive part of the
| post to be about our frustrations as programmers navigating
| the business world. It felt like the short introductory
| sentence was necessary to establish context (I'll remove the
| link, but leave post intact so other viewers aren't
| confused). Still off topic but, at the time, it seemed to be
| an on-brand discussion for HackerNews. What we were really
| phishing for was rapport from other engineers that we aren't
| crazy. I even feel vulnerable putting our project out here,
| at this stage, and the quick cost-benefit analysis in my head
| was this was a net-neutral.
|
| I doubt Steampipe knew we existed before this post. It seems
| like fate because I stumbled upon Steampipe just last week. I
| deliberated how to craft a not-awkward email along the lines
| of "Hey guys, we just realized we're tackling the same
| problem. Cheers." When I made the original post, I felt like
| it was the less awkward way of saying hi but now I'm not
| sure. At any rate, I hope you understand we did it in good
| will and not malice.
|
| Edit: I also hope "Best of luck to Steampipe and whoever else
| is working on this problem" doesn't get misconstrued as being
| a passive aggressive jab. Our team would honestly be thrilled
| if all the friction points of IaaS goes away, regardless of
| who solves it. As a reverse-ad, I'll say a few nice things
| about Steampipe that stood out to me: written in Go, supports
| querying HackerNews, SQL/Tables are a really good medium.
| blibble wrote:
| looks cool, think you could expose it as a rest API?
|
| (I kid, I kid)
| ashrodan wrote:
| I'd want to know how to have a hosted solution.
|
| Ideally build reports and alerts
| CSDude wrote:
| Steampipe has their own cloud at
| https://steampipe.io/docs/cloud/overview
|
| Alternative is Cloudquery but they don't offer cloud anymore.
| https://www.cloudquery.io/
|
| Plug, not based on Steampipe, but we also offer similar
| managed SQL on your cloud assets. https://www.resmo.com/
| nathanwallace wrote:
| You can run Steampipe CLI in "service mode", getting access
| to a Postgres endpoint and a dashboard web server [1]. Other
| users leverage docker / kubernetes to run it [2]. We also
| have a cloud hosted option with dashboard hosting, multi-user
| support, etc [3].
|
| 1 - https://steampipe.io/docs/managing/service 2 -
| https://steampipe.io/docs/managing/containers 3 -
| https://steampipe.io/docs/cloud/overview
| nathanwallace wrote:
| You kid, but we found that useful for scripts etc! So, in
| addition to a secure public Postgres endpoint [1], Steampipe
| Cloud workspaces have a rest API to run SQL queries and get
| results in JSON or other formats [2].
|
| 1 - https://steampipe.io/docs/cloud/integrations/overview 2 -
| https://steampipe.io/docs/cloud/develop/query-api
| forty wrote:
| I have discovered steampipe only recently. We work with a few AWS
| accounts and the steampipe dashboard and query are the best tool
| I have seen since terraform (of which I'm pretty sure it borrows
| the resource modules). Totally recommend it for making sense of
| your AWS accounts mess and for audits (soc2 style for example)
| Canada wrote:
| I really like how it just worked when I installed it and the AWS
| plugin. The second one I tried was tailscale... I get some
| different results to the same query which is weird and broken,
| and an error "device can only be managed by original tailnet
| (403) (SQLSTATE HV000)"
|
| Still, getting access to all this stuff with a SQL interface is
| very compelling so I'll try to get the plugins for all the stuff
| I have working properly.
| nathanwallace wrote:
| Glad AWS worked. Sorry you hit a bump with tailscale - we'd
| love feedback / contributions for that plugin [1], we just
| released it this month so may be edges for more complex
| configurations :-(
|
| 1 - https://github.com/turbot/steampipe-plugin-tailscale
| d0gsg0w00f wrote:
| Could someone use this query across many hundreds of accounts?
| nathanwallace wrote:
| Yes - our users run queries across hundreds of AWS accounts and
| regions. TBH, we never expected it to scale that far when we
| started, but it has. Steampipe parallelizes the API calls out
| and performs appropriate backoff in throttling to gather the
| results.
|
| Steampipe does the API calls needed for the column data
| requested [1]. So "select instance_id from aws_ec2_instance" is
| much faster than "select id, tags from aws_ec2_instance" (which
| does a tags API call per row). But, it will still work better
| than you expect, and Steampipe caches results in memory to make
| subsequent queries instant.
|
| 1 - https://steampipe.io/blog/selective-select
| xenophonf wrote:
| This cannot possibly require WSL2. What the heck? If I had an
| afternoon to waste RTFSing, I'd fix this. Some of us are stuck
| running WSL1 because Hyper-V breaks features we need in other
| hypervisors.
| nathanwallace wrote:
| Steampipe installs Postgres as an embedded database under the
| hood. We found that the initdb step kept failing on WSL1 and
| couldn't find a way to make it work (yet). Similar reports on
| StackOverflow below [1] [2]. Suggestions or PRs would be very
| welcome!
|
| 1 - https://stackoverflow.com/questions/72358475/postgresql-
| wsl-... 2 -
| https://stackoverflow.com/questions/69808468/postgres-is-stu...
| hedora wrote:
| On the one hand, I see your point; hacking it together probably
| isn't hard.
|
| On the other, they're a startup with limited resources, and
| time to market matters dearly for them. They probably don't
| even have product-market fit yet, and supporting Microsoft
| technologies that Microsoft has abandoned seems questionable.
|
| Anyway, I upvoted your comment, because it could be that their
| target market is running in weird enterprise-y environments and
| your problem is common. Feedback is helpful and hard fought at
| small companies.
|
| (No affiliation.)
| xenophonf wrote:
| The downvotes actually motivate me to fix this, except the
| docs have zero information on how to set up a development
| environment. I can tweak the CLI and `go build` it, but the
| plugin install process ends up overwriting my changes
| somehow. I've wasted about two hours learning Go and digging
| into the project's CI/CD pipeline, trying to figure out how
| everything works, and I haven't even begin debugging the
| actual problem I'm having. Ultimately, this goes on the pile
| of neat things I saw on HN but don't actually work IRL. It's
| too bad because this looks super cool.
| nathanwallace wrote:
| Sorry you didn't have a smooth start there :-(. Not sure
| how the behavior you are describing is happening. We'd be
| happy to help get things going if you want to open an issue
| [1] or discuss on Slack [2].
|
| 1 - https://github.com/turbot/steampipe/issues 2 -
| https://steampipe.io/community/join
| FBISurveillance wrote:
| I wonder if we can connect this with Hasura and query our cloud
| inventory via GraphQL. I'm only half-joking, actually.
| nathanwallace wrote:
| Because it's just PostgreSQL, Steampipe works out of the box
| with SQL clients, BI tools and other integrations [1]. Please
| let us know how you go!
|
| 1 - https://steampipe.io/docs/query/third-party
| ezekg wrote:
| How does this handle pagination and also how does it behave
| towards rate limiters? For example, my API (keygen.sh) has a page
| size limit of 100, and often times there can be hundreds or even
| thousands of pages for certain resources. How would Steampipe
| handle `select count(*) from keygen_licenses` while obeying the
| rate limiter? What if a heavy query is run twice?
| judell wrote:
| Pagination may be handled by the API's golang SDK, if not
| that's a responsibility of the plugin author.
|
| The plugin SDK provides a default retry/backoff mechanism, and
| the plugin author can enhance that.
|
| If a heavy query runs twice it'll load from cache the second
| time, if within the (user-configurable) cache TTL.
| chrsstrm wrote:
| So this looks interesting and I wanted to test it on a use-case I
| have currently. I want to query both a csv and a zipped csv
| living online but it doesn't appear that is possible. The csv
| plugin implies you can only work with local files, nothing over
| http(s). This kind of seems like a big oversight for a product
| like this, no?
| nathanwallace wrote:
| We have work in progress to support go-getter [1] allowing use
| of files from many different sources. There is a WIP branch [2]
| building that into the Steampipe Plugin SDK, so it can be
| immediately enabled for all our file based plugins (CSV,
| Terraform, etc) [3]. Watch this space and hopefully you can
| help us test!
|
| 1 - https://github.com/hashicorp/go-getter 2 -
| https://github.com/turbot/steampipe-plugin-sdk/compare/main....
| 3 - https://github.com/turbot/steampipe-plugin-
| terraform/tree/ad...
| Canada wrote:
| This looks fantastic!
| evilotto wrote:
| This sounds like what YQL should have grown into.
|
| https://en.wikipedia.org/wiki/Yahoo!_Query_Language
| xwowsersx wrote:
| I'm in the process of determining which RDS instance type to
| switch to and I've had to have a browser open with a million tabs
| copying and pasting information and pricing data from AWS. Could
| I use steam pipe to help me with this?
| nathanwallace wrote:
| I feel your pain - but Steampipe won't solve this (yet). We
| have AWS Thrifty [1] to scan your env for cost savings and
| other tables to query your spend [2], but haven't tackled the
| 1.8GB of pricing JSON data into a table yet.
|
| 1 - https://hub.steampipe.io/mods/turbot/aws_thrifty 2 -
| https://hub.steampipe.io/plugins/turbot/aws/tables?filter=co...
| xwowsersx wrote:
| Haha thanks for the reply. Insane how much pricing data there
| is, right? The least AWS can do is offer an export on the
| particular pricing pages!
| debarshri wrote:
| There is another contender in infrasql realm called
| cloudquery[1]. Steampipe has been around for very long. Mind you
| the steampipe has an AGPL license.
|
| [1]https://www.cloudquery.io
| mikessoft_gmail wrote:
| [deleted]
| pstuart wrote:
| Seems like the next step past https://osquery.io/
| nathanwallace wrote:
| osquery is awesome. It was definitely part of our inspiration!
| yevpats wrote:
| Good to see steampipe here :)
|
| Im the author of CloudQuery (https://github.com/cloudquery) and I
| believe we started at the same time though took some different
| approaches.
|
| PG FDW - is def an exciting PostgreSQL tech and is great for
| things like on-demand querying and filtering.
|
| We took a more standard ELT approach where we have a protocol
| between source plugins and destination plugins so we can support
| multiple databases, data-lakes, storage layers, kinda similar to
| Fivetran and airbyte but with focus on infrastructure tools.
|
| Also, as a more standard ELT approach our policies use standard
| SQL directly and dashboards re-use existing BI tools rather then
| implementing dashboarding in-house.
|
| I can see the power of all-in-one and FDW and def will be
| following this project!
| alexchantavy wrote:
| /me waves
|
| Immediately thought of CloudQuery when I saw this and sure
| enough, here you are :)
| ezconnect wrote:
| Yahoo was really ahead of its time, so sad it was mismanaged.
| jedberg wrote:
| Steampipe killed my startup. I couldn't be happier about it.
|
| I started a company six years ago to do exactly this -- make a
| SQL interface for the AWS API. Steampipe out-executed us and made
| a way better product, in part because their technology choices
| were much smarter than ours. I shut down the startup as soon as I
| saw steampipe.
|
| I wish them all the best, it's a great product!
| patrec wrote:
| I'm curious: what were the bad technological choices you made
| and the good ones they made?
| jedberg wrote:
| They use Postgres foreign data wrappers and caching with live
| queries against the Api. We were doing periodic scanning and
| then storing it in elastic. Then trying to convert the SQL to
| elastic queries.
|
| The elastic part let us do free text queries but converting
| from SQL was challenging (this was before elastic had it
| built in). And also everything was only as fresh as our last
| scan.
| encryptluks2 wrote:
| Can't speak for op but one common mistake I see is having a
| bunch of JS developers try to make a CLI and being surprised
| it is slow.
| TruthWillHurt wrote:
| wow.
| hugofromboss wrote:
| Shameless plug for those of you who want to query your assets
| with Osuqery and cloud assets with Steampipe. Check out
| https://www.github.com/zercurity/zercurity.
| Aaronstotle wrote:
| I found Steampipe from a HN comment about a year and some change
| ago, in my last role I had to routinely audit cloud
| configurations and it was amazing to have a SQL query to check
| all S3 buckets that were encrypted, security groups, etc.
| RicoElectrico wrote:
| https://developer.valvesoftware.com/wiki/SteamPipe
|
| No way they did not know about that name collision as SteamPipe
| has been around since 2013-ish.
| robertlagrant wrote:
| I'm about to blow your mind, then.
| https://www.sincocustoms.com/shop/product/209312/steampipe--...
| benbristow wrote:
| I'm about to blow your mind, then
|
| https://en.wikipedia.org/wiki/Stephenson%27s_Rocket
| masukomi wrote:
| I'm about to blow your mind, then
|
| https://en.wikipedia.org/wiki/Stirling_engine
| breck wrote:
| Try dropping a few thousand years
|
| https://en.wikipedia.org/wiki/Aeolipile
| jcims wrote:
| I've been around since 1973 and didn't know.
| [deleted]
| isamuel wrote:
| Name collision researcher here!
|
| This made me curious and I started doing research on other
| problematic name collisions. Did you know that Valve's game
| distribution service has a name that's already widely used in
| the scientific community for the gaseous form of heated water?
| It looks like steam has been around for several centuries at
| least so Valve really ought to have known better.
|
| Anyway, if you're interested, I'm putting together a long
| investigation on whether Microsoft's flagship operating system
| is in collision with a very common term used in the
| construction and glass industry. Hope to have it done soon
| TazeTSchnitzel wrote:
| Both things called "steampipe" are cloud-based technology
| services. They even both involve a CDN. They're close enough
| to be confusing. The name of a gas isn't. This is silly.
| ashayh wrote:
| Highly recommend Steampipe. No other open source tool was able to
| deal with the large number of Route53 records I had to deal with.
| chousuke wrote:
| Steampipe is unbeliveably powerful.
|
| Writing custom scripts for even the simplest of queries comes
| nowhere near the convenience of using PostgreSQL to get what you
| want. If you wanted to find and report all instances in an AWS
| account across all regions, with steampipe it's just:
| SELECT * FROM aws_ec2_instance;
|
| Even the simplest implementation with a Python script won't be
| nearly as nice, and once you start combining tables and doing
| more complicated queries, there's just no competition.
| Hikikomori wrote:
| Tried steampipe out to replace my simple python script. It
| wasn't able to handle our 1k+ accounts and multiple regions,
| initialization took over 1 hour before I was even able to do a
| query.
|
| Meanwhile my python script can run any API calls in all our
| accounts and regions and finish in a few minutes. Maybe 100
| lines of python using boto3 and multiprocessing that outputs to
| json or yaml.
| xwowsersx wrote:
| 1K accounts? What're you doing over there exactly? I'm
| intrigued!
| nathanwallace wrote:
| Python and boto3 are very powerful in many cases for sure -
| glad they work well for you. No doubt that multi-region for
| 1k+ accounts is a large surface area, so we'd love to learn
| from your experience! Performance is an ongoing focus area
| for us, including radical improvements in memory usage and
| startup time in our most recent version -
| https://steampipe.io/blog/release-0-16-0
| phillu wrote:
| I don't agree that this simple query wouldn't be as nice with
| boto3. But combining different services is definitely a nice
| feature, if you have the use case for it.
| chousuke wrote:
| the fun part of that query is that it'll search across _all_
| regions that you have configured steampipe to use. If you did
| that in Python, you at the very least have to manually
| iterate over each region, handle pagination yourself and
| combine the results.
|
| It gets much more convenient when you want to ask more
| complicated questions like "What is the most used instance
| type per region" or "How much EBS capacity do instances use
| across all regions, grouped by environment tag and disk
| type?"
| michaelsalim wrote:
| This makes me want to create a Grafana dashboard for everything.
| Instead of relying on custom plugins (or pulling & storing the
| data yourself), now it's just another postgres data source. I
| hope this becomes mainstream!
| nathanwallace wrote:
| 100% - works great with Grafana, Metabase, Tableau, etc because
| it's just Postgres :-)
|
| If you are into the idea of "dashboards as code", you may also
| enjoy our HCL+SQL approach in mods [1] - runs in your cli, open
| source, composable, great DX [2]
|
| [1] - https://hub.steampipe.io/mods [2] -
| https://steampipe.io/blog/dashboards-as-code
| haolez wrote:
| Looks fun. I assume it's just a visibility tool and there is no
| intention of offering INSERT, UPDATE and DELETE, right?
| nathanwallace wrote:
| Yes, we've focused on SELECT for read only - which provides a
| lot of value as a "universal API" to live query these many
| sources. Steampipe is built on Postgres Foreign Data Wrappers
| [1] under the hood, so could be extended to full CRUD in time.
| Our current belief is that it's better to follow CQRS [2] in
| this case though - using SQL for the queries and more native
| tools for the commands.
|
| 1 - https://www.postgresql.org/docs/current/ddl-foreign-
| data.htm... 2 - https://martinfowler.com/bliki/CQRS.html
___________________________________________________________________
(page generated 2022-09-30 23:00 UTC)