[HN Gopher] Apache Superset
___________________________________________________________________
Apache Superset
Author : tosh
Score : 665 points
Date : 2024-02-26 14:36 UTC (1 days ago)
(HTM) web link (superset.apache.org)
(TXT) w3m dump (superset.apache.org)
| marcinzm wrote:
| > Superset is fast, lightweight, intuitive, and loaded with
| options that make it easy for users of all skill sets to explore
| and visualize their data, from simple line charts to highly
| detailed geospatial charts.
|
| I tried Superset a few years back, and maybe it's changed since
| then, but intuitive is about the last thing I'd use to describe
| it. Things which I could figure out in a few minutes on any other
| BI tool literally took me hours of searching. It didn't help that
| they decided to rename core concepts at some point so half the
| online documentation made no sense anymore. Others at those
| companies who tried it at the time said similar things.
| fayten wrote:
| I also found Superset unintuitive to use and set up as well. I
| settled on standing up Metabase because it was so simple to get
| started with trying it since it can be launched as a single
| jar. The business users loved it and so did I and
| administration with a Postgres backend instead of the internal
| h2 database was a breeze.
| datadrivenangel wrote:
| Metabase is great. It truly is a BI tool. Superset is more of
| a visualization platform, which works great if you have
| engineers building reports. Less good if you expect more
| junior analysts to be super productive.
| sumoboy wrote:
| We ran into the same exact issue with Superset not being
| intuitive, just for a different audience that is more
| technical. Also went with Metabase which is good, easy to
| use, lacks a few chart types but overall the past year
| has seen quite a few changes and bug fixes consistently
| happening.
| fuzztester wrote:
| I just took a look at Metabase.
|
| https://www.metabase.com/demo
|
| Demo is nice.
| c0brac0bra wrote:
| Yep, we've really liked Metabase for embedding in our
| platform.
| xtracto wrote:
| I had the same experience. Featurewise Superset looked
| better, but after wasting a couple of hours trying to install
| it, I just gave up.
|
| Instead I installed Metabase in 5 minutes tops: spin up an EC2
| instance, wget, and java -jar. I've never looked back.
|
| The only thing that turns me off is that it's implemented in
| an obscure language. At one time I wanted to add some custom
| postprocessing to an api (given an sql query, get some
| python/pandas postproc command from a sql comment and execute
| it in the returned table), but the used language is just not
| for me (some lisp dialect)
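The post-processing idea described above (read a directive from a SQL comment, then apply it to the returned table) can be sketched roughly as follows. This is only an illustration, not Metabase's API: the `-- postproc:` comment convention, the `top` step, and `run_with_postproc` are all hypothetical names, and plain lists of dicts stand in for pandas to keep the sketch dependency-free. Steps are looked up in a whitelist rather than passed to `eval`.

```python
import re
import sqlite3

# Hypothetical convention: a SQL comment such as "-- postproc: top(2, total)"
# names a post-processing step to run on the rows the query returns.
POSTPROC_RE = re.compile(r"--\s*postproc:\s*(\w+)\((.*?)\)")


def top(rows, n, column):
    """Return the n rows with the largest value in the given column."""
    return sorted(rows, key=lambda r: r[column], reverse=True)[: int(n)]


# Whitelist of allowed steps; nothing from the comment is ever eval'd.
STEPS = {"top": top}


def run_with_postproc(conn, sql):
    """Run the query, then apply every postproc directive found in its comments."""
    cur = conn.execute(sql)
    names = [d[0] for d in cur.description]
    rows = [dict(zip(names, r)) for r in cur.fetchall()]
    for step, raw_args in POSTPROC_RE.findall(sql):
        args = [a.strip() for a in raw_args.split(",") if a.strip()]
        rows = STEPS[step](rows, *args)
    return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, total INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10), ("south", 30), ("east", 20)],
)

query = """
-- postproc: top(2, total)
SELECT region, total FROM sales
"""
result = run_with_postproc(conn, query)
print(result)
```

A real version would presumably swap the dict rows for a pandas DataFrame and hook in wherever the BI tool returns query results.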
| NoThisIsMe wrote:
| Clojure is not particularly obscure
| hodgesrm wrote:
| My experience with Superset was the opposite. It's easy to
| install using containers. You can have it up and running and
| connected to ClickHouse in a few minutes. I also found the
| internal design pretty intuitive--the SQL query lab is much
| easier than Grafana's editor.
|
| I like Grafana too, but there's basically no isolation
| between your query and the SQL database at least in the
| Altinity Grafana plugin for ClickHouse which is the main one
| I use.
| bushbaba wrote:
| It's more intuitive than the open source alternatives but is
| not as intuitive as tableau and others.
| marcinzm wrote:
| Metabase is more intuitive. Also, being unintuitive isn't
| great but not the worst thing. A project not even realizing
| that (and thinking the exact opposite) is much much worse.
| Unintuitive can be fixed with PRs over time. Delusional
| project leadership cannot.
| codeduck wrote:
| I've just been playing with superset. I'd have to agree. Things
| which are easy in SQL are... disturbingly hard or nonobvious in
| superset.
|
| And the documentation is sparse at best.
| staticautomatic wrote:
| Let's be honest, intuitive is the last word we'd use to
| describe most Apache projects.
| pseudosavant wrote:
| They are doing pretty well in that it is even clear what the
| project is about. Good luck figuring that out
| within 30 seconds of hitting the average Apache project
| homepage.
| slyall wrote:
| There seem to be dozens of Apache "Big Data" projects that
| all look kinda the same unless you are a Big Data person.
| dzamo_norton wrote:
| Even if you are a data person. The ASF doesn't mind
| overlap between projects [1], it spreads its bets and
| lets the market choose the winners.
|
| 1. https://www.apache.org/foundation/how-it-works/#incubator
| atombender wrote:
| Are there better alternatives?
| FridgeSeal wrote:
| It wasn't fast either when I used it.
|
| What it was though, was riddled with dozens of Python runtime
| errors and innumerable glitches.
|
| Metabase is where it's at.
| mritchie712 wrote:
| Had a similar experience with Superset. A few others have
| mentioned Metabase and I agree it's better, but if you're
| looking for a different approach to data, check out Definite
| (https://www.definite.app/). It's a "data stack in a box". A
| few things we're doing differently:
|
| 1. Built-in data warehouse - We spin up a duckdb database for
| you to load data to
|
| 2. 500+ connectors - You don't need to buy a separate ETL and
| you can pull in all your data (e.g. Postgres, Stripe, HubSpot,
| Zendesk, etc.) automatically
|
| 3. Semantic layer - Define dimensions, measures, and joins in
| one place. We have pre-built models for all the sources we
| support (e.g. the Stripe model already has measures for MRR,
| churn, etc.)
|
| 4. Simple BI - Build a table with the data you want and
| generate visuals off that table
|
| I'm mike@definite.app if you have any questions.
| cogman10 wrote:
| This looks like grafana, right? Why would I use this instead of
| grafana?
| pachico wrote:
| The fundamental difference is that Grafana isn't great at cross
| referencing data in different data sources. (I love Grafana and
| I pay for the Cloud version.)
| peterleiser wrote:
| I found that running TrinoDB in a docker container and adding
| the trino plugin to grafana was very straightforward. TrinoDB
| feels magical sometimes, except that the SQL syntax they use
| seemed awkward IIRC. Also, there are inexplicable performance
| problems with certain queries that require trying subtly
| different SQL queries until it snaps out of it.
| prpl wrote:
| You can't trivially plug grafana in front of any SQL database,
| and grafana is more about graphing/plotting (usually time
| series).
| xyzzy_plugh wrote:
| You can actually plug grafana in front of any SQL database,
| but I'm not sure it's a good idea.
| samuell wrote:
| Much more focused on interactive slicing and dicing of data,
| rather than mostly following a few pre-defined time-series, as
| is the focus of Grafana.
|
| As such, closer to an open source replacement for PowerBI.
| bfung wrote:
| grafana is built more for operational and timeseries data, but
| not so optimal for complex analytical queries. Ex: up-to-second
| data on cpu load on a host.
|
| superset is the flip side of grafana; not good for up-to-second
| updates, but good for complex queries. Also, non-time series
| stuff. Ex: Which customer groups bought which products for all
| time? <-- that type of BI stuff.
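To make the "which customer groups bought which products" example above concrete, here is a minimal sketch of that kind of non-time-series BI query. The `customers` and `orders` tables and their data are invented for illustration, and an in-memory SQLite database stands in for a real warehouse.

```python
import sqlite3

# Toy schema, assumed purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, segment TEXT);
CREATE TABLE orders (customer_id INTEGER, product TEXT, qty INTEGER);
INSERT INTO customers VALUES (1, 'enterprise'), (2, 'hobbyist'), (3, 'enterprise');
INSERT INTO orders VALUES (1, 'widget', 5), (2, 'widget', 1),
                          (3, 'gadget', 2), (1, 'gadget', 1);
""")

# An all-time aggregate across a join: no time axis involved at all.
rows = conn.execute("""
SELECT c.segment, o.product, SUM(o.qty) AS units
FROM orders o
JOIN customers c ON c.id = o.customer_id
GROUP BY c.segment, o.product
ORDER BY c.segment, o.product
""").fetchall()

for segment, product, units in rows:
    print(segment, product, units)
```

The result is a small categorical table rather than a stream of timestamped points, which is the distinction the comment draws between the two tools.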
| skadamat wrote:
| I love Grafana but Grafana doesn't really support non-time-
| series visualization that well.
| sgt wrote:
| Why is that, though? I'd think that there'd be some
| plugins/extensions for Grafana that could do this. Grafana
| could then become the next PowerBI/Tableau/Superset killer
| eventually.
| skadamat wrote:
| Different audience / use case. I've noticed that products
| often lean towards speaking to app builders (full stack
| swe's) or data builders (data analysts / scientists / data
| engineers). They require different mental models I feel.
|
| Grafana I sense is culturally focused on observability
| visualization (aka needs of full stack devs). Culture is
| very hard to change!
| jldugger wrote:
| They're both washboarding apps, and while I'm sure they each
| have panel types the other doesn't yet support, I don't think
| that's intrinsic. The differentiation as I see it, is that
| Superset is designed to craft SQL queries and visualize the
| results. The query builder is probably where this shows the
| most.
|
| To make it more concrete -- coworkers tell me Grafana doesn't
| work so well with Apache Druid, while Superset supports it
| quite well.
| jldugger wrote:
| *dashboarding, yikes
| totalhack wrote:
| I thought this was some jargon I didn't know haha.
| rongenre wrote:
| We use this at my ginormous employer in order to give devs
| limited access to production data.
| rusackas wrote:
| Maybe you can't say who, but I'm sure curious. Add yourself to
| this page if you can:
| https://github.com/apache/superset/blob/master/RESOURCES/INT...
| adlpz wrote:
| Has anyone tried both this and Metabase? I've used Metabase in a
| few projects and I find it very nice. This seems more powerful,
| perhaps?
|
| Is it worth it for BI on small datasets?
| CalRobert wrote:
| Yes, I am at a company using Metabase, but I have a decent
| amount of experience with Superset (albeit from many years
| ago).
|
| The reason we chose Metabase was that it had table joins, while
| Superset doesn't (unless it has added them since I used it). It
| also looks a bit sleeker. But I strongly prefer Superset; I
| found that with Metabase I had to turn a lot of things off to
| make it usable (Let me see "the_table" not "The Table"!), I was
| constantly annoyed at the opacity around models vs "questions",
| etc. and every time I wanted to change a question Metabase
| insisted on creating a new one instead. The real issue here was
| when we wanted to swap out the data source for a lot of
| questions but there was no clean way to do so without MB just
| creating new questions.
|
| Also, Metabase doesn't have serialization unless you pay them
| AND you self-host, (if I'm self hosting then what exactly am I
| paying for?) and that's pretty annoying.
| https://www.metabase.com/docs/latest/installation-and-
| operat....
|
| But it does let you join tables. Sometimes that's enough to
| make MB worth dealing with.
| noduerme wrote:
| The "model" vs "question" thing is really annoying as there's
| no real difference from the user's perspective, and it's easy
| to accidentally convert a model back to a question without
| noticing when you publish something. You notice when you try
| to drill into the chart. There's a lot of annoying manual
| labor in metabase, e.g. I want to filter something into 10
| different charts and I need to duplicate it 10 times and
| change a filter on each one. Still yeah joins are nice. A
| non-bugged aggregate count/sum as a window function would be
| nicer.
| itsoktocry wrote:
| > _I was constantly annoyed at the opacity around models vs
| "questions"_
|
| Yeah, somewhere along the line Metabase decided to get
| opinionated on "self-serve". I imagine it works well for some
| teams and companies, but for the tech-oriented, it's
| annoying.
|
| I prefer my BI tools to be platforms that make for easy
| charting and cross-filters, while I build and control the
| models behind the scenes with a tool like dbt.
| adlpz wrote:
| Thanks! Very detailed answer.
|
| I've found the weird "make it easy" mindset a bit annoying
| with Metabase too. The whole questions, nice table names...
|
| I'll give Superset a try in my next project I think.
| rusackas wrote:
| Superset lets you join tables within the same database. If
| you want to do cross-DB joins, we have a new (beta) in-memory
| meta-DB that lets you do this, but we generally see and
| recommend people using things like Trino for this.
| Cilvic wrote:
| Is that new? Last time I checked this was the major
| downside from superset
| CalRobert wrote:
| Nice! When was that added?
| skadamat wrote:
| Metabase is a bit more user-friendly to be honest than
| Superset. Superset has a WAY more liberal license, so it's
| ideal for people who want to customize Superset and build data
| apps.
| sokols wrote:
| Metabase is great, I use it with an Oracle Database.
| code_biologist wrote:
| Reposting from a comment of mine about 60 days ago:
|
| I recently ran a little shootout between Superset, Metabase,
| and Lightdash -- all open source with hosted options. All have
| nontrivial weaknesses but I ended up picking Lightdash.
| Superset is the best of them at _data visualization_ but I
| honestly found it almost useless for self-serve _BI_ by
| business users if you have existing star schema. This issue on
| how to do joins in Superset (with stalebot making a mess XD) is
| everything difficult about Superset for BI in a nutshell.
| https://github.com/apache/superset/issues/8645
|
| Metabase is pretty great and it's definitely the right choice
| for a startup looking to get low cost BI set up. It still has a
| very table centric view, but feels built for _BI_ rather than
| visualization alone.
|
| Lightdash has significant warts (YAML, pivoting being done in
| the frontend, no symmetric aggregates) but the Looker
| inspiration is obvious and it makes it easy to present _groups
| of tables_ to business users ready to rock. I liked Looker
| before Google acquired it. My business users are comfortable
| with star and snowflake schemas (not that they know those
| words) and it was easy to drop Lightdash on top of our existing
| data warehouse.
| ldjkfkdsjnv wrote:
| I remember using Superset in 2017 or so, was forced to by a
| manager that would not pay for off the shelf software. I also did
| a few open source contributions to fix some bugs, it was a
| disaster. A huge rats nest of python. Might have changed in the
| last few years, am surprised it's still active
| jimvin wrote:
| It's definitely come a long way since 2017! It's improved
| markedly in terms of functionality and performance. It looks
| much prettier now as well.
| Maro wrote:
| I love Superset.
|
| I've been running it in production since 2017, at two jobs, the
| current one a big corporation.
|
| Best general-purpose, database-backed dashboarding system out
| there. I would never pay for Tableau or PowerBI.
|
| Same for Airflow.
| atlas_hugged wrote:
| Same for Airflow? I'm not sure I understand what you mean.
| luccasiau wrote:
| They were both made by Airbnb and then open-sourced, which is
| the similarity I assume they meant
| jerrygenser wrote:
| They were also more specifically authored by the same
| individual!
| rusackas wrote:
| Maxime, the original author of Airflow/Superset, is also
| the CEO of Preset (where I work), so he/we are still
| working on Superset every day :)
| edanm wrote:
| Oh that's awesome! Must be awesome to work on that. We've
| been using Airflow in production for 6 years at this
| point with various clients and it's been great, and we're
| trying to sell people on Superset now as well.
| dang wrote:
| Related. Others?
|
| _Open source Business intelligence platform made with Python_ -
| https://news.ycombinator.com/item?id=29368664 - Nov 2021 (49
| comments)
|
| _Apache Superset 1.1_ -
| https://news.ycombinator.com/item?id=27439939 - June 2021 (28
| comments)
|
| _The Apache Software Foundation Announces Apache Superset as a
| Top-Level Project_ -
| https://news.ycombinator.com/item?id=25905277 - Jan 2021 (1
| comment)
|
| _Apache Superset is an enterprise-ready business intelligence
| web application_ - https://news.ycombinator.com/item?id=21133931
| - Oct 2019 (7 comments)
| throwaw12 wrote:
| Does anyone know how it compares to Looker?
| skadamat wrote:
| No built-in thick semantic layer, compared to Looker.
|
| I wrote about Superset's semantic layer here:
| https://preset.io/blog/understanding-superset-semantic-layer...
|
| One popular option is to use dbt or Cube for the semantic layer
| and pair with Superset:
| https://preset.io/blog/announcing-presets-ui-integration-wit... and
| https://preset.io/blog/open-source-looker-cube-superset/
| totalhack wrote:
| The lack of a semantic layer and join limitations are what
| made me pass on superset, but that was a couple years ago so
| maybe those features have been added.
|
| I built my own semantic layer instead. I use this in
| production in my company but obviously use at your own risk
| as it's a one-man show.
|
| https://github.com/totalhack/zillion
| anentropic wrote:
| This looks interesting for me, but I'd really like more
| detail about the architecture and deployment in the docs.
|
| There is this:
|
| > A final SQL query against the combined data from the
| DataSource Layer
|
| > The Combined Layer is just another SQL database (in-
| memory SQLite by default) that is used to tie the
| datasource data together and apply a few additional
| features such as rollups, row filters, row limits, sorting,
| pivots, and technical computations.
|
| But it leaves me with questions - how/when does this get
| populated? What other options are there besides in-memory
| SQLite? (I presume that's just a convenience for
| development and would use something else in production?)
|
| Or is it just what Superset calls a 'metastore' i.e. data
| about the data, and the queries are run against the data
| source layer?
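The general pattern the quoted docs describe (query each datasource separately, load the partial results into one in-memory SQLite database, then run a final SQL query over the combined data) can be sketched as below. This is not zillion's or Superset's actual implementation and does not answer when or how either tool populates its layer; the `billing` and `support` sources, their tables, and the data are assumptions made up for the example.

```python
import sqlite3


def make_source(ddl, insert, rows):
    """Create a stand-in datasource; in practice these would be separate databases."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert, rows)
    return conn


billing = make_source(
    "CREATE TABLE invoices (customer TEXT, amount REAL)",
    "INSERT INTO invoices VALUES (?, ?)",
    [("acme", 100.0), ("acme", 50.0), ("globex", 75.0)],
)
support = make_source(
    "CREATE TABLE tickets (customer TEXT, opened INTEGER)",
    "INSERT INTO tickets VALUES (?, ?)",
    [("acme", 3), ("globex", 1)],
)

# Step 1: each datasource answers its own aggregate query.
revenue = billing.execute(
    "SELECT customer, SUM(amount) FROM invoices GROUP BY customer"
).fetchall()
tickets = support.execute(
    "SELECT customer, SUM(opened) FROM tickets GROUP BY customer"
).fetchall()

# Step 2: load the partial results into a fresh in-memory "combined" database.
combined = sqlite3.connect(":memory:")
combined.execute("CREATE TABLE revenue (customer TEXT, revenue REAL)")
combined.execute("CREATE TABLE tickets (customer TEXT, tickets INTEGER)")
combined.executemany("INSERT INTO revenue VALUES (?, ?)", revenue)
combined.executemany("INSERT INTO tickets VALUES (?, ?)", tickets)

# Step 3: one final SQL query ties the datasource results together.
report = combined.execute("""
SELECT r.customer, r.revenue, t.tickets
FROM revenue r
JOIN tickets t ON t.customer = r.customer
ORDER BY r.customer
""").fetchall()
print(report)
```

In this sketch the combined database is rebuilt per report and only ever holds already-aggregated rows, which is one reason an in-memory store can be enough for this pattern.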
| anentropic wrote:
| Or from a comment elsewhere in this thread about
| Superset:
|
| > Superset lets you join tables within the same database.
| If you want to do cross-DB joins, we have a new (beta)
| in-memory meta-DB that lets you do this
|
| ...is it this?
| code_biologist wrote:
| I posted a comparison to Looker in a different thread:
| https://news.ycombinator.com/item?id=39523454
| noduerme wrote:
| So it's irritating to me that this is ranking #1 on HN (why is
| it, btw?) I just pulled the trigger on a large data gathering
| project using Metabase, and feel a bit hampered by the
| limitations in terms of charts and plugins... but I considered
| Superset first, and after a lot of thought I decided that almost
| everything I've ever worked with that was run by the Apache
| foundation turned out to be semi-abandoned disasterware over
| time. In fact I wasn't even sure if Superset was still an active
| project or if it just looked like one, in the way e.g. no one
| bothered to pull the OpenOffice website offline.
|
| So now that I picked Metabase, Superset is topping HN for no
| apparent reason. Why?
| hasty_pudding wrote:
| > Everything I've ever worked with that was run by the Apache
| foundation turned out to be semi-abandoned disasterware over
| time.
|
| Amen brother.
| swalsh wrote:
| Yeah, I'm in a similar thought process. I've been burned
| multiple times by Apache, will not touch ever again.
| lars_francke wrote:
| > almost everything I've ever worked with that was run by the
| Apache foundation turned out to be semi-abandoned disasterware
| over time
|
| Can you name a few examples?
| bombcar wrote:
| Here's the list:
| https://projects.apache.org/projects.html?name
|
| OpenOffice is probably the most famous (it still has the
| name, but it is dead, LibreOffice is the real "active" fork).
|
| And the things in the "Attic" are officially dead -
| https://projects.apache.org/committee.html?attic and many
| more projects should be there.
| lars_francke wrote:
| I think it's a great feature to have explicit lifecycle for
| open source projects.
|
| Lots of other projects just die silently and/or you are
| unsure of the status.
|
| Here you at least have a chance to revive them if you like
| as there is always an overarching organisation.
| bombcar wrote:
| The problem really is that some Apache projects are
| actually alive (Apache itself, apparently Superset,
| Groovy, etc) and some _appear_ alive at first glance.
|
| More things should move into the Attic, like OpenOffice.
| smaudet wrote:
| Ivy, Netbeans, Open Office, Shiro, Solr all jump at me off
| this list:
|
| https://projects.apache.org/projects.html?name
|
| These are all projects that once were (more) relevant,
| however seem to have become rather niche (Gradle,
| Jetbrains/VSCode, GoogleDocs/Libreoffice e.g. for the first
| three are the dominant competitors).
|
| Most of these projects (like the massive commons listings)
| are either used by some Java library somewhere (meaning their
| success/relevance is tied to the usage of Java), or are
| obscure enough that they are no longer used widely and so
| suffer from lack of interest.
|
| There are gems in this list, to be sure, but if you just run
| into half-maintained projects all the time you're not likely
| to associate good things with the Apache name?
| noduerme wrote:
| Well, OpenOffice as I said. Cordova is/was a hot mess (with
| some nice pioneering features, just really not well
| maintained imo and felt like quicksand to build even a small
| app on) Then the sort of long slow death of Flex (now
| Royale?) Apache seems like where software no one loves
| anymore goes to die.
| rpeden wrote:
| I suppose it depends on projects you're using. For many
| developers their primary exposure to the Apache Foundation
| is through projects like Maven and Kafka, and those
| certainly don't feel dead.
| smallmancontrov wrote:
| Because we (the FBI Surveillance Van) saw that you picked
| Metabase, called our shady French-accented overlord, and he
| told us to dump it.
| noduerme wrote:
| I knew it!!
| CoastalCoder wrote:
| I thought your _outrageous_ French accent just meant you're
| going to taunt him a second time.
| esafak wrote:
| He's working on that accent.
| https://www.youtube.com/watch?v=Z6oeAdemFZw
| smaudet wrote:
| "semi-abandoned disasterware"
|
| Hmm. I suppose all open source looks that way if it doesn't get
| regular funding/attention.
|
| Apache does house a lot of abandonware. They had some relevance
| as recently as 6-7 years ago but they've been largely replaced
| by nginx I think. That being said, I view them like the local
| soup-kitchen - important to have and maintain, but not where I
| want to go for a 5-star meal.
| malfist wrote:
| The Apache foundation is way larger than just the server
| smaudet wrote:
| Yes, I agree. However a lot of their forward facing
| projects seem to be effective abandon-ware (few people
| interested in contributing, competing more popular
| solutions based on forks, or just no longer relevant).
|
| These projects don't give the apache foundation an
| appearance of importance or relevance, rather they make it
| look rather rundown.
| DaiPlusPlus wrote:
| That's how open-source abandonware is supposed to work
| though: the idea is that whenever a (for-profit) company
| produces something that it can't afford to run anymore
| but also can't afford to shut-down and damage their
| customer relationships, then they'll open-source the
| project and give it to an open-source foundation for
| stewardship and repo hosting. Yes, it's where software
| goes-to-die-a-long-death, but it also gives some people
| hope, and the possibility of giving it a new life in
| future. Currently, the Apache Foundation is the go-to
| place for that, and it benefits everyone considering the
| alternatives are worse.
|
| Obviously the main "alternative" is for the original
| company to simply shut down the product/service, which
| can do irreparable harm to a company when they have high-
| profile customers who are utterly dependent on a service.
|
| Another alternative is to use an open-source foundation
| that's directly managed by the original company, which is
| what Microsoft did with its DotNet Foundation (
| https://dotnetfoundation.org/ ) - and while Microsoft's
| legal team ensures the foundation is "legally"
| independent, in practice we know all the significant
| shots are being called from within Microsoft-proper; but
| it does give us some modest reassurances that .NET won't
| suddenly return to being closed-source overnight.
|
| Another alternative is to not open-source it and to
| instead sell it off to another company that can maintain
| it while still being profitable - this is what Adobe did
| with Flash: they sold it all off to Samsung because their
| Harman division wanted to continue using Flash for
| embedded/automotive UX work. This approach can work, but
| doesn't benefit the wider ecosystem the way that open-
| sourcing does - and something something shareholder value
| and return-on-investment by selling rather than writing-
| it-off...
|
| What companies won't do is let any of their engs that are
| passionate about a project split-off from the company to
| run and maintain it, le sigh.
| KptMarchewa wrote:
| I would consider Airflow, Spark and Flink to be their
| forward facing projects, and they are all very actively
| developed.
| jakjak123 wrote:
| The Apache Foundation also takes on projects that are
| literally abandoned. It acts as an umbrella that takes
| over hosting a project for commercial actors that can no
| longer develop it, but want to at least give existing
| users an open source (Apache License) version of the
| software to continue with/depend on.
| jakjak123 wrote:
| Apache hosts many, many projects, some good, some bad, some
| abandoned, some fucking great.
| nekoashide wrote:
| Any time I hear "Apache Foundation" my stomach turns as I
| hesitate to ask my next question. "What we are trying to use
| from them is built on Java right"
| stuff4ben wrote:
| That would be anything hosted by the Eclipse Foundation.
| Either Java-based or abandonware or sometimes both.
| beastman82 wrote:
| > topping HN for no apparent reason
|
| I think the HN algo is pretty easily manipulated. I worked at a
| startup that had an effective process to get things to the
| front page
| CoastalCoder wrote:
| > I think the HN algo is pretty easily manipulated. I worked
| at a startup that had an effective process to get things to
| the front page
|
| That sounds (potentially) sleazy. If you think it's a
| technique that HN could potentially defend against, I
| encourage you to explain it to hn@ycombinator.com.
| noduerme wrote:
| Maybe it's a YC startup.
| ambigious7777 wrote:
| AFAIK YC startups don't get any more boost on the front
| page than normal posts.
| ativzzz wrote:
| > That sounds (potentially) sleazy.
|
| Pretty sure it's as simple as posting in your general slack
| channel "@here we posted a new article to HN, go upvote and
| write a comment"
| rickspencer3 wrote:
| I think that there is an active company behind Superset called
| Preset.
|
| https://preset.io/
|
| I don't think it's semi-abandoned. I had a brief interaction
| with the project in my previous job, and I found the community
| and the company to be reasonably engaged and responsive.
| skadamat wrote:
| Apache Airflow, Kafka, Spark, ECharts, and many others are
| still going strong! It really depends on the project to be
| honest.
| jakjak123 wrote:
| I have the opposite experience. Lots of good stuff is hosted by
| Apache Foundation, such as Kafka, Maven, Cassandra, Camel, the
| Tika project, Superset, Solr, but I will admit they had more
| relevance 10 years ago. And I don't think there are many
| organizations that keep open source projects alive longer than
| the Apache projects.
| tomnipotent wrote:
| I used Metabase at my last gig (CTO @ e-commerce, 30+ users)
| and it was well-received and dare I say even a bit adored. It
| was the only self-hosted tool I'd receive after-hours text
| messages about going down that someone urgently needed back up
| for some task due tomorrow.
|
| Business users loved the self-serve query builder, and it
| wasn't uncommon to walk around the office and see Metabase up
| on someone's screen. My CEO absolutely loved it, and used it
| daily including to put together data for board decks.
|
| None of my users cared about visualizations, and lived in
| tabular data. This included finance, marketing, merchandising,
| operations, and executives (CEO/COO/CFO). The only people that
| lamented the limited visualization were analysts. Power users
| did all their day-to-day work in Excel or other tools anyway,
| such as managing marketing spend or inventory allocations.
|
| Metabase was great for dashboards and self-service (ad-hoc).
| 10/10 would deploy again.
| renewiltord wrote:
| Apache Software Foundation is just an umbrella organization to
| keep things on life support till someone can apply sufficient
| motive force to resurrect. I think that's really valuable. Lots
| of projects there have had that effort applied to them and kept
| going.
| indymike wrote:
| Can vouch for Superset. I use it in a couple of my companies and
| love it.
| lars_francke wrote:
| We've built a Kubernetes Operator for Apache Superset at
| Stackable: https://github.com/stackabletech/superset-operator/
|
| It's part of our Open Source Data Platform and it's one of the
| few open source BI tools out there and there are not a lot of
| alternatives in this space. We generally like it.
| adeptima wrote:
| Had a very good experience with Superset.
|
| Superset allowed us to replace Tableau and not look back.
|
| Took me a while to figure out how to embed it into my app using
| Superset Embedded SDK.
|
| Superset Embedded SDK - "Embedded SDK allows you to embed
| dashboards from Superset into your own app, using your app's
| authentication. Embedding is done by inserting an iframe,
| containing a Superset page, into the host application."
|
| https://github.com/apache/superset/tree/master/superset-embe...
|
| Superset is based on ECharts, a very high-quality and
| well-maintained chart library
|
| https://echarts.apache.org/examples/en/#chart-type-linesG
|
| Community Roadmap
|
| https://github.com/apache/superset/projects?query=is%3Aopen
|
| Huge respect to Preset.io and its team for contributing to the
| project and keeping it in great shape
|
| https://preset.io/blog/
|
| Superset source code is very easy to read and understand, and as
| a result it's possible to implement some advanced caching
| techniques to reduce the load on charts.
|
| No BI is perfect.
|
| Watching Superset for years gives me confidence the project will
| work as supposed down the road, and eventually some of its
| packages can be reusable for all kind of visualizations and data
| hacking.
|
| Our main approach to visualisation is to start with ECharts and
| a simple Reactjs wrapper, spin off Superset on a subdomain for
| power users, and later see which one works better. Having the
| same look in both gives a very pleasant experience.
| boyka wrote:
| I have no experience with Superset. Can you elaborate on a few
| points where you see it excel beyond Tableau?
| adeptima wrote:
| I don't want to start a rant against Tableau. It's a
| powerhouse. It's a great superior software. But when it comes
| to optimizing cost and comparing the total cost of ownership
| and opportunity to stop paying for Tableau server license we
| voted in favor of Superset and mix of Reactjs+Echarts
| widgets.
|
| https://www.tableau.com/products/server
|
| If you have money, dedicated team of data analytics who are
| already familiar with Tableau - no need to torture them with
| other tools.
| skadamat wrote:
| Honestly it's so hard to compare Tableau and Superset.
| Tableau has every feature and bell / whistle imaginable.
| But it's heavy, desktop oriented, and pricey.
|
| Superset is lightweight and open source, but only has 5% of
| the features. So it really depends what you need!
| Jzush wrote:
| I'd like to see these types of apps start offering SVG
| embedding of things like graphs. Frames are such a pain.
| rusackas wrote:
| That's probably not trivial, but it seems plausible. The
| beauty of open source is that you can help contribute this if
| you're fired up about it!
| wswope wrote:
| Bokeh is an option in the frontend-viz space that puts out
| pretty solid SVG for statically-rendered charts, while also
| having the option of more Tableau-like interactive
| functionality with input fields, dynamic filters, etc. Might
| be a decent option for you?
|
| Their interactive "embedded-mode" avoids iframes too... but
| it's built with web components, so you wind up in shadow-DOM
| hell if you want to do anything dynamic on the view's
| contents.
| hughess wrote:
| We use ECharts in our open source BI tool (Evidence) and it's a
| great library. Has helped us build a declarative syntax for viz
| which can be version controlled (https://evidence.dev)
|
| Previous HN discussion:
| https://news.ycombinator.com/item?id=35645464 (97 comments)
| adeptima wrote:
| Looks great!
|
| Reminds me of Obsidian DataView but with charts
| https://github.com/blacksmithgu/obsidian-dataview
|
| The whole idea of having data, visualisations and a knowledge
| base in one private offline place is very appealing
| hughess wrote:
| We're fans of Obsidian! DataView looks cool - love the
| ability to define the tables in code inline in the
| markdown. That's similar to how we inline DuckDB WASM SQL
| queries in markdown:
| https://docs.evidence.dev/core-concepts/queries/
| archiewood wrote:
| I love Obsidian.
|
| The Markdown <-> Markup typing experience is just so good
| compared to e.g. Slack, Reddit and other markdown-esque
| tools
| meekaaku wrote:
| Evidence looks cool, and I evaluated it some time back. The docs
| say the pages are all pre-rendered for all possible
| combinations. Is that still the case? If so, and I have a date
| filter, is it going to pre-render all possible dates?
| hughess wrote:
| We recently changed our architecture to include
| interactivity without having to pre-render all
| combinations. Pages are still pre-rendered with their
| initial content, but each Evidence app now ships with
| filter components and an in-browser DuckDB instance so you
| can build interactive apps. We call this Universal SQL - if
| you're interested, we wrote up our rationale for doing this
| here: https://evidence.dev/blog/why-we-built-usql/
|
| Here's an example project with some filter components and
| custom styling: https://ecommerce.evidence.app/
|
| This is still a static app - the data warehouse was only
| hit during the app's build process
| klaussilveira wrote:
| How do you deal with data visibility and permissions? I mean,
| most tables have data that should only be seen by a specific
| user or group ID, and that layer is usually handled by the
| application. It would be awesome to expose the power of
| Superset for users, but I imagine creating the security layer
| would be a pain.
| re5i5tor wrote:
| I have this question too
| Ringz wrote:
| https://superset.apache.org/docs/security/
| spdustin wrote:
| You can use row-level security, or specify RBAC with pretty
| much any SQL query.
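
To make the row-level security idea above concrete, here is a minimal sketch of how a per-role filter clause can be layered on top of a dataset query. It illustrates the concept only and is not Superset's implementation: the `RLS_RULES` mapping, the role names, and the `apply_rls` helper are all hypothetical.

```python
# Hypothetical mapping from role name to the SQL predicate that
# restricts which rows that role may see.
RLS_RULES = {
    "sales_emea": "region = 'EMEA'",
    "sales_apac": "region = 'APAC'",
}


def apply_rls(base_query: str, roles: list[str]) -> str:
    """Wrap base_query so only rows allowed for the given roles remain.

    Roles without a rule contribute nothing; if no rule applies at
    all, the query is returned unchanged (an assumption of this
    sketch, a stricter system might deny access instead).
    """
    clauses = [RLS_RULES[role] for role in roles if role in RLS_RULES]
    if not clauses:
        return base_query
    # A user holding several roles sees the union of their rows.
    predicate = " OR ".join(f"({clause})" for clause in clauses)
    return f"SELECT * FROM ({base_query}) AS rls_scoped WHERE {predicate}"


if __name__ == "__main__":
    print(apply_rls("SELECT * FROM orders", ["sales_emea", "sales_apac"]))
```

In Superset itself the equivalent rules are configured per dataset and role in the UI rather than written in code; the linked security docs describe the actual behaviour.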
| prabhatsharma wrote:
| eCharts is awesome. We moved from plotly after using it for
| several months to echarts at
| https://github.com/openobserve/openobserve and are super happy.
| j-a-a-p wrote:
| Had good results with echarts. With Superset not so much:
| complicated to install, lost all dashboards after an update,
| cryptic error messages, custom queries meh (we decided to use
| views in Postgres instead). The project with Superset was finished
| successfully, but the time spent was a multiple of what something
| like Power BI would have taken.
|
| All in all, not very innovative, but highly needed open source
| version of a traditional BI tool. Definitely something to
| follow and to use in temporary, not too demanding use cases.
| And hopefully a future replacement of Tableau or Power BI.
| rglullis wrote:
| Has anyone worked with it who could compare it with Redash?
| skadamat wrote:
| Well Redash got acquired so development stopped, biggest
| difference between Superset & Redash. Preset.io supports
| Superset still
| rglullis wrote:
| Redash development slowed down for sure, but it's not looking
| abandoned. It's just that I've been using it for some time
| now, and I'm wondering if there is anything feature-wise that
| could justify the switch.
| vfclists wrote:
| Generally what you get when VentureCapital/PrivateEquity buys out
| Redash.io, messes up end users in the process and spits it out a
| few years later, leaving users confused as to where it stands in
| the BI tools landscape.
| paddy_m wrote:
| I wish more projects had guided tour videos that demonstrated the
| power of the tool in the hands of an expert user. Not "get
| started" but "why should I care".
|
| Wes McKinney used to have an excellent 5 minute introduction to
| pandas in this genre.
| skadamat wrote:
| This might be what you're looking for:
| https://www.youtube.com/watch?v=kGfUIOK87V8
| paddy_m wrote:
| I saw that video on the website. It isn't narrated or
| captioned to explain what the user is trying to accomplish.
| rusackas wrote:
| You can check this out. This is a Preset Demo, but shows quite
| a bit of Superset within Preset (which offers multiple
| instances of Superset as "Workspaces")
| https://www.youtube.com/watch?v=V0HwGnC1rU8
| mikpanko wrote:
| Does anybody know why Superset started trending today? Is there a
| major release?
| remram wrote:
| Is there more than this single HN submission?
| rusackas wrote:
| There is a major release on the horizon (4.0) and there were
| just a couple of patch releases for the 3.x variants. I'm
| surprised to see it trending too, but I'm happy about it. More
| people need to know that Open Source BI is here, and here to
| win.
| zX41ZdbW wrote:
| Superset is powerful, but I wonder why they don't fix
| "papercuts", e.g., misaligned pixels on a spinner, or inability
| to copy a value from a table's cell, or non-monospace font for
| numbers in a table, etc. There are hundreds of small annoyances
| in the product.
| rusackas wrote:
| We try! We also accept PRs and Issues if there are things
| bugging people, of course. It's always a balancing act between
| building some new feature that people are clamoring for, or
| fixing those cosmetic issues that always crop up.
| posix_monad wrote:
| Is this capable of performing efficient JOINs across non-
| homogeneous data-stores?
| Lucasoato wrote:
| Should it? If you really need that, join the different sources
| with TrinoDB (or any related managed service like AWS Athena)
| and connect it to Superset.
| ildjarn wrote:
| It's common for business questions to only be answerable with
| a join over a few different stores.
|
| I think Athena can only query data on S3?
| totalhack wrote:
| Superset would be on my shortlist if I had to use something
| else, but the join limitations were part of why I passed.
| grzaks wrote:
| We use https://cube.dev/ as intermediate layer between data
| warehouse database and Superset (and other "terminal" apps for
| BI like report generators). You define your schema (metrics,
| dimensions, joins, calculated metrics etc) in cube and then
| access them by any tool that can connect to SQL db
| nvrmnd wrote:
| One thing to keep in mind with BI software is that the users are
| often very different than, well, those individuals that prefer to
| use mutt as an email client.
|
| Many, or most, users for a BI tool will be operations, product
| managers, and business management who simply will not find the
| interface to be intuitive, responsive, or well designed. At least
| that's my experience.
| fuzztester wrote:
| Wow, those Apache guys have so many projects. Of course, they've
| been at it for years, starting with the Apache web server, then
| Tomcat, etc., and also, many projects were first developed
| outside and then handed over to them, for whatever reasons.
| andrewshadura wrote:
| And sometimes projects are handed to them to die. The way they
| (mis)handle OpenOffice is unforgivable.
| fuzztester wrote:
| Interesting, did not know.
|
| In what way, any details?
|
| Not been tracking that or using OpenOffice for a while.
| emilienaples wrote:
| How would you compare Superset with PowerBI for analytics and CSS
| integration? Trying to develop features and advanced analytics
| capabilities into an app?
| rusackas wrote:
| You can style dashboards with CSS as much as you'd like, though
| there are some limitations (canvas/webGL elements). I wrote a
| whole blog post on it: https://preset.io/blog/customizing-
| superset-dashboards-with-...
|
| If you want to style the whole application, you can fork the
| repo and go bananas. If you're looking for theming, there's
| more to be done yet on that front, and I wrote an article on
| that too:
| https://preset.io/blog/theming-superset-progress-update/
| Wilduck wrote:
| Is Superset a decent tool if you're just a single person doing
| data analysis? Say I have a handful of sqlite databases, and just
| want to be able to develop some queries / charts. I was looking
| into Tableau / Power BI / Superset, and all of them seemed pretty
| heavyweight for a single user, and none of them seemed super easy
| to get set up locally.
|
| Any recommendations for a good piece of software for the single
| user case? Or a more convenient way to run the heavyweight tools?
| unixhero wrote:
| Tableau is the best, most powerful, most mature of the three,
| most feature complete and easiest of the three. I think they
| give you a 30 day trial.
|
| This is a single user application, unless you make it part of
| your built application.
| VenkatPram7 wrote:
| Superset isn't a single user application?
| unixhero wrote:
| Ah, sure
| c0pium wrote:
| > This is a single user application
|
| K8s installation instructions:
| https://superset.apache.org/docs/installation/running-on-
| kub...
|
| RBAC configuration:
| https://superset.apache.org/docs/security/#rest-api-for-
| user...
| javchz wrote:
| I'd say PowerBI has the potential to be more powerful, but
| you need to love the whole M and DAX language ecosystem.
| And the integration with Python and R is not that bad.
|
| But if your visualizations are within the scope of native Tableau
| capabilities, then Tableau is friendlier and gets less
| in the way of you and your work.
| bigger_cheese wrote:
| If you are doing data analysis I don't think any of the 3
| pieces of software you mentioned are going to be that helpful.
|
| I see these products as tools for data visualization and
| reporting i.e. presenting prepared datasets to users in a
| visually appealing way. They aren't as well suited for serious
| analytics.
|
| I can't comment on Superset or Tableau but I am familiar with
| Power BI (it has been rolled out across my org), and the kinds of
| statistics you can do with it are fairly rudimentary. If you
| need to do anything beyond summarizing (counts, averages, min,
| max etc.), it is not particularly easy.
|
| For data analysis I use SAS or R. This software lets you do
| things like multivariate regression, timeseries forecasting,
| PCA, Cluster analysis etc. There is also plotting capability.
|
| Both these products are kind of old school, I've been using
| them since early 2000's, the "new school" seems to be Python.
| Pretty much all the recent data science people in my
| organization use Python. Particularly Pandas and libraries like
| Seaborn (https://seaborn.pydata.org/).
|
| The "power" users of Power BI in my organization tend to be
| finance/HR people for use cases like drill down into cost
| figures or Interactively presenting KPI's and other headline
| figures to management things like that.
| spdustin wrote:
| For my last employer, I set up Superset for a number of our
| clients to show all sorts of heavily customized marketing
| analytics dashboards, web performance graphs, project management
| burndown reports, you name it. As with another commenter's
| experience, we also got a client to replace Tableau with it, and
| not look back. Such a great product.
| twic wrote:
| How does this compare to Jupyter notebooks and the ecosystem
| around that? Do the use cases overlap, or are they completely
| different things?
| Lucasoato wrote:
| In my experience, people with a business related background
| have an easier time learning how to use BI tools (this is true
| even if Superset may be less user-friendly than other
| commercial products like Tableau); Jupyter is an interactive
| computing platform that is based on notebooks and cells, that's
| more useful for data scientists/engineers whose needs might
| exceed the capabilities of a SQL interface.
| tomrod wrote:
| It's been a few years since I evaluated superset. Did they ever
| resolve drilldown (filter for one chart on a page, populate to
| all charts)?
| rusackas wrote:
| Yep... there's Drill By, which is more flexible than drill-
| down. Rather than having to specify a strict hierarchy of
| drilling "levels" you can pick columns, hierarchical or
| otherwise, to drill into.
| rietta wrote:
| Neat. I have to admit I about had a heart attack reading
| "Superset" as "Sunset" at first. I've become too jaded about
| stuff being shut down and announced on HN. Very pleasantly
| surprised when I read it correctly and clicked through to see
| it's about data analytics.
| cheema33 wrote:
| I recently discovered Apache Superset. I would love to use it in
| our product. Does anybody know if it is possible to integrate it
| into an existing product? I am mostly curious about hooking up
| its authentication system to our own authentication system, which
| is based on auth in ASP.NET Core 8.
| Cilvic wrote:
| >Took me a while to figure out how to embed it into my app using
| Superset Embedded SDK.
|
| Superset Embedded SDK - "Embedded SDK allows you to embed
| dashboards from Superset into your own app, using your app's
| authentication. Embedding is done by inserting an iframe,
| containing a Superset page, into the host application."
| HermitX wrote:
| Here is a fantastic video made by Soumil Shah, using
| MinIO+Hudi+StarRocks+Superset. It is amazing to have an
| interactive query experience on a data lake directly!
| https://www.youtube.com/watch?v=JkKBzrQTKx0
| 3abiton wrote:
| Thanks for sharing, it's so exciting to see so many OSS BI
| frameworks
| atbpaca wrote:
| This looks really good! How does it compare to Tableau?
| rusackas wrote:
| Well, it's free! Or significantly cheaper even if you opt to
| use Preset to run a hosted/managed/compliant version of it, and
| not have to deal with config/security/upgrades/migrations. This
| article is a year old, but it might help a bit:
| https://preset.io/blog/apache-superset-vs-tableau/
| adamgamble wrote:
| We use metabase heavily at work. However where it seems like all
| these tools fall down is organization around the hundreds of
| dashboards and questions. I wish it had like a built-in wiki or
| something to build out more navigation. Anyone know of any good
| ways to do that?
| xtracto wrote:
| Mhmm this gives me an idea.. what if I could "group" metabase
| sql queries by "similarity" (either of results or of the query
| itself)
|
| Another option could be to use LLM to summarize, tag and group
| queries for better discoverability.
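
The "group by similarity" idea above can be sketched with nothing but the standard library. This is one possible approach under stated assumptions, not a Metabase feature: the `group_similar` helper, the 0.8 threshold, and the greedy strategy are all illustrative choices, and it compares raw query text rather than query results.

```python
from difflib import SequenceMatcher


def normalize(sql: str) -> str:
    """Lowercase and collapse whitespace so formatting differences vanish."""
    return " ".join(sql.lower().split())


def group_similar(queries: list[str], threshold: float = 0.8) -> list[list[str]]:
    """Greedily bucket queries whose text is at least `threshold` similar.

    Each query is compared against the first member of every existing
    group and joins the first group that is close enough; otherwise it
    starts a new group.
    """
    groups: list[list[str]] = []
    for query in queries:
        candidate = normalize(query)
        for group in groups:
            representative = normalize(group[0])
            if SequenceMatcher(None, representative, candidate).ratio() >= threshold:
                group.append(query)
                break
        else:
            groups.append([query])
    return groups


if __name__ == "__main__":
    saved = [
        "select * from orders where id = 1",
        "SELECT * FROM orders WHERE id = 2",
        "select name from users",
    ]
    for group in group_similar(saved):
        print(group)
```

Text similarity is a crude proxy: two queries can read alike and answer different questions, so the groups are best treated as suggestions for a human to review.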
| _pastel wrote:
| 100% agree.
|
| One thing that helps is hooking metabase up to its own database
| and building queries on your queries, e.g.:
| select * from report_card where dataset_query
| ilike '%' || {{query}} || '%'
|
| (You can also join in metadata like the author, when it was
| last run, etc.)
|
| We also try really hard to keep the Collection directory
| structure clean and consistent. But it's still really hard.
| scrappyjoe wrote:
| Maybe take a look at https://datahubproject.io/integrations ? I
| only heard about it today, but it looks pretty promising. Spun
| out of LinkedIn, open source, lots of integrations, including
| Metabase
| datatrashfire wrote:
| love superset, but one thing that I would love to see is to make
| it easier for dashboards/charts to use a dynamic table that the
| user can select.
|
| we have multiple tenants + developer instances of our warehouse.
| to reuse the same dashboard in this setup we need to create at
| least 3 virtual datasets, plus wrangle a bunch of boiler plate
| jinja.
| vietvu wrote:
| Used Superset back in 2016 and 2020; both times we chose Metabase
| for our clients' BI dashboards and Superset for our internal
| dashboard. Superset is nice, easy to modify and extend, but not as
| user friendly as Redash or Metabase. But after the author
| launched Preset, it seems to have improved a lot with the
| company's effort. It looks to me like the best way for OSS to
| advance is to have a company dedicated to improving it.
| hiepdev wrote:
| How does it compare to Kibana + Elasticsearch?
| nullify88 wrote:
| A big thing here is that Superset and most of the other BI
| tools can connect directly to databases which is commonly the
| source of truth or data warehouse in some businesses. Secondly,
| Elastic have focused on other operational areas such as
| security, observability, and indexing / search. Kibana can do
| some dashboarding on those areas and its UI is nice, but
| Superset and similar tooling are more suited for BI purposes.
| nojito wrote:
| Superset is absolutely phenomenal. I really hope Microsoft
| eventually releases all of their customizations they made to it
| internally to the OS community someday.
|
| https://www.youtube.com/watch?v=RY0SSvSUkMA
|
| https://github.com/apache/superset/discussions/20094
| pknerd wrote:
| I found Superset difficult to use when I explored it a couple of
| years back[1], not sure whether this is the same case now.
|
| [1] https://blog.adnansiddiqi.me/create-your-first-sales-
| dashboa...
| kumarvvr wrote:
| Tried installing it, locally in a Python Virtual Env.
|
| Apparently installation will not work with Python 3.12, due to
| the removal of distutils.
|
| Does anyone have any method to install this?
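
For anyone hitting the same problem, a quick way to confirm the cause is to check whether the interpreter in the virtual env can still find `distutils`. The sketch below assumes only the standard library; the helper names are made up for illustration. Note that `setuptools` can provide a compatibility shim, so `distutils` may still be importable on 3.12 when `setuptools` is installed.

```python
import importlib.util
import sys


def distutils_in_stdlib(version_info=sys.version_info) -> bool:
    """distutils was removed from the standard library in Python 3.12."""
    return tuple(version_info[:2]) < (3, 12)


def distutils_importable() -> bool:
    """True if some distutils module (stdlib or a shim) can be found."""
    return importlib.util.find_spec("distutils") is not None


if __name__ == "__main__":
    print("Python:", ".".join(str(part) for part in sys.version_info[:3]))
    print("distutils ships with this Python:", distutils_in_stdlib())
    print("distutils can be found right now:", distutils_importable())
```

If the second check reports False, creating the virtual env with an older interpreter (3.11 or earlier) is one way around the issue.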
| waldrews wrote:
| Maybe try the Docker installation to keep the dependencies off
| your system:
|
| https://superset.apache.org/docs/installation/installing-sup...
| altilunium wrote:
| You can query Wikipedia's internal database by using its superset
| instance.
|
| https://superset.wmcloud.org
|
| https://phabricator.wikimedia.org/T169452
|
| Back then, I used this to generate some custom statistics
|
| https://github.com/altilunium/wikiidmon
| martin82 wrote:
| Bummer that it can't pull data from JSON APIs, which Redash can
| do.
| hackandthink wrote:
| It should be possible (have not tried myself):
|
| https://preset.io/blog/accessing-apis-with-superset/
|
| "Shillelagh (SI'leIlI) is a Python library and CLI that allows
| you to query many resources (APIs, files, in memory objects)
| using SQL. It's both user and developer friendly, making it
| trivial to access resources and easy to add support for new
| ones"
|
| https://github.com/betodealmeida/shillelagh
| wesleyyue wrote:
| Surprised no one has mentioned hex yet. There was a post on the
| yc internal forum today about data stacks and a lot of founders
| mentioned they liked hex. I hadn't heard too much about them
| before but they looked interesting for someone (me) who typically
| prefers something closer to a jupyter notebook and simple stacks.
| lf-non wrote:
| Full fledged BI tools like Superset and Metabase are amazing for
| their intended use cases.
|
| But they may be overkill if your primary use case is to
| infrequently build semi-interactive reports for non-technical
| end-users and your needs are mostly covered by standard
| graphs & tables, especially if you are familiar with SQL and have
| access to the underlying data source. Two nifty utilities I have
| found to be very useful for the latter kind of use case are
| SQLPage and Evidence.
|
| They make it very convenient to whip out some SQL and convert
| that to a neat professional looking web ui that can be forwarded
| to an end user. In case of Evidence it is a statically generated
| site, and in case of SQLPage it is a web app that connects to a
| live database.
|
| SQLPage: https://sql.ophir.dev/
|
| Evidence: https://evidence.dev
| amai wrote:
| Does it have horizontal bar charts nowadays?
| amai wrote:
| Can one run Python scripts in Apache Superset like one can do with
| PowerBI: https://pycaret.gitbook.io/docs/learn-pycaret/official-
| blog/...
| amai wrote:
| Unfortunately information about new releases is not available on
| the superset website, but only at Preset.io:
| https://preset.io/blog/superset-3-0-release-notes/
| uraura wrote:
| From the introduction, I can see a list of backend technologies.
| But do they have a high level architecture diagram? I don't know
| what I really need for production setup.
| anentropic wrote:
| I wanted the same info, sadly lacking.
|
| AFAICT needs a db (MySQL/Postgres) and a cache
| (Redis/Memcached) and one (or more?) web workers.
|
| Then optionally also Celery workers (for "async queries" i.e.
| slow running)... not sure how optional that is though.
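
The pieces listed above (metadata database, cache, optional Celery workers) map onto a handful of settings in `superset_config.py`. The following is a rough sketch rather than a production recipe: the setting names follow the Superset and Flask-Caching docs as far as I know, while the hostnames, credentials, and the two environment variable names are placeholders invented for this example.

```python
import os

# Metadata database (Postgres or MySQL). The env var name and the
# default URI are placeholders for this sketch.
SQLALCHEMY_DATABASE_URI = os.environ.get(
    "SUPERSET_METADATA_DB_URI",
    "postgresql+psycopg2://superset:superset@db:5432/superset",
)

# Redis base URL shared by the cache and Celery; also a placeholder.
REDIS_URL = os.environ.get("SUPERSET_REDIS_URL", "redis://redis:6379")

# Cache for chart and dashboard metadata (Flask-Caching settings).
CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 300,
    "CACHE_KEY_PREFIX": "superset_",
    "CACHE_REDIS_URL": f"{REDIS_URL}/1",
}


# Only needed when running Celery workers for async queries.
class CeleryConfig:
    broker_url = f"{REDIS_URL}/0"
    result_backend = f"{REDIS_URL}/0"


CELERY_CONFIG = CeleryConfig
```

The web workers themselves are not configured here; they are whatever WSGI server processes (for example gunicorn) load this file.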
| orestis wrote:
| Can this work to give end-users/customers the ability to create
| their own reports/charts, respecting data access visibility etc?
|
| I am in need of a "dashboarding" feature in our SaaS, but it
| seems there's a gap between PowerBI/Tableau/Metabase/Superset and
| various charting libraries. The former are too much "turn key"
| and the latter require a ton of work to setup all the chart-
| building UI and features...
| etoulas wrote:
| Have a look at Embeddable. It's still pretty new but built by
| an experienced team.
|
| It's commercial software though.
|
| https://embeddable.com/
| loufe wrote:
| I wonder why so few BI tools support Pi databases. They are
| pervasive and mission critical in commodity industries, but there
| seem to be only proprietary options available.
___________________________________________________________________
(page generated 2024-02-27 23:02 UTC)