[HN Gopher] Hosted Monitoring: Evaluating InfluxDB Cloud and Gra...
       ___________________________________________________________________
        
       Hosted Monitoring: Evaluating InfluxDB Cloud and Grafana Cloud
        
       Author : pqvst
       Score  : 69 points
       Date   : 2021-06-23 09:28 UTC (13 hours ago)
        
 (HTM) web link (pqvst.com)
 (TXT) w3m dump (pqvst.com)
        
       | sofixa wrote:
       | I feel like InfluxDB were the best time series database in town
       | for some time (2015-2018?), but they failed to monetise and
       | exploit the space properly, and now they have lots of
       | competition, much of it better in a lot of cases (e.g. speed,
       | space use, ease of use, pricing, integrations), be it for
       | self-hosted or cloud.
       | 
       | Some of it was wasted effort (Chronograf is still useless
       | compared to Grafana; who thought it was a good idea to compete
       | with it?), some was poor luck, some was decisions that backfired
       | (like removing clustering from the free version). Some of it was
       | everyone deciding pull is better than push (I disagree in most
       | cases, but see the point).
       | 
       | Telegraf still remains a very solid collecting agent and IMHO the
       | best in class. With everyone (from the SaaS monitoring vendors)
       | moving to OpenTelemetry, I think that in some time it'll fall
       | behind as well.
        
         | karterk wrote:
         | InfluxDB could have become Grafana if they had focussed better.
         | It was a breath of fresh air in a space where existing
         | solutions were horrendously complicated. I suspect that they
         | were under a lot of pressure to monetize and lost their way
         | somewhat in attempting to do that.
        
         | KaiserPro wrote:
         | Telegraf's USP is that it's an anything-to-anything converter,
         | which is invaluable when you are supporting a whole bunch of
         | mixed stuff.
         | 
         | I'm not sure OpenTelemetry is a replacement for actual designed
         | metrics though. It's great for getting info about the state of
         | an instance of something, but that's only half the picture.
         | Building meaningful alerts off just raw telemetry is hard.
        
         | heipei wrote:
         | Telegraf is the main reason I'm still running the Telegraf +
         | InfluxDB + Grafana stack. I might have checked out TimescaleDB
         | or others, but since Telegraf works great with InfluxDB and has
         | all sorts of arcane collection plugins, mostly with sensible
         | options, I'm staying put. Good luck replicating the years of
         | community effort to build out all these collectors to anyone
         | who wants to replace it.
        
           | dharmab wrote:
           | The Prometheus community has done a great job covering a lot
           | of common software and data sources:
           | https://prometheus.io/docs/instrumenting/exporters/
        
       | KaiserPro wrote:
       | Old man opinion here:
       | 
       | I really like Grafana; personally I'd probably host it myself
       | even now. (At all previous companies I did.) It's well contained
       | and even with alerting it's simple to set up and sync state
       | (assuming you are using a hosted database, which can be
       | expensive).
       | 
       | Where my opinion gets controversial: I dislike Prometheus, and I
       | really don't like influx. I much prefer graphite.
       | 
       | Influx has higher temporal resolution, and an SQL interface (it's
       | the SQL interface I don't like).
       | 
       | Prometheus has a pull model, which I can see the appeal of, but I
       | don't really like. But it deals with ephemeral metrics much
       | better (however you need a gateway to cache the metrics, as they
       | might not get pulled in time.)
       | 
       | The issue is that Graphite's interface is much better for
       | discovering and filtering/transforming your data. I'm not the
       | world's greatest SQL person, so I can't do that much complex
       | stuff, and I also don't know how to do maths in SQL well. With
       | Graphite I don't need to worry, all the functions are there in
       | the menu: derivative? yes. All the averaging? mostDeviant? yes.
       | 
       | Graphite made making graphs quick and easy, I don't think I can
       | say the same for influx or prometheus. I know that all sorts of
       | non technical people can use graphite/grafana. I'm less convinced
       | that they can use Prometheus or influx as well without training.
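        
For readers who have not used Graphite's function menu, here is a rough Python sketch of what two of the functions named above compute. It is an illustration only, not Graphite's implementation: `derivative` is modelled as the change between consecutive points, and `most_deviant` as "the N series with the largest standard deviation". The `Series` alias, the function names, and the choice of population standard deviation are assumptions made for this sketch.

```python
from statistics import pstdev
from typing import Optional

# A series is a list of samples; None marks a missing data point.
Series = list[Optional[float]]


def derivative(series: Series) -> Series:
    """Change between consecutive points (None where a neighbour is missing)."""
    if not series:
        return []
    out: Series = [None]  # the first point has no predecessor
    for prev, cur in zip(series, series[1:]):
        out.append(None if prev is None or cur is None else cur - prev)
    return out


def most_deviant(named: dict[str, Series], n: int) -> list[str]:
    """Names of the n series with the largest standard deviation."""
    def sigma(series: Series) -> float:
        # Assumption: population standard deviation, ignoring missing points.
        values = [v for v in series if v is not None]
        return pstdev(values) if values else 0.0

    return sorted(named, key=lambda name: sigma(named[name]), reverse=True)[:n]
```

In Graphite itself these are applied by wrapping a metric path in the function from the UI or the render API, which is the "all the functions are there" point made above.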
        
         | wperron wrote:
         | > Prometheus has a pull model, which I can see the appeal of,
         | but I don't really like.
         | 
         | That's such an antiquated statement at this point. Prometheus
         | might have been pull-only in the distant past, but the remote
         | write protocol has been alive and well for a long time now, and
         | it works amazingly well, at surprising scale.
        
         | sofixa wrote:
         | > Influx has higher temporal resolution, and an SQL interface
         | (it's the SQL interface I don't like)
         | 
         | SQL was too limited, that's why they did Flux, their new query
         | language. It's weird and has a learning curve, but powerful.
        
           | gcbirzan wrote:
           | SQL is not limited, InfluxQL is. And Flux is incredibly
           | verbose and error prone (at least a year ago when we last
           | looked at it). However, with the transformations in Grafana,
           | there's no reason to use Flux in the majority of cases.
        
             | jstrong wrote:
             | InfluxQL is limited, yes, but much less verbose for
             | time-based aggregation than the equivalent SQL queries,
             | which is why I like using it for ad hoc visualization and
             | inspectability. (I don't like Flux.) To me it's great for a
             | limited purpose. Writing the equivalent SQL queries would
             | drive me up the wall.
        
           | KaiserPro wrote:
           | > It's weird and has a learning curve, but powerful.
           | 
           | I don't doubt that, but teaching it to business analysts,
           | product owners and the like is a real headache.
           | 
           | I'm sure I would grow to love it, but then it would require
           | me to make dashboards for other people. Getting them to make
           | them themselves is living the dream.
        
           | kungito wrote:
           | It honestly doesn't feel powerful at all. I admit, I started
           | using Influx 2.0 during beta, but the experience has been
           | horrible. I first tried to write all my logic in Influx and
           | have now, over a few months, moved everything back into Rust,
           | with just the lines which go directly into graphs staying in
           | Influx. The language as it is is OK, it's very basic, but the
           | runtime and the development environment are very unfinished.
           | There are so many internal errors whatever you do, the
           | responses are slow to arrive (I guess because so few people
           | try to write anything more complicated in Flux), and the
           | debugging is very hard. You can't even print out single
           | values; everything has to be displayed in graphs. They
           | support using arrays and single values, but you just get
           | convoluted internal errors and compiler panics if you don't
           | write correctly in a language without explicit typing.
        
         | Proven wrote:
         | I completely agree, especially on the usability of Graphite.
         | Casual users can create their own dashboards - just get the
         | data into Graphite.
         | 
         | InfluxDB has some additional advantages, but they need medium
         | to large scale or advanced use to matter.
         | 
         | For my use case (smaller, simple environment), Graphite is
         | great.
        
         | pqvst wrote:
         | When we were self-hosting Grafana with an Influx data store,
         | writing queries was actually very easy since Grafana provides a
         | "query editor" to generate the SQL for Influx. Ironically that
         | was much easier to use than InfluxDB Cloud's own solution
         | (which now uses a completely different query language).
         | 
         | Grafana Cloud seems to mostly advertise the pull model, whereas
         | we were really looking for a push model. I'm glad I figured out
         | how to get that to work with Graphite, since Grafana Cloud
         | offers both Prometheus and Graphite.
         | 
         | Now given that Grafana Cloud's free tier covers our needs, I'm
         | glad we no longer have to maintain a self-hosted version (and
         | we're definitely getting better performance now).
        
           | KaiserPro wrote:
           | I should have written that I fully understand why you'd want
           | to sling it out and be hosted by someone else.
           | 
           | I'm a devop/sre/pe/sysadmin so I have a motivation to keep
           | metrics in my firm grasp. Given the price point and
           | performance, for small organizations it'd be madness to _Not_
           | get grafana cloud to host it.
        
         | heipei wrote:
         | I use Grafana religiously every day, rely on it for metrics,
         | long-term trends and alerting. I also use Telegraf with
         | InfluxDB, but thanks to Grafana I've never had to write a
         | single InfluxDB statement.
         | 
         | Also: It's not just higher temporal resolution, it's having an
         | actual time-series database vs a metrics database. With
         | InfluxDB (and all other event-based databases) I can go in
         | after the fact and just plot the CPU load for one specific
         | host, or a group of hosts, or plot the HTTP requests that took
         | longer than 200ms, hit a specific endpoint and finished with a
         | specific HTTP status code. If you don't keep the raw
         | measurements you always have to make decisions about which
         | groups you aggregate down to ahead of time. Just something to
         | keep in mind.
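        
The difference between keeping raw events and keeping pre-aggregated metrics can be sketched in a few lines of Python. Everything here is hypothetical illustration data (the `Request` record, the host names, the endpoints); no particular database API is implied. The point is that with raw events the filter can be chosen after the fact, whereas a per-host counter has already thrown away the endpoint, status and duration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """One raw HTTP request event (hypothetical schema for this sketch)."""
    host: str
    endpoint: str
    status: int
    duration_ms: float


EVENTS = [
    Request("web1", "/login", 200, 35.0),
    Request("web1", "/search", 200, 420.0),
    Request("web2", "/search", 500, 950.0),
    Request("web2", "/search", 500, 120.0),
    Request("web2", "/login", 200, 260.0),
]


def slow_requests(events, endpoint, status, threshold_ms=200.0):
    """After-the-fact query: only possible because raw events were kept."""
    return [
        e for e in events
        if e.endpoint == endpoint
        and e.status == status
        and e.duration_ms > threshold_ms
    ]


def requests_per_host(events):
    """A typical pre-aggregation: endpoint, status and duration are lost."""
    return Counter(e.host for e in events)
```

Calling `slow_requests(EVENTS, "/search", 500)` answers a question nobody planned for, while the output of `requests_per_host(EVENTS)` can never be broken down by endpoint or latency again.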
        
       | gouthamve wrote:
       | Hey, you can also point your Telegraf directly at Grafana Cloud
       | Prometheus if you're interested in using Prometheus. See:
       | https://grafana.com/docs/grafana-cloud/how-do-i/push-from-te...
       | 
       | Our Graphite hosting is also best in class and this makes me
       | happy to see the good feedback!
       | 
       | PS: I work on Grafana Cloud Prometheus.
        
       | hagen1778 wrote:
       | I really love Grafana for visualizing metrics and can't think of
       | anything with the same level of usability. It was obvious that
       | the InfluxDB Cloud UI can't compete with Grafana in this field.
       | 
       | From the article I noticed it was tricky to switch from a push to
       | a pull model. I'd recommend trying VictoriaMetrics for your case
       | for the following reasons:
       | 
       | 1. it supports the Graphite ingestion protocol [1]
       | 
       | 2. it supports the Influx line ingestion protocol [2]
       | 
       | 3. and besides MetricsQL, it also lets you query Graphite data
       | [3]
       | 
       | So you might need to make fewer changes if switching to VM instead
       | of Grafana Cloud. Also, you can continue to run it on a DO $5
       | droplet.
       | 
       | [1] https://docs.victoriametrics.com/#how-to-send-data-from-grap...
       | 
       | [2] https://docs.victoriametrics.com/#how-to-send-data-from-infl...
       | 
       | [3] https://docs.victoriametrics.com/#querying-graphite-data
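        
The two push-style wire formats mentioned in the list above are both plain text, which is part of why switching backends can be cheap. Below is a minimal Python sketch of how one sample might be rendered in each: the Graphite plaintext format (`path value timestamp`, timestamp in seconds) and the Influx line protocol (`measurement,tags fields timestamp`, timestamp in nanoseconds by default). The function names and metric names are hypothetical, only float fields are handled, and nothing is sent over the network; where and how to send the lines depends on the backend's own documentation.

```python
def graphite_line(path: str, value: float, ts_seconds: int) -> str:
    """One sample in Graphite plaintext format: 'path value timestamp\\n'."""
    return f"{path} {value} {ts_seconds}\n"


def _escape(text: str, chars: str) -> str:
    """Backslash-escape the given special characters."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def influx_line(measurement: str, tags: dict[str, str],
                fields: dict[str, float], ts_ns: int) -> str:
    """One point in Influx line protocol (float fields only in this sketch)."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    head = _escape(measurement, ", ")
    # Tags are optional; sorting by key keeps the output deterministic.
    for key in sorted(tags):
        head += f",{_escape(key, ',= ')}={_escape(tags[key], ',= ')}"
    body = ",".join(
        f"{_escape(key, ',= ')}={float(value)}" for key, value in fields.items()
    )
    return f"{head} {body} {ts_ns}"
```

For example, `graphite_line("servers.web1.cpu.load", 0.64, 1624440480)` and `influx_line("cpu", {"host": "web1"}, {"load": 0.64}, 1624440480000000000)` describe the same sample; Graphite encodes the dimensions in the dotted path, while the line protocol carries them as tags.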
        
       | mikojan wrote:
       | InfluxDB Cloud might be the single best solution if what you want
       | is a managed service that just works somehow.
       | 
       | As a time series database however it is rather painful.
       | 
       | You can't have InfluxDB only. It comes integrated with the
       | complete ecosystem. For example, it exposes an app/web server by
       | default which you can't disable.
       | 
       | Flux DSL is decent but so was SQL and I already knew SQL.
       | 
       | Shipping a database does not appear to be topping their priority
       | list and it's kind of irritating. A year into the InfluxDB 2
       | release you weren't able to delete data. Now you can, but you
       | can't be sure when that's going to actually happen. And there's
       | no way of monitoring/managing currently running queries. Add to
       | that its massive memory consumption and you're left wondering
       | what other database solution would get away with this.
       | 
       | For personal projects I have since moved to IoTDB.
        
       ___________________________________________________________________
       (page generated 2021-06-23 23:03 UTC)