https://grafana.com/blog/2024/09/12/opentelemetry-and-vendor-neutrality-how-to-build-an-observability-strategy-with-maximum-flexibility/
OpenTelemetry and vendor neutrality: how to build an observability strategy with maximum flexibility

David Allen, Juraci Paixao Krohling * 2024-09-12 * 9 min

---------------------------------------------------------------------

One of the biggest advantages of the OpenTelemetry project is its vendor neutrality -- something that many community members appreciate, especially if they've spent huge amounts of time migrating from one commercial vendor to another. Vendor neutrality also happens to be a core element of our big tent philosophy here at Grafana Labs.

We realize, however, that this neutrality can have its limits when it comes to real-world use cases. In this post, we'll explain vendor neutrality in the context of the open source OpenTelemetry project, including where it starts and ends, and how you can take advantage of it.

Three key layers of telemetry

Before we dive deeper into the concept of vendor neutrality, let's look at the bigger picture as it relates to telemetry and the OpenTelemetry project. In general, telemetry approaches involve three layers:

1. The apps and infrastructure: This is the source of the telemetry -- meaning, the actual thing we will be observing.
2. The telemetry collector: This is software that collects, filters, processes, and forwards telemetry signals.
3. The telemetry backend: This can be broken down into two further layers, storage and exploitation, the latter being how we extract value out of telemetry signals. For example, at the exploitation layer, an SRE looks at a dashboard and determines that a particular service needs investigation.

[Diagram: the three layers of telemetry]

You can get quite complex with all three layers, particularly the collector layer, but telemetry solutions usually have something in each category. This is a useful model to help ground what vendor neutrality means in this post, and how you achieve it.
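To make the middle layer concrete, here's a minimal OpenTelemetry Collector configuration sketch that touches all three layers: apps and infrastructure send OTLP into the receiver, the collector processes the signals, and an exporter forwards them to whichever backend you choose. The endpoint below is a placeholder, not a real destination.

```yaml
# Minimal Collector pipeline spanning the three layers of telemetry.
receivers:
  otlp:                  # layer 1 boundary: apps and infrastructure send OTLP here
    protocols:
      grpc:
      http:

processors:
  batch:                 # layer 2: collect, filter, and process signals

exporters:
  otlphttp:              # layer 3 boundary: forward to a backend of your choosing
    endpoint: https://otlp.example.com  # placeholder backend endpoint

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]
```

Because the receiver and exporter both speak OTLP, neither the apps upstream nor the backend downstream dictates the other's choice -- which is the whole point of the layered model.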
OTel's vision for vendor neutrality

OTel is very appealing to folks coming from commercial vendors because they've often been burned by the costs of switching away from proprietary instrumentation, agents, and/or collectors. To quote the OpenTelemetry mission, vision, and values page (bolded emphasis is ours):

> For decades, **proprietary drop-in agents** from monitoring and observability vendors have been the primary source for useful telemetry from across the application stack. Unfortunately, the lack of common standards or APIs across these agents has led to vendor lock-in for customers, and inhibited innovation by **tightly coupling telemetry collection with telemetry storage and analysis**. With OpenTelemetry, we strive to provide a level playing field for all observability providers, avoid lock-in to any vendor, and interoperate with other OSS projects in the telemetry and observability ecosystem.

There are two key points to draw out of this:

1. Tight coupling between telemetry collection, storage, and analysis is a bad idea (more on this below).
2. If you use a proprietary drop-in agent that does not conform to standards as a way of gathering telemetry, you will be locked into that vendor. This is because you cannot have a telemetry system without the data, and that vendor is the one way to get that data. That's vendor lock-in (the opposite of vendor neutrality), in OTel terms.

Let's take a deeper look at both of these points.

Loose coupling

Tight coupling is something we avoid in all software architecture work, because when system components (such as telemetry collection and exploitation) are tightly coupled, your flexibility to choose different options is limited. In other words, tight coupling facilitates lock-in.

OpenTelemetry, instead, facilitates "loose coupling," because the standards and software you use for collection are independent of your backend. That's a clear win. The way OTel components are built is also loosely coupled: you can use the SDKs without the Collector, or a language-specific instrumentation API with a different SDK implementation. You can mix and match the various technical "pieces" of OpenTelemetry (the SDK, API, OTLP, and Collector) with other implementations.

Open standards

The second point is more complex, but just as important. A "proprietary drop-in agent" is something that represents and transmits telemetry data in a format only one vendor understands. This is where open standards really shine: a community agrees on the semantic conventions and on the wire representation (OTLP), and any open source telemetry collector that supports both no longer exhibits proprietary lock-in. Two things to note here:

1. Open source means anyone can contribute, modify, or fork; it cannot be owned by one company.
2. Standards-compliant means the telemetry collector "speaks the agreed-upon language" of OpenTelemetry semantics and OTLP.

The OTel Collector is the first and primary telemetry collector that fits this bill, serving as a reference implementation. Vendor lock-in is the opposite of those two facets: if you can't fork or modify the collector, and if the collector doesn't speak an open standard as its language, then you'll be locked into using that vendor's tools.
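To see both ideas in code, here's a minimal Go sketch (the service name and endpoint are placeholder assumptions, not anything from this post) that wires an OTLP exporter into the OpenTelemetry SDK and tags the telemetry with semantic-convention resource attributes. Swapping backends means changing the exporter's endpoint; the instrumentation itself doesn't change.

```go
package main

import (
	"context"
	"log"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	"go.opentelemetry.io/otel/sdk/resource"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
	semconv "go.opentelemetry.io/otel/semconv/v1.24.0"
)

func main() {
	ctx := context.Background()

	// The exporter speaks OTLP -- the open wire format. Point it at any
	// OTLP-compatible collector or backend; nothing else below changes.
	exp, err := otlptracegrpc.New(ctx,
		otlptracegrpc.WithEndpoint("localhost:4317"), // placeholder collector address
		otlptracegrpc.WithInsecure(),
	)
	if err != nil {
		log.Fatal(err)
	}

	// Semantic conventions keep the data meaningful across backends.
	res, err := resource.New(ctx,
		resource.WithAttributes(semconv.ServiceName("checkout")), // hypothetical service
	)
	if err != nil {
		log.Fatal(err)
	}

	// The SDK is decoupled from the exporter: collection on one side,
	// storage and analysis on the other.
	tp := sdktrace.NewTracerProvider(
		sdktrace.WithBatcher(exp),
		sdktrace.WithResource(res),
	)
	defer tp.Shutdown(ctx)
	otel.SetTracerProvider(tp)

	_, span := otel.Tracer("example").Start(ctx, "do-work")
	span.End()
}
```

Note how the only backend-specific detail in the whole program is a single endpoint string -- the loose coupling described above, expressed in about forty lines.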
Telemetry backends

So far, we have been talking about sourcing, collecting, and processing telemetry. What about the backend? In any architecture, the data eventually comes out of an OpenTelemetry-compatible collector and goes into a database somewhere. That database might be Grafana Loki or Mimir, or it could be Prometheus, Elasticsearch, or something else. This is where vendor neutrality starts to look different.

[Diagram: how telemetry backends work]

At this point, it's no longer about telemetry data being sourced and flowing on the wire -- it's about telemetry data at rest in a database. To utilize that telemetry, we're going to have to pull it out of the database, and slice and dice it with queries. No matter which database you choose, what matters most is its schema and query language. The database will likely respect the semantic conventions, but you don't query databases with OTLP.

Next, you'll build some set of dashboards on top of the queries from that database, and you might build alerts and other things, regardless of the vendor you choose. When you do this, you're out of the OpenTelemetry world, and unavoidably building product-specific resources. If you use Grafana Cloud, even if you're feeding it with OTel signals, it may not be obvious how to migrate those usage patterns to a different backend -- and the same is true for all other observability vendors. This is because there are no industry standards for things like dashboards and telemetry databases to adhere to. And, as stated on the "What is OpenTelemetry?" page, telemetry storage and visualization are beyond the scope of the OTel project (bolded emphasis, again, is ours):

> OpenTelemetry is not an observability backend like Jaeger, Prometheus, or other commercial vendors. OpenTelemetry is focused on the generation, collection, management, and export of telemetry. A major goal of OpenTelemetry is that you can easily instrument your applications or systems, no matter their language, infrastructure, or runtime environment. Crucially, **the storage and visualization of telemetry is intentionally left to other tools**.
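As an illustration of how backend-specific that query layer is, consider how you might ask two different stores about the same OTel-instrumented service. The metric and attribute names below follow the semantic conventions, but the queries themselves are hypothetical sketches:

```
# PromQL (Prometheus/Mimir): p95 latency from a semconv request-duration histogram
histogram_quantile(0.95,
  sum(rate(http_server_request_duration_seconds_bucket[5m])) by (le))

# TraceQL (Tempo): traces from the same hypothetical service with server errors
{ resource.service.name = "checkout" && span.http.response.status_code >= 500 }
```

The point isn't the specific syntax; it's that each query is an investment in one backend's data model, which OTLP on the wire can't make portable.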
Vendor neutrality from different angles

Implementation effort creates sunk cost, which is a factor in how lock-in happens. In this sense, no technology can guarantee that you avoid vendor lock-in completely. OpenTelemetry is about decoupling and providing options in the app/infrastructure layer and the collector layer (the first two layers of telemetry); the third layer, telemetry backends, is where most teams put in the effort to customize their view. When they do, they have an investment that can be hard to migrate later. It's like picking a programming language: if you write an app in Go, we wouldn't say you're "locked in" to Go, but in choosing it, you've made it harder to decide to rewrite in Rust later.

What you get out of telemetry backends comes from your investment: you build visualizations that satisfy your team's needs. Grafana helps people avoid lock-in by enabling them to build dashboards on top of any database they'd like; you don't have to use Loki, Mimir, and Tempo. Still, the fundamental point remains: implementation effort with any vendor creates customizations, which later limit how easily you can switch. This is why OpenTelemetry cannot create vendor neutrality at the backend layer -- a common source of confusion for enterprises implementing it. OpenTelemetry is great, but if you're architecting a solution, it's important to understand these nuances around vendor neutrality.

How to maximize your options

Start with apps and traces

OpenTelemetry has its roots in tracing and applications, so it is strongest in cases where you're instrumenting apps that need traces. Using an OpenTelemetry SDK and OpenTelemetry-compatible collectors will decouple telemetry collection from storage and exploitation. This helps future-proof your architecture and leaves you with maximum options at the backend layer.

Adopt in layers

We previously mentioned that OpenTelemetry components (SDKs, APIs, OTLP) are all loosely coupled, which means you can pick and choose, adopting "layers" of OpenTelemetry as you go. There is simply no need to jump in all at once; layered adoption builds familiarity and mitigates risk.

Standardizing at the protocol level on OTLP, though, might be a good first move, and it can be done with any open source telemetry collector that supports OTLP. This will let you pick and choose which language-specific APIs and SDKs make the most sense for your team. As long as they can write OTLP to a remote destination, you are in good shape. The world can be messy, though, and even if you can't write OTLP directly from the source, you can often get your collector to emit OTLP regardless of how the telemetry was gathered from the app. Within the app, if you choose to use non-OTel APIs for instrumentation, give preference to those that adhere to the semantic conventions, so you can benefit from tooling built elsewhere in the ecosystem, such as Grafana Cloud Application Observability.

Ensure reusable instrumentation

If you're evaluating different backends, you can use the same instrumentation setup to push telemetry signals to two or more backends. This way, you can run a "bake-off" and evaluate which one is a better fit; a sketch of this pattern follows below. Even though you will invest effort to build the solution you need, you still get an extra point of flexibility: upgrades, maintenance, and changes in the telemetry layers remain decoupled, so you can change your collection approach and your backend approach freely, without having to rebuild the entire system.
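Here is what that bake-off might look like as a Collector configuration sketch. The receiver side also shows the "emit OTLP regardless of how telemetry was gathered" idea from the previous section: a Prometheus scrape comes in one side, and OTLP fans out to two hypothetical backends on the other. All endpoints and names are placeholders.

```yaml
receivers:
  otlp:                       # apps that already speak OTLP
    protocols:
      grpc:
  prometheus:                 # non-OTLP source, normalized to OTLP on the way out
    config:
      scrape_configs:
        - job_name: legacy-app              # placeholder scrape target
          static_configs:
            - targets: ["localhost:9100"]

exporters:
  otlphttp/backend-a:         # candidate backend #1 (placeholder endpoint)
    endpoint: https://backend-a.example.com
  otlphttp/backend-b:         # candidate backend #2 (placeholder endpoint)
    endpoint: https://backend-b.example.com

service:
  pipelines:
    metrics:
      receivers: [otlp, prometheus]
      exporters: [otlphttp/backend-a, otlphttp/backend-b]  # same data, two backends
```

Because both backends receive identical data, the comparison becomes one of query ergonomics, dashboards, and cost -- exactly the backend-layer investments discussed above.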
Learn more

OpenTelemetry's original vision was to break out of the tight coupling found in older observability architectures. It also aimed to provide alternatives to the proprietary drop-in agents that made switching telemetry backends difficult. The open source project has delivered on both points, and in this post, our goal was to provide some common-sense guidelines on how you can take advantage of that.

To learn more, check out our OpenTelemetry best practices post, as well as our post about recent milestones in the OpenTelemetry project and community.

Tags: OpenTelemetry