[HN Gopher] OpenTelemetry for Go: Measuring overhead costs
___________________________________________________________________
OpenTelemetry for Go: Measuring overhead costs
Author : openWrangler
Score : 80 points
Date : 2025-06-16 15:09 UTC (7 hours ago)
(HTM) web link (coroot.com)
(TXT) w3m dump (coroot.com)
| dmoy wrote:
| Not on original topic, but:
|
| I definitely prefer graphs that put the unit at least on the
| axis label, if not on each individual tick label directly.
|
| I.e. instead of having a graph titled "latency, seconds" at the
| top and then way over on the left have an unlabeled axis with
| "5m, 10m, 15m, 20m" ticks...
|
| I'd rather have title "latency" and either "seconds" on the left,
| or, given the confusion between "5m = 5 minutes" or "5m = 5
| milli[seconds]", just have it explicitly labeled on each tick:
| 5ms, 10ms, ...
|
| Way, way less likely to confuse someone when the units are right
| on the number, instead of floating way over in a different
| section of the graph.
| Thaxll wrote:
| Logging, metrics and traces are not free, especially if you turn
| them on for every request.
|
| Tracing every HTTP 200 at 10k req/sec is not something you should
| be doing; at that rate you should sample the 200s (1% or so) and
| trace all the errors.
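|
| A rough sketch of the head-sampling half of that policy with the
| stock OTel Go SDK samplers (the "trace all the errors" half can't
| be decided at the head, since the status isn't known yet; that
| part needs a tail-sampling step, which comes up further down the
| thread):
|
|     package main
|
|     import (
|         "go.opentelemetry.io/otel"
|         sdktrace "go.opentelemetry.io/otel/sdk/trace"
|     )
|
|     // Keep roughly 1% of root traces; child spans follow their
|     // parent's decision so sampled traces stay complete.
|     func setupTracing() *sdktrace.TracerProvider {
|         tp := sdktrace.NewTracerProvider(
|             sdktrace.WithSampler(
|                 sdktrace.ParentBased(sdktrace.TraceIDRatioBased(0.01)),
|             ),
|         )
|         otel.SetTracerProvider(tp)
|         return tp
|     }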
| anonzzzies wrote:
| A very small % of startups gets anywhere near that traffic, so
| why give them angst? Most people can just do this without any
| issues and learn from it, and a tiny fraction shouldn't.
| williamdclt wrote:
| 10k/s across multiple services is reached quickly even at
| startup scale.
|
| In my previous company (startup), we'd use Otel everywhere
| and we definitely needed sampling for cost reasons (1/30
| iirc). And that was using a much cheaper provider than
| Datadog
| cogman10 wrote:
| Having high req/s isn't as big a negative as it once was.
| Especially if you are using http2 or http3.
|
| Designing APIs that issue a high number of requests and return
| a small amount of data each can be quite legitimate. It allows
| for better scaling and capacity planning than single calls
| that take a long time and return large amounts of data.
|
| In the old HTTP/1 days it was a bad thing because a single
| connection could only service one request at a time. Getting
| any sort of concurrency or high request rate required many
| connections (which carried a lot of overhead due to the way
| TCP works).
|
| We've moved past that.
| orochimaaru wrote:
| Metrics are usually minimal overhead. Traces need to be
| sampled. Logs need to be sampled at error/critical levels. You
| also need to be able to dynamically change sampling and log
| levels.
|
| 100% traces are a mess. I didn't see where he set up sampling.
| phillipcarter wrote:
| The post didn't cover sampling, which indeed significantly
| reduces overhead in OTel: when you head-sample at the SDK
| level, spans that aren't sampled are never created. Overhead
| is more of a concern when doing tail-based sampling only,
| where you want to trace every request and offload export to a
| sidecar so that export concerns are handled outside your app;
| the sidecar then routes to a sampler elsewhere in your
| infrastructure.
|
| FWIW at my former employer we had some fairly loose
| guidelines for folks around sampling:
| https://docs.honeycomb.io/manage-data-
| volume/sample/guidelin...
|
| There are outliers, but the general idea is that there's also
| a high cost to implementing sampling (especially for
| nontrivial setups), and if your volume isn't terribly high,
| you'll probably spend more in engineering time than you would
| pay for the extra data you may not strictly need.
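|
| For the Go SDK, pointing exports at a local sidecar/collector
| (which then forwards to the tail sampler) looks roughly like
| this, assuming the OTLP gRPC exporter; the endpoint is
| illustrative:
|
|     package main
|
|     import (
|         "context"
|
|         "go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
|         sdktrace "go.opentelemetry.io/otel/sdk/trace"
|     )
|
|     func newTracerProvider(ctx context.Context) (*sdktrace.TracerProvider, error) {
|         // Export to a collector running next to the app; batching
|         // keeps export work off the request path.
|         exp, err := otlptracegrpc.New(ctx,
|             otlptracegrpc.WithEndpoint("localhost:4317"),
|             otlptracegrpc.WithInsecure(),
|         )
|         if err != nil {
|             return nil, err
|         }
|         return sdktrace.NewTracerProvider(sdktrace.WithBatcher(exp)), nil
|     }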
| jhoechtl wrote:
| I am relatively new to the topic. In the OP's sample code
| there is no logging, right? It's metrics and traces but no
| logging.
|
| How does logging work in OTel?
| shanemhansen wrote:
| To me traces (or maybe more specifically spans) are
| essentially structured log entries with a unique ID and a
| reference to a parent ID.
|
| Very open to having someone explain why I'm wrong or why they
| should be handled separately.
| kiitos wrote:
| Traces have a very specific data model, and corresponding
| limitations, which don't really accommodate log
| events/messages of arbitrary size. The access model for
| traces is also fundamentally different vs. that of logs.
| phillipcarter wrote:
| There are practical limitations mostly with backend
| analysis tools. OTel does not define a limit on how large
| a span is. It's quite common in LLM Observability to
| capture full prompts and LLM responses as attributes on
| spans, for example.
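|
| Roughly what that looks like with the Go API; the attribute
| keys and the helper below are made up for illustration, not
| an official semantic convention:
|
|     package llmtrace
|
|     import (
|         "context"
|
|         "go.opentelemetry.io/otel"
|         "go.opentelemetry.io/otel/attribute"
|     )
|
|     // recordLLMCall attaches the full prompt and response to a
|     // span as (potentially large) string attributes.
|     func recordLLMCall(ctx context.Context, prompt, response string) {
|         _, span := otel.Tracer("llm").Start(ctx, "llm.call")
|         defer span.End()
|         span.SetAttributes(
|             attribute.String("llm.prompt", prompt),
|             attribute.String("llm.response", response),
|         )
|     }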
| phillipcarter wrote:
| Logging in OTel is logging with your logging framework of
| choice. The SDK just requires you to initialize the wrapper;
| it then wraps your existing logging calls and correlates
| them with the trace/span in the active context, if one
| exists. There is no separate logging API to learn. Logs are
| exported in a separate pipeline from traces and metrics.
|
| Implementations for many languages are starting to mature,
| too.
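|
| In Go, for example, the contrib slog bridge is about as thin
| as a wrapper gets. A minimal sketch, assuming the
| go.opentelemetry.io/contrib/bridges/otelslog package (the
| LoggerProvider/exporter setup is omitted, and the logger name
| is illustrative):
|
|     package main
|
|     import (
|         "context"
|         "log/slog"
|
|         "go.opentelemetry.io/contrib/bridges/otelslog"
|     )
|
|     func main() {
|         ctx := context.Background()
|         // Records go through the globally registered OTel
|         // LoggerProvider and pick up the active trace/span IDs
|         // from ctx.
|         logger := otelslog.NewLogger("checkout")
|         logger.InfoContext(ctx, "order placed",
|             slog.String("order_id", "42"))
|     }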
| kubectl_h wrote:
| You have to do the tracing anyway if you are going to tail
| sample based on criteria that aren't available at the
| beginning of the trace (like an error that occurs later in
| the request). You can head sample, of course, but that's the
| coarsest sampling you can do, and you can't sample on
| anything but the initial conditions of the trace.
|
| What we have started doing is still tracing every unit of work,
| but deciding at the root span the level of instrumentation
| fidelity we want for the trace based on the initial conditions.
| Spans are still generated in the lifecycle of the trace, but we
| discard them at the processor level (before they are batched
| and sent to the collector) unless they have errors on them or
| the trace has been marked as "full fidelity".
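|
| A sketch of that discard-before-batching idea as a wrapping
| SpanProcessor in the Go SDK (the "full fidelity" marker would
| be an extra check in OnEnd, e.g. on a span attribute; only the
| error check is shown):
|
|     package tracing
|
|     import (
|         "context"
|
|         "go.opentelemetry.io/otel/codes"
|         sdktrace "go.opentelemetry.io/otel/sdk/trace"
|     )
|
|     // errorOnlyProcessor wraps another processor (typically the
|     // batch processor) and only forwards spans that ended with
|     // an error, so everything else is dropped before batching.
|     type errorOnlyProcessor struct {
|         next sdktrace.SpanProcessor
|     }
|
|     func (p errorOnlyProcessor) OnStart(
|         ctx context.Context, s sdktrace.ReadWriteSpan,
|     ) {
|         p.next.OnStart(ctx, s)
|     }
|
|     func (p errorOnlyProcessor) OnEnd(s sdktrace.ReadOnlySpan) {
|         if s.Status().Code == codes.Error {
|             p.next.OnEnd(s)
|         }
|     }
|
|     func (p errorOnlyProcessor) Shutdown(ctx context.Context) error {
|         return p.next.Shutdown(ctx)
|     }
|
|     func (p errorOnlyProcessor) ForceFlush(ctx context.Context) error {
|         return p.next.ForceFlush(ctx)
|     }
|
| It would be registered with something like
| sdktrace.WithSpanProcessor(errorOnlyProcessor{next:
| sdktrace.NewBatchSpanProcessor(exporter)}).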
| kiitos wrote:
| > Tracing every http 200 at 10k req/sec is not something you
| should be doing
|
| You don't know if a request is HTTP 200 or HTTP 500 until it
| ends, so you have to at least _collect_ trace data for every
| request as it executes. You can decide whether or not to _emit_
| trace data for a request based on its ultimate response code,
| but _emission_ is gonna be out-of-band of the request
| lifecycle, and (in any reasonable implementation) amortized
| such that you really shouldn't need to care about sampling
| based on outcome. That is, the cost of collection is >> the
| cost of emission.
|
| If your tracing system can't handle 100% of your traffic,
| that's a problem in that system; it's definitely not any kind
| of universal truth...!
| jeffbee wrote:
| I feel like this is a lesson that unfortunately did not escape
| Google, even though a lot of these open systems came from Google
| or ex-Googlers. The overhead of tracing, logs, and metrics needs
| to be ultra-low. But the (mis)feature whereby a trace span can be
| sampled _post hoc_ means that you cannot have a nil tracer that
| does nothing on unsampled traces, because it could become sampled
| later. And the idea that if a metric exists it must be centrally
| collected is totally preposterous; it makes everything far too
| expensive when all a developer wants is a metric that costs
| nothing in the steady state but can be collected when needed.
| mamidon wrote:
| How would you handle the case where you want to trace 100% of
| errors? Presumably you don't know a trace is an error until
| after you've executed the thing and paid the price.
| phillipcarter wrote:
| This is correct. It's a seemingly simple desire -- "always
| capture whenever there's a request with an error!" -- but the
| overhead needed to set that up gets complex. And then you
| start heading down the path of "well THESE business
| conditions are more important than THOSE business
| conditions!" and before you know it, you've got a nice little
| tower of sampling cards assembled. It's still worth it, just
| a hefty tax at times, and often the right solution is to just
| pay for more compute and data so that your engineers are
| spending less time on these meta-level concerns.
| jeffbee wrote:
| I wouldn't. "Trace contains an error" is a hideously bad
| criterion for sampling. If you have some storage subsystem
| where you always hedge/race reads to two replicas then cancel
| the request of the losing replica, then all of your traces
| will contain an error. It is a genuinely terrible feature.
|
| Local logging of error conditions is the way to go. And I
| mean local, not to a central, indexed log search engine;
| that's also way too expensive.
| phillipcarter wrote:
| I disagree that it's a bad criterion. The case you describe
| is what sounds difficult, treating one error as part of
| normal operations and another as not. That should be
| considered its own kind of error or other form of response,
| and sampling decisions could take that into consideration
| (or not).
| jeffbee wrote:
| Another reason against inflating sampling rates on errors
| is: for system stability you never want to do more stuff
| during errors than you would normally do. Doing something
| more expensive during an error can cause your whole
| system, or elements of it, to latch into an unplanned
| operating point where they only have the capacity to do
| the expensive error path, and all of the traffic is
| throwing errors because of the resource starvation.
| vanschelven wrote:
| The article never really explains what eBPF is -- AFAIU, it's a
| kernel feature that lets you trace syscalls and network events
| without touching your app code. Low overhead, good for metrics,
| but not exactly transparent.
|
| It's the umpteenth OTEL-critical article on the front page of HN
| this month alone... I have to say I share the sentiment, but
| probably for different reasons. My take is quite the opposite:
| most of the value is precisely at the application (code) level,
| so you definitely should instrument... and then focus on Errors
| over "general observability"[0]
|
| [0] https://www.bugsink.com/blog/track-errors-first/
| nikolay_sivko wrote:
| I'm the author. I wouldn't say the post is critical of OTEL. I
| just wanted to measure the overhead, that's all. Benchmarks
| shouldn't be seen as critique. Quite the opposite: we can only
| improve things if we've measured them first.
| politician wrote:
| I don't want to take away from your point, and yet... if anyone
| lacks background knowledge these days the relevant context is
| just an LLM prompt away.
| vanschelven wrote:
| It was always "a search away" but on the _web_ one might as
| well use... A hyperlink
| sa46 wrote:
| Funny timing--I tried optimizing the Otel Go SDK a few weeks ago
| (https://github.com/open-telemetry/opentelemetry-
| go/issues/67...).
|
| I suspect you could make the tracing SDK 2x faster with some
| cleverness. The main tricks are:
|
| - Use a faster time.Now(). Go does a fair bit of work to convert
| to the Go epoch.
|
| - Use atomics instead of a mutex. I sent a PR, but the reviewer
| caught correctness issues. Atomics are subtle and tricky.
|
| - Directly marshal protos, either with a hand-rolled library or
| with https://github.com/VictoriaMetrics/easyproto, instead of
| going through reflection.
|
| The gold standard is how TiDB implemented tracing
| (https://www.pingcap.com/blog/how-we-trace-a-kv-database-
| with...). Since Go purposefully (and reasonably) doesn't
| currently provide a comparable abstraction for thread-local
| storage, we can't implement similar tricks like special-casing
| when a trace is modified on a single thread.
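|
| For anyone who wants to poke at the per-span cost locally
| (timestamps, lock contention, allocations), a micro-benchmark
| against the SDK tracer is a quick starting point. Just a
| sketch; run with "go test -bench . -benchmem -cpuprofile
| cpu.out" and read the profile:
|
|     package tracing_test
|
|     import (
|         "context"
|         "testing"
|
|         sdktrace "go.opentelemetry.io/otel/sdk/trace"
|     )
|
|     func BenchmarkSpanStartEnd(b *testing.B) {
|         // No exporter registered: this isolates span creation
|         // (time.Now, attribute storage, locking) from export.
|         tp := sdktrace.NewTracerProvider(
|             sdktrace.WithSampler(sdktrace.AlwaysSample()),
|         )
|         tracer := tp.Tracer("bench")
|         ctx := context.Background()
|
|         b.ReportAllocs()
|         for i := 0; i < b.N; i++ {
|             _, span := tracer.Start(ctx, "op")
|             span.End()
|         }
|     }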
| malkia wrote:
| There is an effort to use the Arrow format for metrics too -
| https://github.com/open-telemetry/otel-arrow - but no client
| that exports directly to it yet.
| rastignack wrote:
| Would the sync.Pool trick mentioned here:
| https://hypermode.com/blog/introducing-ristretto-high-perf-g...
| help? It's lossy but might be a good compromise.
| otterley wrote:
| Out of curiosity, does Go's built-in pprof yield different
| results?
|
| The nice thing about Go is that you don't need an eBPF module to
| get decent profiling.
|
| Also, CPU and memory instrumentation is built into the Linux
| kernel already.
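|
| For reference, the built-in route is just net/http/pprof plus
| go tool pprof; a minimal sketch (the port is arbitrary):
|
|     package main
|
|     import (
|         "log"
|         "net/http"
|         _ "net/http/pprof" // registers /debug/pprof/* on the default mux
|     )
|
|     func main() {
|         // Then, for example:
|         //   go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30
|         log.Println(http.ListenAndServe("localhost:6060", nil))
|     }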
| coxley wrote:
| The OTel SDK has always been much worse to use than Prometheus
| for metrics -- including higher overhead. I prefer to only use it
| for tracing for that reason.
| reactordev wrote:
| Mmmmmmm, the last 8 months of my life wrapped into a blog post
| but with an ad on the end. Excellent. Basically the same findings
| as me, my team, and everyone else in the space.
|
| Not being sarcastic at all, it's tricky. I like that the article
| called out eBPF and why you would want to disable it for speed
| but recommends caution. I kept hearing "single pane of glass"
| marketing speak from executives, and I kept my mouth shut about
| how that isn't feasible across an entire organization. Needless
| to say, they didn't like that non-answer, so I was canned. What
| an engineer cares about is different from organization/business
| metrics, and the two were often confused.
|
| I wrote a lot of great OTel receivers though: VMware, Veracode,
| HashiCorp Vault, GitLab, Jenkins, Jira, and the platform itself.
| phillipcarter wrote:
| > I kept hearing from executives a "single pane of glass"
| marketing speak
|
| It's really unfortunate that observability vendors lean into
| this and reinforce it too. What the execs usually care about is
| consolidating engineering workflows and letting teams all
| "speak the same language" in terms of data, analysis workflows,
| visualizations, runbooks, etc.
|
| This goal is admirable, but nearly impossible to achieve
| because it's the exact same problem as solving "we are aligned
| organizationally", which no organization ever is.
|
| That doesn't mean progress can't be made, but it's always far
| more complicated than they would like.
| reactordev wrote:
| For sure, it's the ultimate nirvana. Let me know when an
| organization gets there. :)
___________________________________________________________________
(page generated 2025-06-16 23:00 UTC)