[HN Gopher] OpenTelemetry protocol with Apache Arrow
___________________________________________________________________
OpenTelemetry protocol with Apache Arrow
Author : tanelpoder
Score : 48 points
Date : 2025-05-13 17:57 UTC (5 hours ago)
(HTM) web link (opentelemetry.io)
(TXT) w3m dump (opentelemetry.io)
| andygrove wrote:
| I've just started exploring adding OpenTelemetry support to the
| Comet subproject of DataFusion. I'm excited to see the
| integration with Apache Arrow (Rust) and potentially DataFusion
| in the future.
| SomaticPirate wrote:
| Wow, anyone able to provide an ELI5? OTel sounds amazing but this
| is flying over my head
| theLiminator wrote:
| Not sure, but it seems like it will produce Apache Arrow data
| and carry it across the data stack end to end from OTel.
| This would be great for creating data without a bunch of
| duplication/redundant processing steps and exporting it in a
| form that's ready to query.
| piterrro wrote:
| Unless I don't understand that fully (which could be the
| case).
|
| This idea could fly if downstream readers are able to
| read it. JSON is great because anything can read it, process,
| transform and serialize without having to know the intrinsics
| of the protocol.
|
| What's the point of using a binary, columnar format for data in
| transit?
| arccy wrote:
| better compression:
| https://opentelemetry.io/blog/2023/otel-arrow/
|
| You don't do high performance without knowing the data
| schema.
| odie5533 wrote:
| Is Arrow better than Parquet or Protobuf?
| theLiminator wrote:
| Arrow is an in-memory columnar format, kinda orthogonal
| to parquet (which is an at-rest format). Protobuf is a
| better comparison, but it's more message oriented and not
| suited for analytics.
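To make the row-versus-columnar distinction above concrete, here is a toy sketch in plain Python. It does not use the Arrow library and is not the OTel-Arrow encoding; the record fields (`service`, `status`, `duration_ms`) and all function names are hypothetical, chosen only for illustration. It pivots message-style records into one list per field and reports the raw and zlib-compressed size of each layout when serialized as JSON.

```python
import json
import zlib


def make_rows(n):
    """Build n hypothetical span-like records (row-oriented: one dict per record)."""
    return [
        {"service": "checkout", "status": 200 if i % 10 else 500, "duration_ms": i % 50}
        for i in range(n)
    ]


def to_columns(rows):
    """Pivot row-oriented records into one list per field (column-oriented)."""
    if not rows:
        return {}
    return {key: [row[key] for row in rows] for key in rows[0]}


def to_rows(columns):
    """Inverse of to_columns: rebuild the list of per-record dicts."""
    keys = list(columns)
    return [dict(zip(keys, values)) for values in zip(*(columns[k] for k in keys))]


def encoded_sizes(obj):
    """Return (raw_bytes, zlib_compressed_bytes) for the JSON encoding of obj."""
    raw = json.dumps(obj).encode("utf-8")
    return len(raw), len(zlib.compress(raw))


rows = make_rows(1000)
columns = to_columns(rows)
print("row-oriented   :", encoded_sizes(rows))
print("column-oriented:", encoded_sizes(columns))
```

The columnar layout writes each field name once and keeps similar values next to each other, which is the general property that columnar formats exploit. Real Arrow batches use typed binary buffers rather than JSON, so the numbers printed here say nothing about actual OTLP or OTel-Arrow payload sizes.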
| arccy wrote:
| the blog post comparison is against OTLP which is
| protobuf
| phillipcarter wrote:
| Warning: this is an oversimplification.
|
| Performance optimization and being able to "plug in" to the
| data ecosystem that Apache Arrow exists in.
|
| OpenTelemetry is pretty great for a lot of uses, but the
| protocol over the wire is too chunky for some applications.
| From last year's post on the topic[0]:
|
| > In a side-by-side comparison between OpenTelemetry Protocol
| ("OTLP") and OpenTelemetry Protocol with Apache Arrow for
| similarly configured traces pipelines, we observe 30%
| improvement in compression. Although this study specifically
| focused on traces data, we have observed results for logs and
| metrics signals in production settings too, where OTel-Arrow
| users can expect 50% to 70% improvement relative to OTLP for
| similar pipeline configurations.
|
| For your average set of apps and services running in a k8s
| cluster somewhere in the cloud, this is just a nice-to-have,
| but size on wire is a problem for a lot of systems out there
| today, and they are precluded from adopting OpenTelemetry until
| that's solved.
|
| [0]: https://opentelemetry.io/blog/2024/otel-arrow-production/
| KAdot wrote:
| > We are interested in making OTAP pipelines safely embeddable,
| through strict controls on memory and through support for thread-
| per-core runtimes.
|
| I'm curious about the thread-per-core runtimes, are there even
| any mature thread-per-core runtimes in Rust around?
| jauntywundrkind wrote:
| glommio is pretty well respected.
| https://www.datadoghq.com/blog/engineering/introducing-glomm...
| https://github.com/DataDog/glommio
|
| ByteDance also has their very fast monoio.
| https://github.com/bytedance/monoio
|
| Both integrate io_uring support for very fast I/O.
| julian-datable wrote:
| Integrations with OTLP are critical to driving adoption and
| probably one of the biggest pain points we've encountered when
| adopting it ourselves (and encouraging others to do the same).
|
| Adopting OTLP without third-party support is pretty time
| consuming, especially if your tech stack is large and/or varied.
|
| Re runtimes: curious about this too. Feels like the right
| direction if you're optimizing a telemetry pipeline.
| akdor1154 wrote:
| Damn that's some scope creep if I ever saw it: 'try sending Arrow
| frames end to end' => 'rewrite the otel pipeline in rust'. Seems
| like the goals of the contributors don't exactly align with the
| goals of the project.
|
| Kind of a bummer - one thing I was hoping would come out of this
| was better Arrow ecosystem support for golang.
___________________________________________________________________
(page generated 2025-05-13 23:00 UTC)