[HN Gopher] OpenTelemetry protocol with Apache Arrow
       ___________________________________________________________________
        
       OpenTelemetry protocol with Apache Arrow
        
       Author : tanelpoder
       Score  : 48 points
       Date   : 2025-05-13 17:57 UTC (5 hours ago)
        
 (HTM) web link (opentelemetry.io)
 (TXT) w3m dump (opentelemetry.io)
        
       | andygrove wrote:
       | I've just started exploring adding OpenTelemetry support to the
       | Comet subproject of DataFusion. I'm excited to see the
       | integration with Apache Arrow (Rust) and potentially DataFusion
       | in the future.
        
       | SomaticPirate wrote:
       | Wow, anyone able to provide an ELI5? OTel sounds amazing but this
       | is flying over my head
        
         | theLiminator wrote:
         | Not sure, but seems like it will be producing Apache Arrow data
         | and carrying it across the data stack end to end from OTel.
         | This would be great for creating data without a bunch of
         | duplication/redundant processing steps and exporting it in a
         | form that's ready to query.
        
           | piterrro wrote:
           | Unless I don't understand that fully (which could be the
           | case).
           | 
           | This idea could fly if downstream readers are able to
           | read it. JSON is great because anything can read it, process,
           | transform and serialize without having to know the intrinsics
           | of the protocol.
           | 
           | What's the point of using a binary, columnar format for data in
           | transit?
        
             | arccy wrote:
             | better compression
             | https://opentelemetry.io/blog/2023/otel-arrow/
             | 
             | You don't do high performance without knowing the data
             | schema.
        
               | odie5533 wrote:
               | Is Arrow better than Parquet or Protobuf?
        
               | theLiminator wrote:
               | Arrow is an in-memory columnar format, kinda orthogonal
               | to Parquet (which is an at-rest format). Protobuf is a
               | better comparison, but it's more message oriented and not
               | suited for analytics.
        
               | arccy wrote:
               | the blog post comparison is against OTLP which is
               | protobuf
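
Editorial aside on the compression point in this subthread: a rough sketch of why a columnar layout tends to compress better than a record-per-message layout. This is not OTel-Arrow or OTLP; JSON and zlib stand in for the real encodings, and the field names, values, and record count are all invented for illustration.

```python
import json
import zlib

# Hypothetical telemetry-like records; field names and values are invented.
SERVICES = ["checkout", "cart", "search"]
rows = [
    {
        "service": SERVICES[i % 3],
        "status": 200 if i % 10 else 500,
        "duration_ms": (i * 7919) % 1000,
        "ts": 1_700_000_000 + i,
    }
    for i in range(5000)
]

# Row-oriented: one object per record, so every key is repeated per record.
row_bytes = json.dumps(rows).encode()

# Column-oriented: one list per field, so similar values sit side by side.
columns = {key: [r[key] for r in rows] for key in rows[0]}
col_bytes = json.dumps(columns).encode()

row_size = len(zlib.compress(row_bytes))
col_size = len(zlib.compress(col_bytes))
print(f"row-oriented:    {len(row_bytes)} raw, {row_size} compressed")
print(f"column-oriented: {len(col_bytes)} raw, {col_size} compressed")
```

Both layouts carry the same information, but the columnar one gives the compressor long runs of similar values. The actual savings reported for OTel-Arrow are the ones quoted from the blog posts in this thread, not anything this toy measures.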
        
         | phillipcarter wrote:
         | Warning: this is an oversimplification.
         | 
         | Performance optimization and being able to "plug in" to the
         | data ecosystem that Apache Arrow exists in.
         | 
         | OpenTelemetry is pretty great for a lot of uses, but the
         | protocol over the wire is too chunky for some applications.
         | From last year's post on the topic[0]:
         | 
         | > In a side-by-side comparison between OpenTelemetry Protocol
         | ("OTLP") and OpenTelemetry Protocol with Apache Arrow for
         | similarly configured traces pipelines, we observe 30%
         | improvement in compression. Although this study specifically
         | focused on traces data, we have observed results for logs and
         | metrics signals in production settings too, where OTel-Arrow
         | users can expect 50% to 70% improvement relative to OTLP for
         | similar pipeline configurations.
         | 
         | For your average set of apps and services running in a k8s
         | cluster somewhere in the cloud, this is just a nice-to-have,
         | but size on wire is a problem for a lot of systems out there
         | today, and they are precluded from adopting OpenTelemetry until
         | that's solved.
         | 
         | [0]: https://opentelemetry.io/blog/2024/otel-arrow-production/
        
       | KAdot wrote:
       | > We are interested in making OTAP pipelines safely embeddable,
       | through strict controls on memory and through support for thread-
       | per-core runtimes.
       | 
       | I'm curious about the thread-per-core runtimes, are there even
       | any mature thread-per-core runtimes in Rust around?
        
         | jauntywundrkind wrote:
         | glommio is pretty well respected.
         | https://www.datadoghq.com/blog/engineering/introducing-glomm...
         | https://github.com/DataDog/glommio
         | 
         | ByteDance also has their very fast monoio.
         | https://github.com/bytedance/monoio
         | 
         | Both integrate io_uring support for very fast I/O.
        
       | julian-datable wrote:
       | Integrations with OTLP are critical to driving adoption and
       | probably one of the biggest pain points we've encountered when
       | adopting it ourselves (and encouraging others to do the same).
       | 
       | Adopting OTLP without third-party support is pretty time
       | consuming, especially if your tech stack is large and/or varied.
       | 
       | Re runtimes: curious about this too. Feels like the right
       | direction if you're optimizing a telemetry pipeline.
        
       | akdor1154 wrote:
       | Damn that's some scope creep if I ever saw it: 'try sending Arrow
       | frames end to end' => 'rewrite the otel pipeline in Rust'. Seems
       | like the goals of the contributors don't exactly align with the
       | goals of the project.
       | 
       | Kind of a bummer - one thing I was hoping to come out of this was
       | better Arrow ecosystem support for golang.
        
       ___________________________________________________________________
       (page generated 2025-05-13 23:00 UTC)