[HN Gopher] Sparrow, a modern C++ implementation of the Apache A...
       ___________________________________________________________________
        
       Sparrow, a modern C++ implementation of the Apache Arrow columnar
       format
        
       Author : SylvainCorlay
       Score  : 61 points
       Date   : 2025-01-31 23:44 UTC (8 hours ago)
        
 (HTM) web link (johan-mabille.medium.com)
 (TXT) w3m dump (johan-mabille.medium.com)
        
       | amluto wrote:
       | This is supposed to be idiomatic?!?
       | 
       |     namespace sp = sparrow;
       |     sp::primitive_array<int> ar = { 1, 3, 5, 7, 9 };
       |     // Caution: get_arrow_structures returns pointers, not values
       |     auto [arrow_array, arrow_schema] =
       |         sp::get_arrow_structures(std::move(ar));
       |     // Use arrow_array and arrow_schema as you need (serialization,
       |     // passing it to a third party library)
       |     // ...
       |     // do NOT release the C structures in the end, the "ar" variable
       |     // will do it for you
       | 
       | I'm sorry, resources are kept alive by an object _that has been
       | moved from_?
        
         | juunpp wrote:
         | I think that comment is a copy-paste mistake. If you look at
         | the next code snippet, the comment actually makes sense there.
         | 
         | That being said, I've also given up on C++ and learn it mostly
         | to keep up with the job, if that's where you are coming from. I
         | don't find Rust to be a satisfying replacement, though. No
         | language scratches the itch for me right now.
        
         | jandrewrogers wrote:
         | Not speaking to the specific design choices here, but in C++
         | moved-from objects are not destroyed and must be valid in their
         | moved-from state (e.g. a sentinel value to indicate they've
         | been moved) so that they can be destroyed in the indefinite
         | future. This is useful even though "destroy on move" is the
         | correct semantics for most cases. Making "move" and "destroy"
         | distinct operations increases the flexibility and
         | expressiveness.
         | 
         | A common case where this is useful is if the address space
         | where the object lives is accessible, for read or write, by
         | references exogenous to the process like some kinds of shared
         | memory or hardware DMA. If the object is immediately destroyed
         | on move, _it implies the memory can be reused_ while things
         | your process can't control may not know you destroyed the
         | object. This is essentially a use-after-free bug factory. Being
         | able to defer destruction to a point in time when you can
         | guarantee this kind of UAF bug is not possible is valuable.
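
The "valid moved-from state" described above can be illustrated with a minimal sketch. The `buffer` class and the `move_leaves_valid_source` helper are hypothetical names invented for this example; they are not part of sparrow or Arrow. The sketch only assumes that a null pointer serves as the sentinel for "moved from".

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical owner of a heap buffer; not part of sparrow or Arrow.
class buffer
{
public:
    explicit buffer(std::size_t n) : data_(new int[n]()), size_(n) {}

    // Moving transfers ownership and leaves the source in an empty,
    // still-destructible state (nullptr acts as the sentinel).
    buffer(buffer&& other) noexcept
        : data_(std::exchange(other.data_, nullptr))
        , size_(std::exchange(other.size_, 0))
    {
    }

    buffer& operator=(buffer&& other) noexcept
    {
        if (this != &other)
        {
            delete[] data_;
            data_ = std::exchange(other.data_, nullptr);
            size_ = std::exchange(other.size_, 0);
        }
        return *this;
    }

    buffer(const buffer&) = delete;
    buffer& operator=(const buffer&) = delete;

    // Runs for moved-from objects too; delete[] on nullptr is a no-op.
    ~buffer() { delete[] data_; }

    bool empty() const { return data_ == nullptr; }
    std::size_t size() const { return size_; }

private:
    int* data_;
    std::size_t size_;
};

// Returns true if the moved-from source is empty and the target owns the data.
inline bool move_leaves_valid_source()
{
    buffer a(4);
    buffer b(std::move(a));  // a is moved from, but not destroyed yet
    return a.empty() && a.size() == 0 && b.size() == 4;
}   // both destructors run here; a's destructor has nothing to free
```

Calling `move_leaves_valid_source()` exercises the move and then lets both objects go out of scope, so the moved-from object's destructor runs separately from, and later than, the move itself.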
        
           | quietbritishjim wrote:
           | > Making "move" and "destroy" distinct operations increases
           | the flexibility and expressiveness.
           | 
           | No, it does not. It's an artefact of the evolution of the
           | language and highly undesirable. Rust has destructive moves
           | (and copies built on top of moves, rather than the other way
           | round) and it's far cleaner.
        
             | jandrewrogers wrote:
             | Sure, if you never need to deal with actual low-level high-
             | performance systems code. Just because this use case
             | doesn't apply to anything you do doesn't mean it applies to
             | nobody. This is the kind of attitude that undermines
             | languages like Rust (which I use in my systems). A fair
             | criticism of Rust as a "systems language" is that it simply
             | excludes all the really difficult parts of being a systems
             | language.
             | 
             | C++ deserves a lot of criticism. Many aspects of the
             | language are quite fucked. But willfully ignoring that it
             | solves real problems that other nominal systems languages
             | are unwilling to address doesn't mean those problems don't
             | exist.
        
         | senkora wrote:
         | This bothered me enough to check the source code, because I
         | simply had to know:
         | 
         |     template <layout_or_array A>
         |     std::pair<ArrowArray*, ArrowSchema*>
         |     get_arrow_structures(A& a)
         |     {
         |         arrow_proxy& proxy =
         |             detail::array_access::get_arrow_proxy(a);
         |         return std::make_pair(&(proxy.array()), &(proxy.schema()));
         |     }
         | 
         | https://github.com/man-group/sparrow/blob/c01a768f590ebf3b22...
         | 
         | So the answer is that the `std::move` does nothing and should
         | be omitted, because this function only has one overload, and
         | that overload takes its argument by lvalue-reference.
         | 
         | (And as far as I can tell,
         | `detail::array_access::get_arrow_proxy(a)` eventually just
         | reads a member on `a`, so there's no copying anywhere)
         | 
         | It's a harmless mistake; I'm just surprised it wasn't caught.
         | The authors seem pretty experienced with the language.
        
           | quietbritishjim wrote:
           | > So the answer is that the `std::move` does nothing and
           | should be omitted
           | 
           | You can't bind an rvalue to a non-const lvalue reference,
           | precisely to avoid this sort of mistake. If this is the only
           | overload then this just wouldn't compile (e.g. [1]). So the
           | std::move isn't doing nothing, but yes it should be omitted.
           | Maybe it's just a weird typo.
           | 
           | [1] https://ideone.com/4NS5dI
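
The binding rule in question can be checked at compile time. The sketch below uses a hypothetical function, `element_count`, whose only overload takes its argument by non-const lvalue reference, as the quoted signature does. The two detection traits, `accepts_lvalue` and `accepts_moved`, are likewise made up for this example. It is written against C++17 and does not use sparrow itself.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for a function whose only overload takes a
// non-const lvalue reference.
template <class A>
std::size_t element_count(A& a)
{
    return a.size();
}

// Detects whether element_count can be called with an lvalue of type T.
template <class T, class = void>
struct accepts_lvalue : std::false_type {};

template <class T>
struct accepts_lvalue<
    T, std::void_t<decltype(element_count(std::declval<T&>()))>>
    : std::true_type {};

// Detects whether element_count can be called with an rvalue of type T,
// which is what passing std::move(x) amounts to.
template <class T, class = void>
struct accepts_moved : std::false_type {};

template <class T>
struct accepts_moved<
    T, std::void_t<decltype(element_count(std::declval<T&&>()))>>
    : std::true_type {};

// An lvalue binds to A&; a non-const rvalue does not.
static_assert(accepts_lvalue<std::vector<int>>::value,
              "lvalue should bind to A&");
static_assert(!accepts_moved<std::vector<int>>::value,
              "non-const rvalue should not bind to A&");
```

Under these assumptions, a direct call such as `element_count(std::move(v))` on a non-const `std::vector<int> v` is rejected by the compiler rather than silently accepted.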
        
       | rubenvanwyk wrote:
       | Seems cool but I have questions about ArcticDB - is Polars,
       | DuckDB etc really so limited for data science analysis that it's
       | justified to write a new library specifically for time-series
       | analysis on S3 files?
        
         | rubenvanwyk wrote:
         | Apparently this: https://www.infoq.com/presentations/arcticdb/
        
       ___________________________________________________________________
       (page generated 2025-02-01 08:00 UTC)