[HN Gopher] Sparrow, a modern C++ implementation of the Apache A...
___________________________________________________________________
Sparrow, a modern C++ implementation of the Apache Arrow columnar
format
Author : SylvainCorlay
Score : 61 points
Date : 2025-01-31 23:44 UTC (8 hours ago)
(HTM) web link (johan-mabille.medium.com)
(TXT) w3m dump (johan-mabille.medium.com)
| amluto wrote:
| This is supposed to be idiomatic?!?
|
|     namespace sp = sparrow;
|
|     sp::primitive_array<int> ar = { 1, 3, 5, 7, 9 };
|     // Caution: get_arrow_structures returns pointers, not values
|     auto [arrow_array, arrow_schema] = sp::get_arrow_structures(std::move(ar));
|     // Use arrow_array and arrow_schema as you need (serialization,
|     // passing it to a third party library)
|     // ...
|     // do NOT release the C structures in the end, the "ar" variable
|     // will do it for you
|
| I'm sorry, resources are kept alive by an object _that has been
| moved from_?
| juunpp wrote:
| I think that comment is a copy-paste mistake. If you look at
| the next code snippet, the comment actually makes sense there.
|
| That being said, I've also given up on C++ and learn it mostly
| to keep up with the job, if that's where you are coming from. I
| don't find Rust to be a satisfying replacement, though. No
| language scratches the itch for me right now.
| jandrewrogers wrote:
| Not speaking to the specific design choices here, but in C++
| moved-from objects are not destroyed and must be valid in their
| moved-from state (e.g. a sentinel value to indicate they've
| been moved) so that they can be destroyed in the indefinite
| future. This is useful even though "destroy on move" is the
| correct semantics for most cases. Making "move" and "destroy"
| distinct operations increases the flexibility and
| expressiveness.
|
| A common case where this is useful is if the address space
| where the object lives is accessible, for read or write, by
| references exogenous to the process like some kinds of shared
| memory or hardware DMA. If the object is immediately destroyed
| on move, _it implies the memory can be reused_ while things
| your process can't control may not know you destroyed the
| object. This is essentially a use-after-free bug factory. Being
| able to defer destruction to a point in time when you can
| guarantee this kind of UAF bug is not possible is valuable.
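The "valid moved-from state" described above can be sketched as follows. This is a minimal illustration, not code from Sparrow: `Buffer` and `moved_from_is_empty` are hypothetical names, and the sentinel chosen here (a null pointer with size 0) is only one possible convention.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical owning buffer; the name and layout are illustrative only.
class Buffer {
public:
    explicit Buffer(std::size_t n) : data_(new int[n]()), size_(n) {}

    // Move constructor: take over the allocation and leave the source in
    // an empty sentinel state (nullptr, size 0).
    Buffer(Buffer&& other) noexcept
        : data_(std::exchange(other.data_, nullptr)),
          size_(std::exchange(other.size_, 0)) {}

    Buffer& operator=(Buffer&& other) noexcept {
        if (this != &other) {
            delete[] data_;
            data_ = std::exchange(other.data_, nullptr);
            size_ = std::exchange(other.size_, 0);
        }
        return *this;
    }

    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    // The destructor still runs for a moved-from object. delete[] on a
    // null pointer is a no-op, so the sentinel state is safe to destroy.
    ~Buffer() { delete[] data_; }

    bool empty() const noexcept { return data_ == nullptr; }
    std::size_t size() const noexcept { return size_; }

private:
    int* data_;
    std::size_t size_;
};

// Moves one buffer into another and reports whether the source ended up
// in the sentinel state. Both objects are destroyed at the end of scope.
inline bool moved_from_is_empty() {
    Buffer a(4);
    Buffer b(std::move(a));
    assert(!b.empty() && b.size() == 4);  // b now owns the allocation
    return a.empty() && a.size() == 0;    // a is still valid, but empty
}
```

The point of the sketch is that the move and the destruction of `a` happen at different times: `a` stays a live object until its scope ends, and its destructor must cope with the emptied state.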
| quietbritishjim wrote:
| > Making "move" and "destroy" distinct operations increases
| the flexibility and expressiveness.
|
| No, it does not. It's an artefact of the evolution of the
| language and highly undesirable. Rust has destructive moves
| (and copies built on top of moves, rather than the other way
| round) and it's far cleaner.
| jandrewrogers wrote:
| Sure, if you never need to deal with actual low-level high-
| performance systems code. Just because this use case
| doesn't apply to anything you do doesn't mean it applies to
| nobody. This is the kind of attitude that undermines
| languages like Rust (which I use in my systems). A fair
| criticism of Rust as a "systems language" is that it simply
| excludes all the really difficult parts of being a systems
| language.
|
| C++ deserves a lot of criticism. Many aspects of the
| language are quite fucked. But willfully ignoring that it
| solves real problems that other nominal systems languages
| are unwilling to address doesn't mean those problems don't
| exist.
| senkora wrote:
| This bothered me enough to check the source code, because I
| simply had to know:
|
|     template <layout_or_array A>
|     std::pair<ArrowArray*, ArrowSchema*> get_arrow_structures(A& a)
|     {
|         arrow_proxy& proxy = detail::array_access::get_arrow_proxy(a);
|         return std::make_pair(&(proxy.array()), &(proxy.schema()));
|     }
|
| https://github.com/man-group/sparrow/blob/c01a768f590ebf3b22...
|
| So the answer is that the `std::move` does nothing and should
| be omitted, because this function only has one overload, and
| that overload takes its argument by lvalue-reference.
|
| (And as far as I can tell,
| `detail::array_access::get_arrow_proxy(a)` eventually just
| reads a member on `a`, so there's no copying anywhere)
|
| It's a harmless mistake; I'm just surprised it wasn't caught.
| The authors seem pretty experienced with the language.
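The ownership pattern in the quoted function (return pointers into the object, copy nothing) can be sketched with stand-in types. Everything here is hypothetical: `FakeArray`, `FakeSchema`, `Owner` and `get_structures` only mimic the shape of the signature quoted above and are not Sparrow's or Arrow's real types.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-ins for the Arrow C structures.
struct FakeArray  { long length = 0; };
struct FakeSchema { const char* format = "i"; };

// Hypothetical owning object: it holds the structures as members and
// stays responsible for their lifetime.
class Owner {
public:
    explicit Owner(long n) { array_.length = n; }
    FakeArray& array() { return array_; }
    FakeSchema& schema() { return schema_; }

private:
    FakeArray array_;
    FakeSchema schema_;
};

// Same shape as the quoted function: takes the owner by lvalue reference
// and returns non-owning pointers to its members.
inline std::pair<FakeArray*, FakeSchema*> get_structures(Owner& o) {
    return std::make_pair(&o.array(), &o.schema());
}

// Checks that the returned pointers alias the owner's members, so a write
// through the pointer is visible in the owner and no copy was made.
inline bool pointers_alias_owner() {
    Owner o(5);
    auto [arr, sch] = get_structures(o);
    arr->length = 7;  // write through the non-owning pointer
    assert(sch == &o.schema());
    return arr == &o.array() && o.array().length == 7;
}
```

Because the pointers are non-owning, they are only usable while the owner is alive, which matches the blog's comment that the caller must not release the structures.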
| quietbritishjim wrote:
| > So the answer is that the `std::move` does nothing and
| should be omitted
|
| You can't bind an rvalue to a non-const lvalue reference,
| precisely to avoid this sort of mistake. If this is the only
| overload then this just wouldn't compile (e.g. [1]). So the
| std::move isn't doing nothing, but yes it should be omitted.
| Maybe it's just a weird typo.
|
| [1] https://ideone.com/4NS5dI
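The binding rule mentioned above can be checked at compile time. This is a small sketch with hypothetical names (`Array`, `get_ptr`, `GetPtr`), assuming C++17 for `std::is_invocable_v`; it is not the linked ideone snippet.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for an array type; not Sparrow's real class.
struct Array {
    int value = 0;
};

// A single overload taking a non-const lvalue reference, mirroring the
// shape of the signature discussed in the thread.
inline int* get_ptr(Array& a) { return &a.value; }

// Function-pointer type, used to ask the compiler which argument
// categories the overload accepts.
using GetPtr = int* (*)(Array&);

// An lvalue binds to Array&.
static_assert(std::is_invocable_v<GetPtr, Array&>);
// An rvalue (which is what std::move(a) yields) does not bind, so
// get_ptr(std::move(a)) would be rejected by the compiler.
static_assert(!std::is_invocable_v<GetPtr, Array&&>);
static_assert(!std::is_invocable_v<GetPtr, Array>);

// Runtime counterpart: returns true only when rvalues are rejected and
// the lvalue call returns a pointer into the original object.
inline bool accepts_lvalues_only() {
    Array a{42};
    int* p = get_ptr(a);
    assert(p == &a.value);
    return !std::is_invocable_v<GetPtr, Array&&> && *p == 42;
}
```

If the function had a `const A&` or `A&&` overload as well, the `std::move` call would compile, so this check only holds under the single-overload assumption stated in the thread.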
| rubenvanwyk wrote:
| Seems cool, but I have questions about ArcticDB: are Polars,
| DuckDB, etc. really so limited for data science analysis that it's
| justified to write a new library specifically for time-series
| analysis on S3 files?
| rubenvanwyk wrote:
| Apparently this: https://www.infoq.com/presentations/arcticdb/
___________________________________________________________________
(page generated 2025-02-01 08:00 UTC)