[HN Gopher] The missing C++ smart pointer
___________________________________________________________________
The missing C++ smart pointer
Author : maattdd
Score : 65 points
Date : 2023-08-20 18:14 UTC (4 hours ago)
(HTM) web link (blog.matthieud.me)
(TXT) w3m dump (blog.matthieud.me)
| waynecochran wrote:
| I like the idea. I have wondered why not have a "garbage
| collected" smart ptr std::gc_ptr<T> that can allow cycles with
| other such ptrs to avoid the shortcoming of std::shared_ptr<T>.
| You would need to define which gc_ptr's are in your "root set" to
| initiate the "mark and sweep" of the graph of ptrs. This would be
| useful for heavily linked data structures with cycles.
| ape4 wrote:
| This is a sensible next step for C++. It would only "cost" you
| if you used it. And would come in very handy in some
| situations.
| Asooka wrote:
| That is in Microsoft's managed C++ called C++/CLI that is
| compiled for the CLR. It uses the ^ (hat) sigil, i.e. "int^" vs
| "int*".
| waynecochran wrote:
| Yeah, don't do much MS programming anymore. I expected
| something like this in boost.
| quicknir wrote:
| I think there's a bit of confusion here around "value semantics".
|
| No C++ smart pointer has "value semantics", relative to its
| target T. You can see this because == performs address
| comparison, not deep comparison, and `const` methods on the smart
| pointer can be used to mutate the target (e.g. in C++, operator*
| on unique_ptr is always const, and yields a T&).
|
| This is in contrast to Rust, where Box performs deep equality,
| and has deep const/mut. In Rust, Box is basically just a wrapper
| around a value to have it on the heap (enabling things like
| dynamic polymorphism, like in C++). In C++, the pointer is its
| own entity, with its own separate equality, and so on.
|
| Const-ness of operations, operator==, and assignment/copying
| behavior all have to be consistent with each other. For example,
| if `box` was simply `unique_ptr` with a copy constructor
| (somehow, and as the table in the blog post basically implies),
| then you would have that after `auto a = b;`, `a != b`, which
| obviously doesn't work. This means that the hypothetical
| `std::box` would have to have its comparison and const-ness
| adjusted as well. In C++ terms, this isn't really a pointer at
| all. The closest thing to what the author is suggesting is
| actually `polymorphic_value`, I believe, which IIRC has been
| proposed formally (note that it does not have pointer in the
| name).
|
| Also as an aside, smart pointers are not suitable a) for building
| data structures in general, and b) building recursive data
| structures in particular. The former is because meaningfully
| using smart pointers (i.e. letting them handle destruction)
| inside an allocator aware data structure (as many C++ data
| structures tend to be, and even data structures in Rust) would
| require duplicating the allocator over and over. The latter is
| because compilers do not perform TCO in many real world examples
| (and certainly not in debug mode); if you write a linked list
| using `std::unique_ptr` the destructor will blow your stack.
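
Editor's sketch of the pointer semantics described above, assuming nothing beyond the standard `std::unique_ptr`. The helper names `same_address`, `bump` and `copy_of` are made up for illustration. It shows that `==` compares addresses and that a `const` smart pointer still allows mutation of its target.

```cpp
#include <cassert>
#include <memory>

// Returns true when two unique_ptrs compare equal with operator==,
// which compares the stored addresses, not the pointed-to values.
inline bool same_address(const std::unique_ptr<int>& a,
                         const std::unique_ptr<int>& b) {
    return a == b;
}

// Takes the pointer by const reference yet still mutates the target:
// unique_ptr::operator* is a const member that returns a non-const T&.
inline void bump(const std::unique_ptr<int>& p) {
    *p += 1;
}

// Hand-written "copy": a fresh allocation holding an equal value.
inline std::unique_ptr<int> copy_of(const std::unique_ptr<int>& p) {
    return std::make_unique<int>(*p);
}
```

With these helpers, `copy_of(b)` yields a pointer whose target equals `*b` but which does not compare equal to `b`, which is the `a != b` inconsistency the comment points out.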
| SamReidHughes wrote:
| I've seen this sort of pointer (assuming the author means it's
| nullable) be called "clone_ptr<T>" but it called T's clone()
| method. Because T might be a base class, invoking the pointee's
| copy constructor in C++ is not a great idea.
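
Editor's sketch of the `clone_ptr<T>` idea mentioned above, under the assumption that `T` exposes a virtual `clone()` returning `std::unique_ptr<T>`. The `Shape` and `Circle` types are hypothetical. Copying goes through `clone()` so that a derived object held via a base pointer is not sliced.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct Shape {
    virtual ~Shape() = default;
    virtual std::unique_ptr<Shape> clone() const = 0;
    virtual std::string name() const = 0;
};

struct Circle : Shape {
    std::unique_ptr<Shape> clone() const override {
        return std::make_unique<Circle>(*this);  // copies the full Circle
    }
    std::string name() const override { return "circle"; }
};

// Hypothetical clone_ptr: owns a T and copies it through T::clone().
template <class T>
class clone_ptr {
public:
    clone_ptr() = default;
    explicit clone_ptr(std::unique_ptr<T> p) : ptr_(std::move(p)) {}

    // Copying an empty clone_ptr gives another empty clone_ptr.
    clone_ptr(const clone_ptr& other)
        : ptr_(other.ptr_ ? other.ptr_->clone() : nullptr) {}
    clone_ptr& operator=(const clone_ptr& other) {
        if (this != &other)
            ptr_ = other.ptr_ ? other.ptr_->clone() : nullptr;
        return *this;
    }
    clone_ptr(clone_ptr&&) noexcept = default;
    clone_ptr& operator=(clone_ptr&&) noexcept = default;

    T* get() const { return ptr_.get(); }
    T& operator*() const { return *ptr_; }
    T* operator->() const { return ptr_.get(); }

private:
    std::unique_ptr<T> ptr_;
};
```

The cost of this design is that every class in the hierarchy has to implement `clone()` by hand.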
| bobbyi wrote:
| Is this similar to std::optional? It's a box containing a value.
| Copying the optional copies the value.
| mike_hock wrote:
| Optional isn't polymorphic.
| codewiz wrote:
| Transparently copyable heap-allocated objects are a recipe
| for introducing invisible performance issues, especially in
| generic code.
|
| Rust requires types to explicitly opt-in to being implicitly
| copied, while C++ requires you to opt-out by deleting the copy-
| constructor.
|
| Accidentally copying small structs on the stack is a minor
| performance problem. Copying an std::box<int> in a hot loop could
| cause heap fragmentation, lock contention and huge amounts of
| wasted memory due to heap alignment requirements (32 bytes on
| 64-bit arches).
| mike_hock wrote:
| You mean, as opposed to the other transparently copyable heap-
| allocated objects such as ... std::vector, std::list,
| std::unordered_map, std::string, ... basically most of the
| standard library other than the smart pointers?
|
| The problem is already there, Box wouldn't change anything.
| Guvante wrote:
| I feel like copy by default makes box a weird paradigm in C++.
|
| In Rust you will move unless you write `clone`.
| maattdd wrote:
| I consider C++ a copy-by-default language (and pointer/ref
| being the necessary evil which breaks this great default).
|
| In Rust, it's move by default and you need an explicit .clone()
|
| In C++, it's copy by default and you need an explicit
| std::move()
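
Editor's sketch of the C++ half of this comparison, using a made-up `Widget` type. Initializing from an lvalue copies, and a move has to be requested with `std::move`.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct Widget {
    std::string label;
};

// Plain initialization from an lvalue copies: each object keeps its own label.
inline bool copy_keeps_source() {
    Widget a{"original"};
    Widget b = a;            // copy: 'a' is untouched
    b.label = "changed";
    return a.label == "original" && b.label == "changed";
}

// Moving has to be requested explicitly with std::move.
inline bool move_empties_unique_ptr() {
    auto p = std::make_unique<Widget>(Widget{"owned"});
    auto q = std::move(p);   // a moved-from unique_ptr is guaranteed to be null
    return p == nullptr && q->label == "owned";
}
```

The second function uses `std::unique_ptr` because its moved-from state is specified by the standard. Most other types only promise a "valid but unspecified" state after a move.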
| pwdisswordfishc wrote:
| That's pretty much polymorphic_value <https://wg21.link/p201> or
| indirect_value <https://wg21.link/p1950>
| maattdd wrote:
| Thank you very much for finding those proposals!
| assbuttbuttass wrote:
| I'm not sure I see the benefit vs std::unique_ptr. In the rare
| case you do want to deep-copy a unique_ptr, you can always use
| std::make_unique() to invoke the copy constructor
| LegionMammal978 wrote:
| That was my thought as well; even Rust requires a .clone() call
| to deep-copy a Box<T>, since it doesn't allow implicit copies
| of types without the Copy trait. (Types with that trait must
| effectively be "plain old data" that can be copied byte-by-
| byte.) So I don't see the issue with requiring an explicit copy
| function for std::unique_ptr<T> instead of an implicit copy
| constructor.
| ithkuil wrote:
| But if you clone a struct that contains a Box<T> field then
| that field is also cloned (at least that's the behavior if
| of the default derived Clone impl)
| seeknotfind wrote:
| Kind of, though because the language does a lot of things for
| you when you use the built in copy, make_unique gets
| complicated when you want to use std::box inside of another
| structure. You would need to override the default copy
| constructor to get this to work. For instance, vector<box<Foo>>
| wouldn't be possible to implement with unique_ptr because you
| can't override the copy constructor for a templated type.
| std::box would allow you to copy it. As for why you would need
| to do this (over vector<Foo>), consider Foo having subclasses.
| Complexity breeds complexity...
|
| Regardless, I think lifetime annotations would solve far more
| problems than std::box. I really do like box as a suggestion as
| it would help clean up types, make things a bit more explicit
| in a few places, but there are bigger issues with C++ right
| now. This is a great suggestion (as is unique_resource for
| similar on the stack), but a relatively minor thing in the
| scheme of things. Still nice.
| pjmlp wrote:
| Lifetime annotations are somewhat complicated without
| subsetting the language.
|
| It isn't as if Microsoft, Apple and Google haven't been doing
| it for a while.
|
| https://devblogs.microsoft.com/cppblog/high-confidence-lifet...
|
| https://reviews.llvm.org/D15032
| loeg wrote:
| Yeah, this is just a unique_ptr with a copy operation.
| nemetroid wrote:
| You could make a similar argument about std::unique_ptr, that
| you can always use new and delete to create and destroy.
|
| Which one of these (which implement the same copy/move
| semantics) would you prefer?
|
|     struct A {
|         box<int> thing;
|     };
|
|     struct B {
|         std::unique_ptr<int> thing;
|
|         B(std::unique_ptr<int> t) : thing(std::move(t)) {}
|         B(const B& other) { *this = other; }
|         B& operator=(const B& other) {
|             if (other.thing)
|                 thing = std::make_unique<int>(*other.thing);
|             else
|                 thing.reset();
|             return *this;
|         }
|         B(B&&) = default;
|         B& operator=(B&&) = default;
|         ~B() = default;
|     };
|
| edit: nullptr checks.
| Asooka wrote:
| Huh, isn't struct B literally what an implementation of
| std::box would look like? You would need a nullptr check in
| the copy ctor and copy assignment operator to make it
| complete, but other than that, that is exactly how I would
| implement box. Anything that would be in std, you can
| implement yourself, so I would encourage people to try
| implementing box themselves and see how it works for them.
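
Editor's sketch of what trying this yourself could look like. This is a minimal, hypothetical `box<T>` built on `std::unique_ptr`, not the blog author's implementation and not any proposed standard API. It assumes `T` is copy-constructible and is not used polymorphically, since copying through a base type would slice.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical box<T>: sole owner of a heap-allocated T.
// Copying the box copies the T; moving the box transfers ownership.
template <class T>
class box {
public:
    box() = default;
    explicit box(T value) : ptr_(std::make_unique<T>(std::move(value))) {}

    box(const box& other)
        : ptr_(other.ptr_ ? std::make_unique<T>(*other.ptr_) : nullptr) {}
    box& operator=(const box& other) {
        if (this != &other)
            ptr_ = other.ptr_ ? std::make_unique<T>(*other.ptr_) : nullptr;
        return *this;
    }
    box(box&&) noexcept = default;
    box& operator=(box&&) noexcept = default;

    // Const-ness is propagated to the target, unlike unique_ptr.
    T& operator*() { return *ptr_; }
    const T& operator*() const { return *ptr_; }
    bool has_value() const { return ptr_ != nullptr; }

private:
    std::unique_ptr<T> ptr_;
};

// A domain type gets correct copy/move semantics with no hand-written members.
struct Holder {
    box<int> thing;
};
```

Because `box<int>` is copyable, both `Holder` and `std::vector<box<int>>` can be copied with the compiler-generated operations. A moved-from box is empty in this sketch, so it is nullable like `unique_ptr`.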
| nemetroid wrote:
| Yes, it's not too difficult (but thanks for the pointer,
| edited). The question was why this smart pointer would be
| useful at all, not just whether it belongs in the standard
| library. Implementing the helper class yourself is not too
| bad, but putting this in domain classes is a lot of cruft.
| fluoridation wrote:
| Personally, I've never needed to copy something that was both
| not trivially copiable _and_ the component parts of which
| were individually trivially copiable. If thing just gets
| initialized immediately, why do you put it in a pointer?
| Usually you would put it in a pointer because you need to
| initialize it in a specific order in the constructor relative
| to the other members (so presumably also in the copy
| constructor), or because it's a polymorphic object (so you
| can't just use the copy constructor of the static type). If
| neither of those is true, why use a pointer at all? Just make
| the member a simple object.
| nemetroid wrote:
| Types that are self-referential in one way or another, e.g.
| a graph where nodes point to each other, or a "main object"
| containing subobjects which point back to the main object.
| There is no way to implement an efficient move for such
| types (it would need to adjust all pointers), so
| implementing move operations would be misleading. If you
| want to be able to pass around ownership of such a type, it
| needs to happen through pointer indirection.
|
| But you might want to allow a copy operation that recreates
| the entire structure. The proposed box<T> offers exactly
| the desired copy/move semantics.
| zabzonk wrote:
| > Inspired by Box<T> in Rust, the std::box<T> would be a heap-
| allocated smart pointer.
|
| so, is it the pointer that is heap-allocated or the pointee?
| frankly, i find this article somewhat incoherent, and
| importantly, it lacks code examples illustrating what it is
| talking about.
| mgraczyk wrote:
| pointee
| esrauch wrote:
| It doesn't seem that complicated: they want unique_ptr that is
| copyable and copies the underlying pointee as well.
|
| I think this would be way too big of a footgun: with implicit
| copy it would be too easy to pass box instead of box& and
| accidentally make copies when you didn't mean to. Box is not
| copy in Rust which avoids that problem, so really the
| equivalent in C++ would be just to add a .deepcopy() function
| on unique_ptr which is only implemented if the underlying type
| has a copy ctor.
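
Editor's sketch of the explicit alternative suggested above, written as a free function because members cannot be added to `std::unique_ptr`. The name `deep_copy` is hypothetical. The function is only available when `T` is copy-constructible.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical helper (not part of the standard library): an explicit deep
// copy for unique_ptr, enabled only when T has a usable copy constructor.
template <class T,
          class = std::enable_if_t<std::is_copy_constructible<T>::value>>
std::unique_ptr<T> deep_copy(const std::unique_ptr<T>& src) {
    // An empty pointer copies to an empty pointer.
    return src ? std::make_unique<T>(*src) : nullptr;
}
```

Every copy is then visible at the call site, for example `auto b = deep_copy(a);`. Note that this copies the static type `T`, so it would slice an object held through a pointer to a base class.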
| zabzonk wrote:
| but you can do that anyway - just copy the thing the
| unique_ptr points to, and make another unique_ptr point to
| it. but i really don't understand what copying a unique_ptr
| implicitly would mean.
|
| also, c++ has (wisely, imho) rejected the concept of a "deep
| copy" - we just have copies, of varying depths.
| esrauch wrote:
| > but i really don't understand what copying a unique_ptr
| implicitly would mean.
|
| The author is imagining box would be like unique_ptr but
| with an implicit copy constructor that copies the heap
| data:
|
|     unique_ptr(unique_ptr<T>& o) : unique_ptr(*o) {}
|
| Then if you have this code:
|
|     void f(unique_ptr<int> ptr) { cout << *ptr; }
|     f(some_uniq); // implicitly copies the unique_ptr and its data
| zabzonk wrote:
| isn't this the semantics of any old c++ object with a
| copy constructor (and all the other stuff, of course)
| LordShredda wrote:
| std::vector<MyType> is a pretty good 'box' like container.
| Dynamically allocated and applied RAII semantics. If you only
| want one instance, then dynamic allocation shouldn't (?) be
| necessary.
| frozenport wrote:
| Without code examples it's really hard to judge.
|
| The lack of this type can be viewed as a pessimization for
| copying objects.
| tialaramex wrote:
| I hadn't even thought about it, I was like Box<T> is basically
| std::unique_ptr<T> anyway so what's the point -- but yes, Rust's
| types all either can't be copied at all, or they implement Clone
| and thus Clone::clone, which is what you'd call a "deep copy" if
| you're used to that nomenclature.
|
| I think the underlying cause is that Rust's assignment semantic
| is a destructive move, not a copy+, which frees up the
| opportunity for an actual copy to be potentially expensive,
| matching reality. In a language where assignment _is_ copy, that
| operation _must_ be cheap and so we're obliged to make up an
| excuse for how although this is a "copy" it doesn't behave the
| way you want, it's just a "shallow copy".
|
| + Although it will nearly always work to think of Rust's
| assignments as destructive move, as an optimisation types whose
| representation _is_ their meaning can choose to implement Copy, a
| trait which says to the compiler that it's fine to actually just
| copy my bits, I have no deeper meaning - thus if the type you're
| using is Copy then assignments for that type are in fact
| performed just by copying and don't destroy anything. So a byte,
| a 64-bit floating point number, a 4CC, an IP address: none of
| those have some larger significance, they're Copy, but a string,
| a HashMap, some custom object you made (unless it can and did opt
| in to Copy), those are not Copy.
|
| Crucially, from an understanding point of view, implementing Copy
| _requires_ a trivial implementation of Clone. As a result it
| feels very natural.
| arximboldi wrote:
| The immutable data-structures library Immer provides such type:
|
| https://sinusoid.es/immer/containers.html#box
| tedunangst wrote:
| What would prevent me from making a std::box<FILE*> and blowing
| up my program?
| jng wrote:
| I have written and been using that same smart pointer type for
| years, under the pretty horrible name of holder_cloner_t<> (at
| least it's clear). It is indeed the right solution to a very
| common and important type of problem. Looking forward to
| something like this in the standard library one of these decades.
| zabzonk wrote:
| > It is indeed the right solution to a very common and
| important type of problem
|
| if it is such a solution to such a common problem (both of
| which i dispute), why do you think it is not already in the
| standard library?
| randomNumber7 wrote:
| The problem I see with this is, that you don't always know how to
| make a deep copy. Who knows, what happens when you copy a
| variable of type Foo?
|
| Taking that aside, I agree it would make a lot of sense to write
| code in that style^^
| zer0zzz wrote:
| This kinda reminds me of the cow pattern a lot of swift stdlib
| types use.
| nly wrote:
| Rip off of this article from 6 years ago?
|
| https://hackernoon.com/value-ptr-the-missing-c-smart-pointer...
|
| https://buckaroo.pm/blog/value-ptr-the-missing-smart-ptr
|
| People have been writing pointer-like value semantic wrappers for
| type-erasure for decades.
| klyrs wrote:
| I'm not sure I'd call the article a ripoff, but buckaroo's
| comes with an implementation:
| https://github.com/LoopPerfect/valuable
| maattdd wrote:
| (I'm the author of the blog post)
|
| Indeed it's _exactly_ the same idea (and I think `value_ptr` is
| actually a better name than `box`).
|
| I've googled before writing the blog post and haven't found
| literature about this (but I was mostly googling stuff around
| "box" or "deep shared ptr").
|
| Thank you for the link! Kinda glad it's more widespread than
| what I thought actually, it means we can think about making an
| official proposal.
| andrewmcwatters wrote:
| "Boxing" as a term is a fairly well known concept. Unboxed
| values, etc.
| nly wrote:
| There was a proposal for such a thing in 2014
|
| A Proposal for the World's Dumbest Smart Pointer, v4 - Open
| Standards
| https://open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4282.p...
| brtv wrote:
| > std::box<T> addresses these issues by offering deep copying and
| automatic garbage collection
|
| This is pretty much impossible when holding a pointer of base
| class. However, this is a primary reason for having pointers in
| the first place (polymorphism, and having abstract base classes).
|
| In all other cases, you're probably better off with either the
| raw value, std::variant or std::reference_wrapper.
| gpderetta wrote:
| It is actually super easy, barely an inconvenience, as long as
| you know the actual dynamic type at construction time.
|
| For example shared-ptr to base can correctly invoke the correct
| derived type destructor even if the destructor is not virtual.
|
| Edit: accidentally a word.
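
Editor's sketch of the `shared_ptr` behaviour described above, with made-up `Base` and `Derived` types. `std::shared_ptr` records how to destroy the object it was originally created with, so the derived destructor runs even though `Base` has no virtual destructor.

```cpp
#include <cassert>
#include <memory>

struct Base {
    // Deliberately no virtual destructor.
    int id = 0;
};

struct Derived : Base {
    explicit Derived(bool* flag) : destroyed(flag) {}
    ~Derived() { *destroyed = true; }  // records that this destructor ran
    bool* destroyed;
};

// Creates a shared_ptr<Base> from a Derived, lets it go out of scope,
// and reports whether ~Derived ran.
inline bool derived_destructor_runs() {
    bool flag = false;
    {
        std::shared_ptr<Base> p = std::make_shared<Derived>(&flag);
    }  // last owner gone: the control block destroys the object as a Derived
    return flag;
}
```

The same would not hold for `std::unique_ptr<Base>` with the default deleter, which deletes through a `Base*`.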
| brtv wrote:
| You always know the actual dynamic type at construction time,
| how would you otherwise construct it?
|
| > For example shared-ptr to base can correctly invoke the
| correct derived type
|
| Invoke what exactly? I'm sorry, I don't understand what you're
| trying to say here.
|
| I guess you can force all derived types to implement a
| clone() function, such that box<T> can do the deep copy, but
| I'd consider that a fairly big inconvenience for such a simple
| pointer type.
| gpderetta wrote:
| No need for T to have a clone function. You can use
| standard type erasure techniques. Consider how
| std::function or std::any is implemented.
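
Editor's sketch of the type-erasure approach described above. The names `poly_box`, `Animal` and `Dog` are hypothetical. The idea is to capture a copy function at construction time, while the dynamic type is still known statically, so the held type needs no `clone()` member. It assumes `Base` has a virtual destructor.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical poly_box<Base>: deep-copies through a cloner function that is
// captured at construction, when the concrete type is still known.
template <class Base>
class poly_box {
public:
    template <class Derived>
    explicit poly_box(Derived value)
        : ptr_(std::make_unique<Derived>(std::move(value))),
          clone_(&clone_as<Derived>) {}

    poly_box(const poly_box& other)
        : ptr_(other.ptr_ ? other.clone_(*other.ptr_) : nullptr),
          clone_(other.clone_) {}
    poly_box& operator=(const poly_box& other) {
        if (this != &other) {
            ptr_ = other.ptr_ ? other.clone_(*other.ptr_) : nullptr;
            clone_ = other.clone_;
        }
        return *this;
    }
    poly_box(poly_box&&) noexcept = default;
    poly_box& operator=(poly_box&&) noexcept = default;

    Base& operator*() const { return *ptr_; }
    Base* operator->() const { return ptr_.get(); }

private:
    using cloner = std::unique_ptr<Base> (*)(const Base&);

    // Instantiated once per concrete type; remembers how to copy it.
    template <class Derived>
    static std::unique_ptr<Base> clone_as(const Base& b) {
        return std::make_unique<Derived>(static_cast<const Derived&>(b));
    }

    std::unique_ptr<Base> ptr_;
    cloner clone_;
};

struct Animal {
    virtual ~Animal() = default;
    virtual std::string sound() const { return "..."; }
};

struct Dog : Animal {
    std::string sound() const override { return "woof"; }
};
```

Code that only sees `poly_box<Animal>` can copy it without including any header for `Dog`. The price is one extra function pointer per box.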
| liquidify wrote:
| std::variant is a nasty thing. I always try hard to find a way
| around it, and there is almost always a better way around it.
| nly wrote:
| What's so nasty about it?
|
| I find it a delight
| seeknotfind wrote:
| Works with RTTI?
| brtv wrote:
| With polymorphism, you typically want base classes that
| provide a general interface, that many classes can derive. In
| places where you use this pointer-to-base, you don't
| need/want any knowledge of the derived type. It is an
| unneeded dependency, which would only increase compile time,
| or worse, cause circular dependencies.
|
| I'm not a big fan of RTTI, and not even sure if it would work
| here. But once you start keeping track of all derived types,
| you might as well use an std::variant. It's more cache
| friendly too, so more performant in many cases.
| mr_00ff00 wrote:
| There are some (a lot of) changes that C++ should adopt from
| Rust, but as the other comments have said, I really don't feel
| like a smart pointer that acts like unique_ptr but copies by value
| is all that necessary.
|
| That doesn't seem to fix a memory bug (if you tried this with a
| unique_ptr, the compiler would yell at you for using copy),
| it seems to just make it easier than having to write
| `std::make_unique(*otherptr);`
| clnq wrote:
| The author could implement the deep-copying pointer and share the
| .h on their GitHub. You don't need a language extension for this
| in C++ as types can implement operator overrides and copy and
| move constructors.
|
| But I doubt many people would use it, and that's probably why it
| doesn't belong in std::.
|
| In contrast, before C++ 11, developers would write their own
| RAII-style smart pointers. So it made sense to save them the
| labor. I don't think a pointer that doesn't allow shallow copies
| is usually found in codebases. It sounds like a specific use-case
| pointer.
|
| It's a neat type that people coming from other languages could
| like, but maybe not quite standard library-ready?
| softwaredoug wrote:
| I just do my darndest to never ever use the heap. Let the stack be
| your memory manager.
|
| I think if you come from other languages you assume the heap is
| the default when it should be the exception.
| pstrateman wrote:
| which works great until you need more than 8MiB of memory.
|
| stack is usually quite limited unless you very much control the
| environment
| maattdd wrote:
| How do you create recursive data structure on the stack?
| rightbyte wrote:
| Using pointers to the stack?
| maattdd wrote:
| Pointers to stack sound like a recipe for disaster (all
| those pointers will be dangling very very soon).
| galangalalgol wrote:
| Is the normal stack size for a thread still 8MB? Most of my c++
| has always been simulation or signal processing where a buffer
| of objects to be simulated or samples/pixels to be processed
| always exceed sane stack sizes. The standard answer is to
| preallocate these buffers and reuse them. Smart pointers are
| pretty useful in the simulation case, but not really necessary
| for the signal processing.
| cvccvroomvroom wrote:
| Low level-capable languages need GC lifecycle hooks and
| replacements similar to Rust alloc, but flexible enough to plug
| in BWS, ORCA, or DIY.
|
| I also think the semantics of shared and unshared const and
| mutable state need to be made explicit. Pony is very good about
| this, more so than Rust, by bringing it into the language.
| cjensen wrote:
| Seems to me that the critical problem with this idea is "deep
| copy."
|
| There is no builtin deep copy facility. Without the facility then
| a box pointer would be dangerous leading to weird effects when
| the copy is too shallow.
|
| You could solve deep copy with a template that relies on each
| class providing a deep copy function if one is needed. But again,
| this will make bugs if someone forgets to provide the function.
|
| Rather than make an error-prone feature in the standard library,
| I think it would be better to just explicitly roll this yourself.
| A sensible constructor copy should already do a deep copy -- or
| ensure copy-on-write to simulate a deep copy. So copying is as
| easy as calling make_shared(original) or make_unique(original).
| jbandela1 wrote:
| A non_null_unique_ptr<T>
|
| that is enforced at compile time would be far more valuable for
| me. That would mean some kind of destructive move where the
| compiler guarantees that you can not access a moved from object.
| fluoridation wrote:
| T && gets you the semantics you're looking for.
|
| EDIT: Why are you booing me? I'm right.
| devit wrote:
| T&& doesn't run the T destructor when it's destroyed, which
| is the whole point of unique_ptr.
| kccqzy wrote:
| Not the compiler but today you can already have a ClangTidy
| check:
| https://clang.llvm.org/extra/clang-tidy/checks/bugprone/use-...
|
| If your build system respects ClangTidy checks and turns them
| into errors, it's effectively the same as a compiler guarantee.
| gumby wrote:
| First of all, Box is a terrible name because the term has been
| used for boxed pointers (putting tag bits into unused parts of
| the pointer) and, in some languages, immediates, for four decades
| at least. Also all C++ standard identifiers are thankfully
| lowercase snake case.
|
| I don't understand why the author writes "raw value is
| straightforward and efficient... However, you can't allocate them
| dynamically and you can't build recursive data structure such as
| a linked list or a tree with them." There is clearly something I
| don't understand here. Consider an int -- you can dynamically
| allocate one, you can put it in a tree (putting a box into a tree
| will still require other data applicable to the tree itself, same
| as an int), and so on. So I don't understand the point being made
| here.
|
| And the deep copy behavior is rarely what I want in a mutable
| structure anyway (it's always safe, if usually wasteful, in a R/O
| structure).
| Thorrez wrote:
| >There is clearly something I don't understand here. Consider
| an int -- you can dynamically allocate one, you can put it in a
| tree (putting a box into a tree will still require other data
| applicable to the tree itself, same as an int), and so on. So I
| don't understand the point being made here.
|
| To make a tree, you need something allowing an object (a tree
| node) to own an object of the same type as itself (another tree
| node). There are various options for this: raw pointer,
| unique_ptr, shared_ptr, auto_ptr, box. Without using one of
| those options, the tree isn't possible. That's the author's
| point. std::box can be used in place of one of the other
| pointer types.
| nurettin wrote:
| Can't you use edge lists/arrays to represent your tree? And
| maybe even topologically sort your node list according to
| whatever kind of traversal you want to make? At least that's
| what I used to do in the qbasic days.
| gumby wrote:
| I understand that fine. My problem is the author calls out
| the "raw value" as distinct from "raw pointer", implying a
| non pointer value which would be the _content_ of a tree
| node. This makes the entire paragraph hard to make sense of.
| maattdd wrote:
| We don't care about the content of the node. The
| interesting part is how do you define the Node itself.
|
| struct Node<T> { Node<T> left; Node<T> right; };
|
| would obviously be the best choice (what I call in the
| article "raw value"). But it's not valid C++.
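
Editor's sketch of the valid alternative, assuming `std::unique_ptr` for the children (any owning pointer type from the list above would serve). The `count_nodes` helper is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// A node cannot contain itself by value (the type would be incomplete and
// infinitely large), so the children are held through owning pointers.
template <class T>
struct Node {
    T value{};
    std::unique_ptr<Node<T>> left;
    std::unique_ptr<Node<T>> right;
};

// Counts the nodes in the subtree rooted at 'n' (0 for an empty subtree).
template <class T>
std::size_t count_nodes(const Node<T>* n) {
    return n ? 1 + count_nodes(n->left.get()) + count_nodes(n->right.get())
             : 0;
}
```

With `unique_ptr` children this `Node` is movable but not copyable. A deep-copying pointer in the same position would make the compiler-generated copy duplicate the whole subtree.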
| gumby wrote:
| Ah, thanks.
| gct wrote:
| I'd rather have an optionally-owned pointer type so I can handle
| writing virtual methods that can return a value aliasing an
| already existing value or create a new one on demand. Otherwise
| you either have to roll your own or bloat your API:
|
|     virtual std::flexptr<Thing> get_expensive_thing();
|
| vs:
|
|     virtual bool has_expensive_thing();
|     virtual const Thing& get_expensive_thing();
|     virtual std::unique_ptr<Thing> build_expensive_thing();
|
| It'd probably have shared_ptr semantics but you'd have to treat
| it as a const ref for lifetime purposes, which might make it
| distasteful to the std library folks.
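
Editor's sketch of one way to approximate such a pointer today, assuming `std::shared_ptr` semantics as the comment suggests. The `flexptr` alias and the `borrow`, `own`, `Source`, `Cached` and `OnDemand` names are hypothetical; `std::flexptr` does not exist. The non-owning case uses the `shared_ptr` aliasing constructor with an empty owner.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Thing { int cost = 0; };

// Hypothetical alias standing in for the 'flexptr' name used above.
template <class T>
using flexptr = std::shared_ptr<const T>;

// Non-owning view: the aliasing constructor with an empty owner yields a
// shared_ptr that points at 'existing' but never deletes it.
template <class T>
flexptr<T> borrow(const T& existing) {
    return flexptr<T>(std::shared_ptr<void>(), &existing);
}

// Owning result: built on demand and freed when the last copy goes away.
template <class T>
flexptr<T> own(T value) {
    return std::make_shared<T>(std::move(value));
}

struct Source {
    virtual ~Source() = default;
    virtual flexptr<Thing> get_expensive_thing() = 0;
};

struct Cached : Source {
    Thing cached{10};
    // Aliases the existing member; the caller must not outlive this object.
    flexptr<Thing> get_expensive_thing() override { return borrow(cached); }
};

struct OnDemand : Source {
    // Creates a fresh value each time and hands over ownership.
    flexptr<Thing> get_expensive_thing() override { return own(Thing{20}); }
};
```

As the comment anticipates, the borrowed case carries no lifetime guarantee, so callers have to treat every result like a const reference.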
___________________________________________________________________
(page generated 2023-08-20 23:01 UTC)