[HN Gopher] The missing C++ smart pointer
       ___________________________________________________________________
        
       The missing C++ smart pointer
        
       Author : maattdd
       Score  : 65 points
       Date   : 2023-08-20 18:14 UTC (4 hours ago)
        
 (HTM) web link (blog.matthieud.me)
 (TXT) w3m dump (blog.matthieud.me)
        
       | waynecochran wrote:
       | I like the idea. I have wondered why not have a "garbage
       | collected" smart ptr std::gc_ptr<T> that can allow cycles with
       | other such ptrs to avoid the shortcoming of std::shared_ptr<T>.
       | You would need to define which gc_ptr's are in your "root set" to
       | initiate the "mark and sweep" of the graph of ptrs. This would be
       | useful for heavily linked data structures with cycles.
        
         | ape4 wrote:
         | This is a sensible next step for C++. It would only "cost" you
         | if you used it. And would come in very handy in some
         | situations.
        
         | Asooka wrote:
         | That is in Microsoft's managed C++ called C++/CLI that is
         | compiled for the CLR. It uses the ^ (hat) sigil, i.e. "int^" vs
         | "int*".
        
           | waynecochran wrote:
           | Yeah, don't do much MS programming anymore. I expected
           | something like this in boost.
        
       | quicknir wrote:
       | I think there's a bit of confusion here around "value semantics".
       | 
       | No C++ smart pointer has "value semantics", relative to its
       | target T. You can see this because == performs address
       | comparison, not deep comparison, and `const` methods on the smart
       | pointer can be used to mutate the target (e.g. in C++, operator*
       | on unique_ptr is always const, and yields a T&).
       | 
       | This is in contrast to Rust, where Box performs deep equality,
       | and has deep const/mut. In Rust, Box is basically just a wrapper
       | around a value to have it on the heap (enabling things like
       | dynamic polymorphism, like in C++). In C++, the pointer is its
       | own entity, with its own separate equality, and so on.
       | 
       | Const-ness of operations, operator==, and assignment/copying
       | behavior all have to be consistent with each other. For example,
       | if `box` was simply `unique_ptr` with a copy constructor
       | (somehow, and as the table in the blog post basically implies),
       | then you would have that after `auto a = b;`, `a != b`, which
       | obviously doesn't work. This means that the hypothetical
       | `std::box` would have to have its comparison and const-ness
       | adjusted as well. In C++ terms, this isn't really a pointer at
       | all. The closest thing to what the author is suggesting is
       | actually `polymorphic_value`, I believe, which IIRC has been
       | proposed formally (note that it does not have pointer in the
       | name).
       | 
       | Also as an aside, smart pointers are not suitable a) for building
       | data structures in general, and b) building recursive data
       | structures in particular. The former is because meaningfully
       | using smart pointers (i.e. letting them handle destruction)
       | inside an allocator aware data structure (as many C++ data
       | structures tend to be, and even data structures in Rust) would
       | require duplicating the allocator over and over. The latter is
       | because compilers do not perform TCO in many real world examples
       | (and certainly not in debug mode); if you write a linked list
       | using `std::unique_ptr` the destructor will blow your stack.
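
A small C++ sketch of the shallow equality and shallow const described in this comment. The function names are made up for illustration; only std::unique_ptr and std::make_unique come from the standard library.

```cpp
#include <cassert>
#include <memory>

// Returns true if the two pointers compare equal with operator==.
// For std::unique_ptr this compares addresses, not pointees.
inline bool same_pointer(const std::unique_ptr<int>& a,
                         const std::unique_ptr<int>& b) {
    return a == b;
}

// Mutates the pointee through a const reference to the smart pointer.
// This compiles because unique_ptr::operator* is a const member
// function that returns a non-const T&.
inline void set_through_const(const std::unique_ptr<int>& p, int value) {
    *p = value;
}

// Small demonstration; returns the final value of the first pointee.
inline int demo_shallow_semantics() {
    auto a = std::make_unique<int>(1);
    auto b = std::make_unique<int>(1);
    assert(*a == *b);             // equal pointees...
    assert(!same_pointer(a, b));  // ...but the pointers compare unequal
    set_through_const(a, 7);      // "const" did not protect the pointee
    return *a;
}
```

A copyable box with value semantics would presumably need both behaviors changed (deep ==, deep const), which is the consistency argument made above.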
        
       | SamReidHughes wrote:
       | I've seen this sort of pointer (assuming the author means it's
       | nullable) be called "clone_ptr<T>" but it called T's clone()
       | method. Because T might be a base class, invoking the pointee's
       | copy constructor in C++ is not a great idea.
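
One way the clone_ptr<T> idea might look, as a minimal sketch. Shape, Circle and this clone_ptr are hypothetical, and the sketch assumes every class in the hierarchy overrides clone(). Copying through clone() avoids the slicing that calling the static type's copy constructor on a base-class pointee would cause.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical polymorphic hierarchy with a virtual clone().
struct Shape {
    virtual ~Shape() = default;
    virtual std::unique_ptr<Shape> clone() const = 0;
    virtual std::string name() const = 0;
};

struct Circle : Shape {
    std::unique_ptr<Shape> clone() const override {
        return std::make_unique<Circle>(*this);  // copies the dynamic type
    }
    std::string name() const override { return "circle"; }
};

// Hypothetical clone_ptr: nullable, owning, copies via T::clone().
template <class T>
class clone_ptr {
public:
    clone_ptr() = default;
    explicit clone_ptr(std::unique_ptr<T> p) : ptr_(std::move(p)) {}

    clone_ptr(const clone_ptr& other) {
        if (other.ptr_) ptr_ = other.ptr_->clone();
    }
    clone_ptr& operator=(const clone_ptr& other) {
        clone_ptr tmp(other);          // clone first, so self-assignment is safe
        ptr_ = std::move(tmp.ptr_);
        return *this;
    }
    clone_ptr(clone_ptr&&) noexcept = default;
    clone_ptr& operator=(clone_ptr&&) noexcept = default;

    T* get() const { return ptr_.get(); }
    T& operator*() const { assert(ptr_ != nullptr); return *ptr_; }
    T* operator->() const { assert(ptr_ != nullptr); return ptr_.get(); }
    explicit operator bool() const { return ptr_ != nullptr; }

private:
    std::unique_ptr<T> ptr_;
};
```

The cost of this design is that clone() has to be written for every derived class, which is the inconvenience raised elsewhere in the thread.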
        
       | bobbyi wrote:
       | Is this similar to std::optional? It's a box containing a value.
       | Copying the optional copies the value.
        
         | mike_hock wrote:
         | Optional isn't polymorphic.
        
       | codewiz wrote:
       | Transparently copyable heap-allocated objects are a recipe
       | for introducing invisible performance issues, especially in
       | generic code.
       | 
       | Rust requires types to explicitly opt-in to being implicitly
       | copied, while C++ requires you to opt-out by deleting the copy-
       | constructor.
       | 
       | Accidentally copying small structs on the stack is a minor
       | performance problem. Copying an std::box<int> in a hot loop could
       | cause heap fragmentation, lock contention and huge amounts of
       | wasted memory due to heap alignment requirements (32 bytes on
       | 64-bit arches).
        
         | mike_hock wrote:
         | You mean, as opposed to the other transparently copyable heap-
         | allocated objects such as ... std::vector, std::list,
         | std::unordered_map, std::string, ... basically most of the
         | standard library other than the smart pointers?
         | 
         | The problem is already there, Box wouldn't change anything.
        
       | Guvante wrote:
       | I feel like copy by default makes box a weird paradigm in C++.
       | 
       | In Rust you will move unless you write `Clone`.
        
         | maattdd wrote:
         | I consider C++ a copy-by-default language (and pointer/ref
         | being the necessary evil which breaks this great default).
         | 
         | In Rust, it's move by default and you need an explicit .clone()
         | 
         | In C++, it's copy by default and you need an explicit
         | std::move()
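
The C++ half of that contrast, as a short sketch using std::vector (the function name is made up for illustration):

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Plain initialization from an lvalue copies; moving has to be spelled out.
inline int copy_then_move() {
    std::vector<int> a{1, 2, 3};
    std::vector<int> b = a;             // implicit deep copy of the buffer
    b[0] = 42;
    assert(a[0] == 1);                  // the original is untouched
    std::vector<int> c = std::move(b);  // explicit move: c takes over b's buffer
    return c[0];
}
```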
        
       | pwdisswordfishc wrote:
       | That's pretty much polymorphic_value <https://wg21.link/p201> or
       | indirect_value <https://wg21.link/p1950>
        
         | maattdd wrote:
         | Thank you very much for finding those proposals!
        
       | assbuttbuttass wrote:
       | I'm not sure I see the benefit vs std::unique_ptr. In the rare
       | case you do want to deep-copy a unique_ptr, you can always use
       | std::make_unique() to invoke the copy constructor
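
What that explicit deep copy might look like in practice. Config and copy_of are hypothetical names used only for this sketch; the null check is needed because dereferencing an empty unique_ptr is undefined behavior.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Config {            // hypothetical copyable type
    std::string name;
    int retries = 0;
};

// Explicit deep copy of a unique_ptr's pointee; a null pointer stays null.
inline std::unique_ptr<Config> copy_of(const std::unique_ptr<Config>& p) {
    if (!p) return nullptr;
    auto copy = std::make_unique<Config>(*p);  // invokes Config's copy constructor
    assert(copy.get() != p.get());             // a distinct heap object
    return copy;
}
```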
        
         | LegionMammal978 wrote:
         | That was my thought as well; even Rust requires a .clone() call
         | to deep-copy a Box<T>, since it doesn't allow implicit copies
         | of types without the Copy trait. (Types with that trait must
         | effectively be "plain old data" that can be copied byte-by-
         | byte.) So I don't see the issue with requiring an explicit copy
         | function for std::unique_ptr<T> instead of an implicit copy
         | constructor.
        
           | ithkuil wrote:
           | But if you clone a struct that contains a Box<T> field then
           | that field is also cloned (at least that's the behavior of
           | the default derived Clone impl)
        
         | seeknotfind wrote:
         | Kind of, though because the language does a lot of things for
         | you when you use the built in copy, make_unique gets
         | complicated when you want to use std::box inside of another
         | structure. You would need to override the default copy
         | constructor to get this to work. For instance, vector<box<Foo>>
         | wouldn't be possible to implement with unique_ptr because you
         | can't override the copy constructor for a templated type.
         | std::box would allow you to copy it. As for why you would need
         | to do this (over vector<Foo>), consider Foo having subclasses.
         | Complexity breeds complexity...
         | 
         | Regardless, I think lifetime annotations would solve far more
         | problems than std::box. I really do like box as a suggestion as
         | it would help clean up types, make things a bit more explicit
         | in a few places, but there are bigger issues with C++ right
         | now. This is a great suggestion (as is unique_resource for
         | similar on the stack), but a relatively minor thing in the
         | scheme of things. Still nice.
        
           | pjmlp wrote:
           | Lifetime annotations are somewhat complicated without
           | subsetting the language.
           | 
           | It isn't as if Microsoft, Apple and Google haven't been doing
           | it for a while.
           | 
           | https://devblogs.microsoft.com/cppblog/high-confidence-
           | lifet...
           | 
           | https://reviews.llvm.org/D15032
        
         | loeg wrote:
         | Yeah, this is just a unique_ptr with a copy operation.
        
         | nemetroid wrote:
         | You could make a similar argument about std::unique_ptr, that
         | you can always use new and delete to create and destroy.
         | 
         | Which one of these (which implement the same copy/move
         | semantics) would you prefer?
         | 
         |     struct A {
         |         box<int> thing;
         |     };
         | 
         |     struct B {
         |         std::unique_ptr<int> thing;
         |         B(std::unique_ptr<int> t) : thing(std::move(t)) {}
         |         B(const B& other) { *this = other; }
         |         B& operator=(const B& other) {
         |             if (other.thing)
         |                 thing = std::make_unique<int>(*other.thing);
         |             else
         |                 thing.reset();
         |             return *this;
         |         }
         |         B(B&&) = default;
         |         B& operator=(B&&) = default;
         |         ~B() = default;
         |     };
         | 
         | edit: nullptr checks.
        
           | Asooka wrote:
           | Huh, isn't struct B literally what an implementation of
           | std::box would look like? You would need a nullptr check in
           | the copy ctor and copy assignment operator to make it
           | complete, but other than that, that is exactly how I would
           | implement box. Anything that would be in std, you can
           | implement yourself, so I would encourage people to try
           | implementing box themselves and see how it works for them.
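
For anyone who wants to try that, here is a minimal sketch of a hand-rolled box<T>. It is not the blog author's design: the class and the copy_all helper are assumptions for illustration, T is assumed to be a concrete copyable type (no polymorphic cloning), and a moved-from box is treated as empty.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical box<T>: owns a heap-allocated T and deep-copies it on copy.
template <class T>
class box {
public:
    explicit box(T value) : ptr_(std::make_unique<T>(std::move(value))) {}

    box(const box& other) : ptr_(clone(other.ptr_)) {}
    box& operator=(const box& other) {
        if (this != &other) ptr_ = clone(other.ptr_);
        return *this;
    }
    box(box&&) noexcept = default;            // leaves the source empty
    box& operator=(box&&) noexcept = default;

    T& operator*() { assert(ptr_ != nullptr); return *ptr_; }
    const T& operator*() const { assert(ptr_ != nullptr); return *ptr_; }

private:
    // Copies the pointee; an empty (moved-from) source yields an empty copy.
    static std::unique_ptr<T> clone(const std::unique_ptr<T>& p) {
        if (!p) return nullptr;
        return std::make_unique<T>(*p);
    }
    std::unique_ptr<T> ptr_;
};

// Because box<T> is copyable, containers holding it are copyable too.
inline std::vector<box<int>> copy_all(const std::vector<box<int>>& v) {
    return v;
}
```

The const overload of operator* returns const T&, so constness propagates to the pointee, unlike std::unique_ptr. A struct with a box<T> member also keeps its compiler-generated copy and move operations.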
        
             | nemetroid wrote:
             | Yes, it's not too difficult (but thanks for the pointer,
             | edited). The question was why this smart pointer would be
             | useful at all, not just whether it belongs in the standard
             | library. Implementing the helper class yourself is not too
             | bad, but putting this in domain classes is a lot of cruft.
        
           | fluoridation wrote:
           | Personally, I've never needed to copy something that was both
           | not trivially copiable _and_ the component parts of which
           | were individually trivially copiable. If thing just gets
           | initialized immediately, why do you put it in a pointer?
           | Usually you would put it in a pointer because you need to
           | initialize it in a specific order in the constructor relative
           | to the other members (so presumably also in the copy
           | constructor), or because it's a polymorphic object (so you
           | can't just use the copy constructor of the static type). If
           | neither of those is true, why use a pointer at all? Just make
           | the member a simple object.
        
             | nemetroid wrote:
             | Types that are self-referential in one way or another, e.g.
             | a graph where nodes point to each other, or a "main object"
             | containing subobjects which point back to the main object.
             | There is no way to implement an efficient move for such
             | types (it would need to adjust all pointers), so
             | implementing move operations would be misleading. If you
             | want to be able to pass around ownership of such a type, it
             | needs to happen through pointer indirection.
             | 
             | But you might want to allow a copy operation that recreates
             | the entire structure. The proposed box<T> offers exactly
             | the desired copy/move semantics.
        
       | zabzonk wrote:
       | > Inspired by Box<T> in Rust, the std::box<T> would be a heap-
       | allocated smart pointer.
       | 
       | so, is it the pointer that is heap-allocated or the pointee?
       | frankly, i find this article somewhat incoherent, and
       | importantly, it lacks code examples illustrating what it is
       | talking about.
        
         | mgraczyk wrote:
         | pointee
        
         | esrauch wrote:
         | It doesn't seem that complicated: they want unique_ptr that is
         | copyable and copies the underlying pointee as well.
         | 
         | I think this would be way too big of a footgun: with implicit
         | copy it would be too easy to pass box instead of box& and
         | accidentally make copies when you didn't mean to. Box is not
         | copy in Rust which avoids that problem, so really the
         | equivalent in C++ would be just to add a .deepcopy() function
         | on uniqueptr which is only implemented if the underlying type
         | has a copy ctor.
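
std::unique_ptr has no such member, but the same effect can be sketched as a free function. deep_copy is a hypothetical name; std::enable_if_t removes it from overload resolution when T has no copy constructor. Note that it copies the static type T, so it would slice if the pointee were actually a derived class.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical helper: explicit deep copy, available only for copyable T.
template <class T,
          class = std::enable_if_t<std::is_copy_constructible<T>::value>>
std::unique_ptr<T> deep_copy(const std::unique_ptr<T>& p) {
    if (!p) return nullptr;              // null stays null
    return std::make_unique<T>(*p);      // copies the static type T
}

// Usage: the copy is requested explicitly at the call site.
inline int deep_copy_example() {
    auto a = std::make_unique<int>(5);
    auto b = deep_copy(a);
    *b = 6;
    assert(*a == 5);                     // the original pointee is untouched
    return *b;
}
```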
        
           | zabzonk wrote:
           | but you can do that anyway - just copy the thing the
           | unique_ptr points to, and make another unique_ptr point to
           | it. but i really don't understand what copying a unique_ptr
           | implicitly would mean.
           | 
           | also, c++ has (wisely, imho) rejected the concept of a "deep
           | copy" - we just have copies, of varying depths.
        
             | esrauch wrote:
             | > but i really don't understand what copying a unique_ptr
             | implicitly would mean.
             | 
             | The author is imagining box would be like unique_ptr but
             | with an implicit copy constructor that copies the heap
             | data:
             | 
             |     unique_ptr(unique_ptr<T>& o) : unique_ptr(*o) {}
             | 
             | Then if you have this code:
             | 
             |     void f(unique_ptr<int> ptr) { cout << *ptr; }
             |     f(some_uniq); // implicitly copies the unique_ptr and its data
        
               | zabzonk wrote:
               | isn't this the semantics of any old c++ object with a
               | copy constructor (and all the other stuff, of course)
        
       | LordShredda wrote:
       | std::vector<MyType> is a pretty good 'box' like container.
       | Dynamically allocated and applied RAII semantics. If you only
       | want one instance, then dynamic allocation shouldn't (?) be
       | necessary.
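
That observation extends to recursive types: since C++17 a struct may hold a std::vector of its own (still incomplete) type, and the compiler-generated copy is deep. A sketch, with Node, count_nodes and copy_is_deep as made-up names:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A tree whose children live in a std::vector of the node's own type.
struct Node {
    int value = 0;
    std::vector<Node> children;   // heap-allocated, owned, deep-copied
};

inline std::size_t count_nodes(const Node& n) {
    std::size_t total = 1;
    for (const Node& child : n.children) total += count_nodes(child);
    return total;
}

// Copying a Node copies its whole subtree; the two trees are independent.
inline bool copy_is_deep() {
    Node root;
    root.value = 1;
    root.children.push_back(Node{});
    root.children[0].value = 2;

    Node copy = root;
    copy.children[0].value = 99;
    assert(count_nodes(copy) == count_nodes(root));
    return root.children[0].value == 2;
}
```

This does not cover the polymorphic case, where the element type would have to be a pointer to a base class.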
        
       | frozenport wrote:
       | Without code examples it's really hard to judge.
       | 
       | The lack of this type can be viewed as a pessimization for
       | copying objects.
        
       | tialaramex wrote:
       | I hadn't even thought about it, I was like Box<T> is basically
       | std::unique_ptr<T> anyway so what's the point -- but yes, Rust's
       | types all either can't be copied at all, or they implement Clone
       | and thus Clone::clone, which is what you'd call a "deep copy" if
       | you're used to that nomenclature.
       | 
       | I think the underlying cause is that Rust's assignment semantic
       | is a destructive move, not a copy+, which frees up the
       | opportunity for an actual copy to be potentially expensive,
       | matching reality. In a language where assignment _is_ copy, that
       | operation _must_ be cheap and so we're obliged to make up an
       | excuse for how although this is a "copy" it doesn't behave the
       | way you want, it's just a "shallow copy".
       | 
       | + Although it will nearly always work to think of Rust's
       | assignments as destructive move, as an optimisation types whose
       | representation _is_ their meaning can choose to implement Copy, a
       | trait which says to the compiler that it's fine to actually just
       | copy my bits, I have no deeper meaning - thus if the type you're
       | using is Copy then assignments for that type are in fact
       | performed just by copying and don't destroy anything. So a byte,
       | a 64-bit floating point number, a 4CC, an IP address none of
       | those have some larger significance, they're Copy, but a string,
       | a HashMap, some custom object you made (unless it can and did opt
       | in to Copy), those are not Copy.
       | 
       | Crucially, from an understanding point of view, implementing Copy
       | _requires_ a trivial implementation of Clone. As a result it
       | feels very natural.
        
       | arximboldi wrote:
       | The immutable data-structures library Immer provides such type:
       | 
       | https://sinusoid.es/immer/containers.html#box
        
       | tedunangst wrote:
       | What would prevent me from making a std::box<FILE*> and blowing
       | up my program?
        
       | jng wrote:
       | I have written and been using that same smart pointer type for
       | years, under the pretty horrible name of holder_cloner_t<> (at
       | least it's clear). It is indeed the right solution to a very
       | common and important type of problem. Looking forward to
       | something like this in the standard library one of these decades.
        
         | zabzonk wrote:
         | > It is indeed the right solution to a very common and
         | important type of problem
         | 
         | if it is such a solution to such a common problem (both of
         | which i dispute), why do you think it is not already in the
         | standard library?
        
       | randomNumber7 wrote:
       | The problem I see with this is that you don't always know how to
       | make a deep copy. Who knows what happens when you copy a
       | variable of type Foo?
       | 
       | Taking that aside, I agree it would make a lot of sense to write
       | code in that style^^
        
       | zer0zzz wrote:
       | This kinda reminds me of the cow pattern a lot of swift stdlib
       | types use.
        
       | nly wrote:
       | Rip off of this article from 6 years ago?
       | 
       | https://hackernoon.com/value-ptr-the-missing-c-smart-pointer...
       | 
       | https://buckaroo.pm/blog/value-ptr-the-missing-smart-ptr
       | 
       | People have been writing pointer-like value semantic wrappers for
       | type-erasure for decades.
        
         | klyrs wrote:
         | I'm not sure I'd call the article a ripoff, but buckaroo's
         | comes with an implementation:
         | https://github.com/LoopPerfect/valuable
        
         | maattdd wrote:
         | (I'm the author of the blog post)
         | 
         | Indeed it's _exactly_ the same idea (and I think `value_ptr` is
         | actually a better name than `box`).
         | 
         | I've googled before writing the blog post and haven't found
         | literature about this (but I was mostly googling stuff around
         | "box" or "deep shared ptr").
         | 
         | Thank you for the link! Kinda glad it's more widespread than
         | what I thought actually, it means we can think about making an
         | official proposal.
        
           | andrewmcwatters wrote:
           | "Boxing" as a term is a fairly well known concept. Unboxed
           | values, etc.
        
           | nly wrote:
           | There was a proposal for such a thing in 2014
           | 
           | A Proposal for the World's Dumbest Smart Pointer, v4 - Open
           | Standards https://open-
           | std.org/jtc1/sc22/wg21/docs/papers/2014/n4282.p...
        
       | brtv wrote:
       | > std::box<T> addresses these issues by offering deep copying and
       | automatic garbage collection
       | 
       | This is pretty much impossible when holding a pointer of base
       | class. However, this is a primary reason for having pointers in
       | the first place (polymorphism, and having abstract base classes).
       | 
       | In all other cases, you're probably better off with either the
       | raw value, std::variant or std::reference_wrapper.
        
         | gpderetta wrote:
         | It is actually super easy, barely an inconvenience, as long as
         | you know the actual dynamic type at construction time.
         | 
         | For example shared-ptr to base can correctly invoke the correct
         | derived type destructor even if the destructor is not virtual.
         | 
         | Edit: accidentally a word.
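
The shared_ptr behavior mentioned here can be checked with a short sketch. Base, Derived and the counter are hypothetical; the point is that the control block created for a Derived remembers how to destroy a Derived, even after conversion to shared_ptr<Base>.

```cpp
#include <cassert>
#include <memory>

// Counts how many Derived destructors have run.
inline int& derived_destroyed() {
    static int count = 0;
    return count;
}

struct Base {
    // Deliberately no virtual destructor.
};

struct Derived : Base {
    ~Derived() { ++derived_destroyed(); }
};

// Returns how many Derived destructors ran when the pointer went away.
inline int destroy_through_base() {
    int before = derived_destroyed();
    {
        // The deleter is fixed when the Derived is created; converting to
        // shared_ptr<Base> keeps sharing that same control block.
        std::shared_ptr<Base> p = std::make_shared<Derived>();
        assert(p != nullptr);
    }
    return derived_destroyed() - before;
}
```

The same would not hold for std::unique_ptr<Base> with the default deleter, which calls delete on a Base* and so needs a virtual destructor.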
        
           | brtv wrote:
           | You always know the actual dynamic type at construction time,
           | how would you otherwise construct it?
           | 
           | > For example shared-ptr to base can correctly invoke the
           | correct derived type
           | 
           | Invoke what exactly? I'm sorry, I don't understand what you're
           | trying to say here.
           | 
           | I guess you can force all derived types to implement a
           | clone() function, such that box<T> can do the deep copy, but
           | I'd consider that a fairly big inconvenience for such a simple
           | pointer type.
        
             | gpderetta wrote:
             | No need for T to have a clone function. You can use
             | standard type erasure techniques. Consider how
             | std::function or std::any is implemented.
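
A rough sketch of that type-erasure approach, under stated assumptions: poly_box, Animal and Dog are hypothetical names, and this is not how polymorphic_value or any standard type is actually implemented. The constructor is a template, so it knows the concrete type and can store two plain function pointers that copy and destroy it. The base class needs no clone() member.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical poly_box<Base>: remembers at construction how to copy and
// destroy the concrete type it was given.
template <class Base>
class poly_box {
    using copier = Base* (*)(const Base&);
    using deleter = void (*)(Base*);

public:
    template <class Derived,
              class = std::enable_if_t<std::is_base_of<Base, Derived>::value>>
    explicit poly_box(Derived value)
        : ptr_(new Derived(std::move(value)),
               [](Base* b) { delete static_cast<Derived*>(b); }),
          copy_([](const Base& b) -> Base* {
              return new Derived(static_cast<const Derived&>(b));
          }) {}

    poly_box(const poly_box& other)
        : ptr_(other.ptr_ ? other.copy_(*other.ptr_) : nullptr,
               other.ptr_.get_deleter()),
          copy_(other.copy_) {}
    poly_box& operator=(const poly_box& other) {
        poly_box tmp(other);       // copy first, then take over the copy
        *this = std::move(tmp);
        return *this;
    }
    poly_box(poly_box&&) noexcept = default;
    poly_box& operator=(poly_box&&) noexcept = default;

    Base& operator*() const { assert(ptr_ != nullptr); return *ptr_; }
    Base* operator->() const { assert(ptr_ != nullptr); return ptr_.get(); }

private:
    std::unique_ptr<Base, deleter> ptr_;
    copier copy_;
};

// Hypothetical hierarchy without any clone() member.
struct Animal {
    virtual ~Animal() = default;
    virtual std::string speak() const { return "..."; }
};
struct Dog : Animal {
    std::string speak() const override { return "woof"; }
};
```

The trade-off is one extra pointer per box, and the dynamic type has to be known where the box is first constructed.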
        
         | liquidify wrote:
         | std::variant is a nasty thing. I always try hard to find a way
         | around it, and there is almost always a better way around it.
        
           | nly wrote:
           | What's so nasty about it?
           | 
           | I find it a delight
        
         | seeknotfind wrote:
         | Works with RTTI?
        
           | brtv wrote:
           | With polymorphism, you typically want base classes that
           | provide a general interface, that many classes can derive. In
           | places where you use this pointer-to-base, you don't
           | need/want any knowledge of the derived type. It is an
           | unneeded dependency, which would only increase compile time,
           | or worse, cause circular dependencies.
           | 
           | I'm not a big fan of RTTI, and not even sure if it would work
           | here. But once you start keeping track of all derived types,
           | you might as well use an std::variant. It's more cache
           | friendly too, so more performant in many cases.
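
For comparison, a sketch of the closed-set alternative with std::variant. The shape types and the visitor are made up for illustration. The values sit inline in the vector, and copying the vector copies them with no pointers or clone() involved.

```cpp
#include <cassert>
#include <variant>
#include <vector>

struct Circle { double radius; };
struct Rect { double width, height; };

// The full set of alternatives must be known here, at the point of use.
using Shape = std::variant<Circle, Rect>;

struct AreaVisitor {
    double operator()(const Circle& c) const {
        return 3.141592653589793 * c.radius * c.radius;
    }
    double operator()(const Rect& r) const { return r.width * r.height; }
};

inline double area(const Shape& s) {
    assert(!s.valueless_by_exception());
    return std::visit(AreaVisitor{}, s);
}

inline double total_area(const std::vector<Shape>& shapes) {
    double sum = 0.0;
    for (const Shape& s : shapes) sum += area(s);
    return sum;
}
```

The price is the dependency described above: every alternative has to be visible wherever Shape is named.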
        
       | mr_00ff00 wrote:
       | There are some (a lot of) changes that C++ should take from Rust,
       | but as the other comments have said, I really don't feel like a
       | smart pointer that acts like unique_ptr but copies by value is
       | all that necessary.
       | 
       | That doesn't seem to fix a memory bug (because if you did this
       | with a unique_ptr, the compiler would yell at you for copying);
       | it seems to just make it easier than having to write
       | `std::make_unique<T>(*otherptr);`
        
       | clnq wrote:
       | The author could implement the deep-copying pointer and share the
       | .h on their GitHub. You don't need a language extension for this
       | in C++ as types can implement operator overrides and copy and
       | move constructors.
       | 
       | But I doubt many people would use it, and that's probably why it
       | doesn't belong in std::.
       | 
       | In contrast, before C++ 11, developers would write their own
       | RAII-style smart pointers. So it made sense to save them the
       | labor. I don't think a pointer that doesn't allow shallow copies
       | is usually found in codebases. It sounds like a specific use-case
       | pointer.
       | 
       | It's a neat type that people coming from other languages could
       | like, but maybe not quite standard library-ready?
        
       | softwaredoug wrote:
       | I just do my darndest to never ever use the heap. Let the stack be
       | your memory manager.
       | 
       | I think if you come from other languages you assume the heap is
       | the default when it should be the exception.
        
         | pstrateman wrote:
         | which works great until you need more than 8MiB of memory.
         | 
         | stack is usually quite limited unless you very much control the
         | environment
        
         | maattdd wrote:
         | How do you create recursive data structure on the stack?
        
           | rightbyte wrote:
           | Using pointers to the stack?
        
             | maattdd wrote:
             | Pointers to stack sound like a recipe for disaster (all
             | those pointers will be dangling very very soon).
        
         | galangalalgol wrote:
         | Is the normal stack size for a thread still 8MB? Most of my c++
         | has always been simulation or signal processing where a buffer
         | of objects to be simulated or samples/pixels to be processed
         | always exceed sane stack sizes. The standard answer is to
         | preallocate these buffers and reuse them. Smart pointers are
         | pretty useful in the simulation case, but not really necessary
         | for the signal processing.
        
       | cvccvroomvroom wrote:
       | Low level-capable languages need GC lifecycle hooks and
       | replacements similar to Rust alloc, but flexible enough to plug
       | in BWS, ORCA, or DIY.
       | 
       | I also think the semantics of shared and unshared const and
       | mutable state need to be made explicit. Pony is very good about
       | this, more so than Rust, by bringing it into the language.
        
       | cjensen wrote:
       | Seems to me that the critical problem with this idea is "deep
       | copy."
       | 
       | There is no builtin deep copy facility. Without the facility then
       | a box pointer would be dangerous leading to weird effects when
       | the copy is too shallow.
       | 
       | You could solve deep copy with a template that relies on each
       | class providing a deep copy function if one is needed. But again,
       | this will make bugs if someone forgets to provide the function.
       | 
       | Rather than make an error-prone feature in the standard library,
       | I think it would be better to just explicitly roll this yourself.
       | A sensible copy constructor should already do a deep copy -- or
       | ensure copy-on-write to simulate a deep copy. So copying is as
       | easy as calling make_shared(original) or make_unique(original).
        
       | jbandela1 wrote:
       | A non_null_unique_ptr<T> that is enforced at compile time would be
       | far more valuable for me. That would mean some kind of destructive
       | move where the compiler guarantees that you cannot access a
       | moved-from object.
        
         | fluoridation wrote:
         | T && gets you the semantics you're looking for.
         | 
         | EDIT: Why are you booing me? I'm right.
        
           | devit wrote:
           | T&& doesn't run the T destructor when it's destroyed, which
           | is the whole point of unique_ptr.
        
         | kccqzy wrote:
         | Not the compiler but today you can already have a ClangTidy
         | check: https://clang.llvm.org/extra/clang-
         | tidy/checks/bugprone/use-...
         | 
         | If your build system respects ClangTidy checks and turns them
         | into errors, it's effectively the same as a compiler guarantee.
        
       | gumby wrote:
       | First of all, Box is a terrible name because the term has been
       | used for boxed pointers (putting tag bits into unused parts of
       | the pointer) and, in some languages, immediates, for four decades
       | at least. Also all C++ standard identifiers are thankfully
       | lowercase snake case.
       | 
       | I don't understand why the author writes "raw value is
       | straightforward and efficient... However, you can't allocate them
       | dynamically and you can't build recursive data structure such as
       | a linked list or a tree with them." There is clearly something I
       | don't understand here. Consider an int -- you can dynamically
       | allocate one, you can put it in a tree (putting a box into a tree
       | will still require other data applicable to the tree itself, same
       | as an int), and so on. So I don't understand the point being made
       | here.
       | 
       | And the deep copy behavior is rarely what I want in a mutable
       | structure anyway (it's always safe, if usually wasteful, in a R/O
       | structure).
        
         | Thorrez wrote:
         | >There is clearly something I don't understand here. Consider
         | an int -- you can dynamically allocate one, you can put it in a
         | tree (putting a box into a tree will still require other data
         | applicable to the tree itself, same as an int), and so on. So I
         | don't understand the point being made here.
         | 
         | To make a tree, you need something allowing an object (a tree
         | node) to own an object of the same type as itself (another tree
         | node). There are various options for this: raw pointer,
         | unique_ptr, shared_ptr, auto_ptr, box. Without using one of
         | those options, the tree isn't possible. That's the author's
         | point. std::box can be used in place of one of the other
         | pointer types.
        
           | nurettin wrote:
           | Can't you use edge lists/arrays to represent your tree? And
           | maybe even topologically sort your node list according to
           | whatever kind of traversal you want to make? At least that's
           | what I used to do in the qbasic days.
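
An index-based layout along those lines might look like the following sketch. FlatNode, FlatTree, kNone and sum are names invented here. Nodes live in one vector and refer to each other by position, so the whole tree is an ordinary copyable value with no smart pointers at all.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sentinel index meaning "no child".
constexpr std::size_t kNone = static_cast<std::size_t>(-1);

struct FlatNode {
    int value;
    std::size_t left;
    std::size_t right;
};

// Tree stored in one contiguous array; nodes refer to each other by index.
struct FlatTree {
    std::vector<FlatNode> nodes;

    // Appends a leaf and returns its index.
    std::size_t add(int value) {
        nodes.push_back(FlatNode{value, kNone, kNone});
        return nodes.size() - 1;
    }
};

// Iterative traversal with an explicit stack of pending indices.
inline int sum(const FlatTree& t, std::size_t root) {
    int total = 0;
    std::vector<std::size_t> pending{root};
    while (!pending.empty()) {
        std::size_t i = pending.back();
        pending.pop_back();
        if (i == kNone) continue;
        assert(i < t.nodes.size());
        total += t.nodes[i].value;
        pending.push_back(t.nodes[i].left);
        pending.push_back(t.nodes[i].right);
    }
    return total;
}
```

Indices stay valid when the vector reallocates or the tree is copied, which raw pointers into the vector would not.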
        
           | gumby wrote:
           | I understand that fine. My problem is the author calls out
           | the "raw value" as distinct from "raw pointer", implying a
           | non pointer value which would be the _content_ of a tree
           | node. This makes the entire paragraph hard to make sense of.
        
             | maattdd wrote:
             | We don't care about the content of the node. The
             | interesting part is how do you define the Node itself.
             | 
             | struct Node<T> { Node<T> left; Node<T> right; };
             | 
             | would obviously be the best choice (what I call in the
             | article "raw value"). But it's not valid C++.
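
The by-value definition is rejected because a Node would have to contain two complete Nodes, so its size would be unbounded. A valid variant with an owning indirection, sketched here with std::unique_ptr (the height and make_chain helpers are hypothetical):

```cpp
#include <cassert>
#include <memory>
#include <utility>

template <class T>
struct Node {
    T value;
    std::unique_ptr<Node<T>> left;   // null means "no child"
    std::unique_ptr<Node<T>> right;

    explicit Node(T v) : value(std::move(v)) {}
};

template <class T>
int height(const std::unique_ptr<Node<T>>& n) {
    if (!n) return 0;
    int l = height(n->left);
    int r = height(n->right);
    return 1 + (l > r ? l : r);
}

// Builds a left-leaning chain of the given depth.
inline std::unique_ptr<Node<int>> make_chain(int depth) {
    assert(depth >= 0);
    std::unique_ptr<Node<int>> head;
    for (int i = 0; i < depth; ++i) {
        auto n = std::make_unique<Node<int>>(i);
        n->left = std::move(head);
        head = std::move(n);
    }
    return head;
}
```

This Node is move-only, because unique_ptr members delete the implicit copy operations. Replacing them with a copyable box-like type is what would bring back a compiler-generated deep copy. The implicit destructor is recursive, so a very deep chain can overflow the stack, as noted earlier in the thread.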
        
               | gumby wrote:
               | Ah, thanks.
        
       | gct wrote:
       | I'd rather have an optionally-owned pointer type so I can handle
       | writing virtual methods that can return a value aliasing an
       | already existing value or create a new one on demand. Otherwise
       | you either have to roll your own or bloat your API:
       | 
       |     virtual std::flexptr<Thing> get_expensive_thing();
       | 
       | vs:
       | 
       |     virtual bool has_expensive_thing();
       |     virtual const Thing& get_expensive_thing();
       |     virtual std::unique_ptr<Thing> build_expensive_thing();
       | 
       | It'd probably have shared_ptr semantics but you'd have to treat
       | it as a const ref for lifetime purposes, which might make it
       | distasteful to the std library folks.
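
There is no std::flexptr, so the following is only a guess at the shape of such a type. maybe_owned is a hypothetical, move-only simplification rather than the shared_ptr-like semantics suggested above: it either borrows an existing object or owns a freshly built one, and exposes read-only access either way.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for the "flexptr" idea.
template <class T>
class maybe_owned {
public:
    // Borrow: the caller guarantees `existing` outlives this object.
    static maybe_owned borrow(const T& existing) {
        return maybe_owned(nullptr, &existing);
    }
    // Own: takes over a newly built object.
    static maybe_owned own(std::unique_ptr<T> built) {
        const T* raw = built.get();
        return maybe_owned(std::move(built), raw);
    }

    const T& operator*() const { assert(ptr_ != nullptr); return *ptr_; }
    const T* operator->() const { assert(ptr_ != nullptr); return ptr_; }
    bool owns() const { return owned_ != nullptr; }

private:
    maybe_owned(std::unique_ptr<T> owned, const T* ptr)
        : owned_(std::move(owned)), ptr_(ptr) {}

    std::unique_ptr<T> owned_;  // empty when borrowing
    const T* ptr_;              // always the object to read from
};
```

A virtual get_expensive_thing() could then return maybe_owned<Thing> from either code path. As with a const reference, the borrowed case is only safe while the original object is alive.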
        
       ___________________________________________________________________
       (page generated 2023-08-20 23:01 UTC)