[HN Gopher] A curiously recurring lifetime issue
       ___________________________________________________________________
        
       A curiously recurring lifetime issue
        
       Author : irsagent
       Score  : 62 points
       Date   : 2023-12-16 17:37 UTC (1 days ago)
        
 (HTM) web link (blog.dureuill.net)
 (TXT) w3m dump (blog.dureuill.net)
        
       | jimberlage wrote:
       | Is there a good tutorial on Valgrind for beginners? It's a tool
       | I've only ever seen praised so I'm curious to play with it a bit.
        
         | gavinhoward wrote:
         | As an avid Valgrind user, I don't know of one.
         | 
         | But if you would like, I could post one to my blog.
         | 
         | If you would like me to, contact me privately. [1]
         | 
         | [1]: https://gavinhoward.com/contact/
        
         | kentonv wrote:
         | In my experience all you really have to do is:
         | 
         | 1. Write a test program that exercises your segfault.
         | 
         | 2. Build it in debug mode.
         | 
         | 3. Run `valgrind <my-program>`
         | 
         | And it just tells you where your problem is. Not much more to
         | it.
        
       | phendrenad2 wrote:
       | Referential transparency be damned, I guess. This feels like an
       | inherent downside to languages where you have to manage
       | lifetimes.
        
       | mungaihaha wrote:
       | Nice read
        
       | dataflow wrote:
       | I think [[clang::lifetimebound]] would let the compiler detect
       | this at compile time?
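       | 
       | A minimal sketch of how that annotation might be applied,
       | assuming Clang and hypothetical `Response`/`ListView` types
       | (stand-ins, not Cap'n Proto's real classes). The `LIFETIMEBOUND`
       | macro is also made up here so the code still builds elsewhere:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Expands to the Clang attribute where available, to nothing elsewhere.
#if defined(__clang__)
#define LIFETIMEBOUND [[clang::lifetimebound]]
#else
#define LIFETIMEBOUND
#endif

// Hypothetical non-owning view over a vector owned by someone else.
struct ListView {
    const std::vector<int>* items;
    std::size_t size() const { return items->size(); }
    int operator[](std::size_t i) const { return (*items)[i]; }
};

// Hypothetical owning response type.
class Response {
public:
    explicit Response(std::vector<int> v) : list_(std::move(v)) {}
    // Written after the qualifiers, the attribute applies to the implicit
    // `this` parameter: the returned view should not outlive *this.
    ListView getList() const LIFETIMEBOUND { return ListView{&list_}; }
private:
    std::vector<int> list_;
};

Response makeResponse() { return Response(std::vector<int>{1, 2, 3}); }

int sumFirstTwo() {
    Response response = makeResponse();  // the owner is a named local
    ListView list = response.getList();  // fine: the view dies before the owner
    // The pattern the annotation is meant to flag (left commented out,
    // since using `bad` afterwards would be a use-after-free):
    // ListView bad = makeResponse().getList();
    return list[0] + list[1];
}
```

       | Whether a given Clang version diagnoses the commented-out line
       | depends on its dangling-reference warnings; other compilers
       | simply see the plain member function.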
        
       | IshKebab wrote:
       | > Is this on Cap'n Proto? Honestly, I don't know.
       | 
       | It absolutely is. It's a fairly basic principle that APIs should
       | be difficult to misuse, and the fact that you made the same
       | mistake 2/3 times shows that it is _very_ easy to misuse. In
       | other words it is a badly designed API. At the very least it
       | should be called ListView.
       | 
       | I am not a big fan of CapnProto. It has some neat features but
       | it's very complicated, full of footguns like this, and the API is
       | extremely unergonomic.
        
         | dmeybohm wrote:
         | Yeah I agree, the ListView naming is more appropriate.
         | 
         | I haven't used non-owning types like string_view or span too
         | much because I haven't needed that level of performance or
         | memory optimization yet, and so those just seem like footguns
         | as compared to just a reference without those needs. I do like
         | to use a technique in classes that use non-owning references
         | that would work for those too to prevent this particular
         | problem.
         | 
         | For that, there are two methods with the same name, but
         | different access - an lvalue version and an rvalue version.
         | Then, you delete the rvalue method like this:
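          | 
          |     class Response {
          |     public:
          |         // Callable on lvalues only.
          |         auto getListView() & -> ListView {
          |             return ListView(m_List);
          |         }
          |         // Calling it on a temporary is a compile error.
          |         void getListView() && = delete;
          |     };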
         | 
         | Then you get a compile error like in Rust when you try to call
         | getListView() from a temporary object, but if you call the
         | method from an lvalue it still works at least as long as the
         | object is in scope.
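          | 
          | A self-contained sketch of the same technique, assuming a
          | hypothetical `ListView` and a `std::vector<int>` member. The
          | `CanGetListView` trait is only there to show, at compile time,
          | which calls are accepted:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical non-owning view type.
struct ListView {
    const std::vector<int>* items;
};

// Hypothetical owner with an lvalue-only getter.
class Response {
public:
    auto getListView() & -> ListView { return ListView{&m_List}; }
    void getListView() && = delete;
private:
    std::vector<int> m_List{1, 2, 3};
};

// Detection idiom: true if `std::declval<T>().getListView()` is well-formed.
template <class T, class = void>
struct CanGetListView : std::false_type {};

template <class T>
struct CanGetListView<T, std::void_t<decltype(std::declval<T>().getListView())>>
    : std::true_type {};

// Callable on an lvalue...
static_assert(CanGetListView<Response&>::value, "lvalue call should compile");
// ...but not on a temporary: overload resolution picks the deleted overload.
static_assert(!CanGetListView<Response>::value, "rvalue call should be rejected");

std::size_t viewSize() {
    Response response;                       // named owner
    ListView view = response.getListView();  // OK: called on an lvalue
    return view.items->size();
}
```

          | This needs C++17 for `std::void_t`. Note that the check only
          | rejects temporaries; a view can still outlive a named owner.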
        
         | bsder wrote:
         | > In other words it is a badly designed API.
         | 
         | I don't agree. The API is what it is _because_ it is
          | specifically a zero copy API for performance. If you don't
         | care about performance, why are you using C++ (stupid) and a
         | zero-copy API (doubly stupid)?
         | 
         | I absolutely do _NOT_ expect a zero copy API to own things. If
         | I drop the underlying reference that is really an alias, how on
         | earth is that the fault of the zero copy API?
         | 
          | The combination of aliasing and lifetimes is a _C++_ footgun--
         | full stop. This is aptly demonstrated by how quickly Rust kills
         | this cold.
         | 
         | If you use sharp knives, sometimes you cut your fingers. People
         | like you would claim the knife is the problem.
        
       | plagiarist wrote:
       | My takeaway is just the last paragraph, it sounds like Cap'n
       | Proto is a footgun and I should use anything else.
        
       | nyanpasu64 wrote:
       | I'd say that method chaining (referential transparency, etc.) and
       | implicit destructor calls with side effects don't mix.
       | 
       | I have a general rule that "resource" types which own a heap
       | allocation should usually be given a variable name with explicit
       | scope (and likely even an explicit type, rather than `auto
       | response` like in this post). This is a general guideline to
       | avoid holding a reference to a temporary that gets destroyed, but
       | doesn't protect against returning a dangling reference into a
       | resource type from a function.
       | 
       | In other places, where languages make the opposite decision (from
       | this blog post) to _extend_ the lifetime of a temporary variable
       | with a destructor when you call methods on it, you get things
       | like C++'s temporary lifetime extension (not a bug, note that I
       | don't understand it well), and footguns like Rust's `match
       | lock.lock().something {}` (https://github.com/rust-lang/lang-
       | team/blob/master/design-me...).
        
       | HarHarVeryFunny wrote:
       | Seems like someone trying to be too clever to me, and perhaps a
       | case of premature optimization. Non-owning references are a
       | problem waiting to happen. Even if your language/api allows you
       | to check if the reference is still valid before use, you can
       | obviously forget to do so.
       | 
       | Rather than use a non-owning reference I'd rather use a design
       | that didn't need it, or just use a std::shared_ptr owning
       | reference instead. I realize there are potential cases (i.e. one
       | can choose to design such cases) of circular references where a
       | non-owning reference might be needed to break the circular chain,
       | or where one wants a non-owning view of a data structure, but
       | without very careful API design and code review these are easy to
       | mess up.
        
         | kentonv wrote:
         | This sounds nice but it just isn't realistic. If you try to
         | write a complex system in C++ without non-owning references,
         | you're basically heap-allocating every single object and using
         | slow atomic refcounting everywhere. Performance will likely be
         | much worse than just using a garbage collected language to
         | start with.
        
       | GuB-42 wrote:
       | The problem I see here is that one of the functions returns a
       | pointer and it doesn't use the usual pointer syntax.
       | 
       | I see no *, no & and no -> in the code. So I would assume
       | everything to behave as if it was owned or even copied. Had it
       | returned actual pointers, or pointer-like objects like iterators,
       | it would have been more obvious.
        
         | amluto wrote:
          | This is C++ we're talking about.
          | 
          |     auto x = y();
          | 
          | Is x a pointer or reference? There's no way to tell. _Maybe_ if
          | you then do
          | 
          |     x->foo();
         | 
         | You have some idea that x is pointer-ish, but unique_ptr works
         | like this and isn't very pointer-ish.
        
       | kentonv wrote:
       | Author of Cap'n Proto here.
       | 
       | The main innovation of Cap'n Proto serialization compared to
       | Protobuf is that it doesn't copy anything, it generates a nice
       | API where all the accessor methods are directly backed by the
       | underlying buffer. Hence the generated classes that you use all
       | act as "views" into the single buffer.
       | 
       | C++, meanwhile, is famously an RAII language, not
       | garbage-collected. In such languages, you have to keep track of
       | which things own which other things so that everyone knows who
       | is responsible for freeing memory.
       | 
       | Thus in an RAII language, you generally don't expect view types
       | to own the underlying data -- you must separately ensure that
       | whatever does own the backing data structure stays alive. C++
       | infamously doesn't really help you with this job -- unlike Rust,
       | which has a type system capable of catching mistakes at compile
       | time.
       | 
       | You might argue that backing buffers should be reference counted
       | and all of Cap'n Proto's view types should hold a refcount on the
       | buffer. However, this creates new footguns. Should the
       | refcounting be atomic? If so, it's really slow. If not, then
       | using a message from multiple threads (even without modifying it)
       | may occasionally blow up. Also, refcounting would have to keep
       | the _entire_ backing buffer alive if any one object is pointing
       | at it. This can lead to hard-to-understand memory bloat.
       | 
       | In short, the design of Cap'n Proto's C++ API is a natural result
       | of what it implements, and the language it is implemented in. It
       | is well-documented that all these types are "pointer-like",
       | behaving as views. This kind of API is very common in C++,
       | especially high-performing C++. New projects should absolutely
       | choose Rust instead of C++ to avoid these kinds of footguns.
       | 
       | In my experience each new developer makes this mistake once,
       | figures it out, and doesn't have much trouble using the API after
       | that.
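       | 
       | A rough sketch of the memory-bloat point, using `std::shared_ptr`
       | as a stand-in for the refcounting. `CountedView` and the 1 MiB
       | buffer are hypothetical, not anything Cap'n Proto provides:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical refcounted view: reads one byte but co-owns the whole buffer.
struct CountedView {
    std::shared_ptr<std::vector<char>> buffer;  // keeps everything alive
    std::size_t offset;
    char get() const { return (*buffer)[offset]; }
};

// Returns how many bytes are still held alive by a single tiny view
// after the original owner has let go of the buffer.
std::size_t bytesPinnedByOneView() {
    auto buffer = std::make_shared<std::vector<char>>(1 << 20, 'x');  // 1 MiB message
    std::weak_ptr<std::vector<char>> watcher = buffer;  // observes without owning

    CountedView view{buffer, 42};
    buffer.reset();  // the original owner is done with the message

    // The view still works, but only because the full buffer is still allocated.
    assert(view.get() == 'x');
    return watcher.expired() ? 0 : watcher.lock()->size();
}
```

       | Here one view over a single byte keeps the whole megabyte
       | allocated, and `std::shared_ptr` pays for atomic count updates on
       | every copy in typical multithreaded builds.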
        
         | foxhill wrote:
         | apologies, perhaps i'm missing something here, having not used
         | cap'n proto in any context at all before.
         | 
         | is it not possible to delete the rvalue reference overload of
         | 'getList'?
         | 
          | as far as i can tell, the erroneous code would then have failed
          | to build in the first place, like the rust case, instead of
          | compiling without a diagnostic?
        
           | kentonv wrote:
           | That would catch some legitimate use cases, where you get the
           | list and immediately use it on the same line. Admittedly this
           | is not so common for lists, but very common for struct
            | readers, e.g.:
            | 
            |     int i = call.send().getSomeStruct().getValue();
           | 
           | Here, even though `send()` returns a response that is not
           | saved anywhere, and a struct reader is constructed from it,
           | the struct reader is used immediately in the same line, so
           | there's no use-after-free.
           | 
           | Someone else mentioned using lifetimebound annotations. This
           | will probably work a lot better, avoiding the false
           | positives. It just hadn't been done because the annotations
           | didn't exist at the time that most of Cap'n Proto was
           | originally written.
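            | 
            | A small sketch of why the one-line form is safe, assuming
            | hypothetical `Response` and `StructReader` types and an
            | event log added purely to make the ordering observable:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Event log so the order of calls and destruction can be observed.
std::vector<std::string>& events() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical non-owning reader that points into a Response.
struct StructReader {
    const int* value;
    int getValue() const {
        events().push_back("getValue");
        return *value;
    }
};

// Hypothetical owner of the backing data.
struct Response {
    int stored = 7;
    ~Response() { events().push_back("~Response"); }
    StructReader getSomeStruct() const { return StructReader{&stored}; }
};

Response send() { return Response{}; }

int readImmediately() {
    events().clear();
    // The temporary Response lives until the end of this full-expression,
    // so the reader is used while its backing data still exists.
    int i = send().getSomeStruct().getValue();
    // By contrast, this would dangle, because the Response is destroyed at
    // the end of the first statement:
    //   StructReader r = send().getSomeStruct();
    //   r.getValue();
    return i;
}
```

            | After `readImmediately()` returns, the log records the
            | `getValue` call before the `~Response` destructor.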
        
             | foxhill wrote:
             | i could be wrong, but i'm reasonably confident that this is
             | UB for even trivial types? someone more knowledgeable with
             | the language lawyering would need to opine one way or the
             | other.
             | 
             | regardless of that outcome, i think i'd prefer to require a
             | value preserving the lifetime of the reader/view. in the
             | cases that it may not be necessary, i'd prefer to lean on
             | the optimiser to take care of it..!
        
               | kentonv wrote:
               | What's UB about it? Any temporary objects constructed
               | during the evaluation of a statement live until the end
               | of the statement. The standard is clear on that.
               | 
               | > i think i'd prefer to require a value preserving the
               | lifetime of the reader/view. in the cases that it may not
               | be necessary, i'd prefer to lean on the optimiser to take
               | care of it..!
               | 
               | We'd all prefer APIs that cannot be used unsafely but
               | realistically there's no magic the optimizer can do to
               | make the problems with refcounting go away. You need to
               | use a language like Rust to solve this.
        
               | foxhill wrote:
               | ah, sorry, i didn't read that correctly.
               | 
               | perhaps for values like this you're fine. i think my
               | point still stands about the reader of a built-in
               | list/sequence type, surely?
               | 
               | and, not to sound facetious, that's exactly what
               | optimisers do :)
               | 
                | the c++ type system is more than capable of reasoning
                | about lifetimes, the issue is that, with c++, it's an
                | optional part of the language. also, the lack of
                | destructive moves. but to require both of those things in
                | the language would require, essentially, the borrow
                | checker in rust.
        
             | kentonv wrote:
             | Oh actually there's a much more obvious case where
             | prohibiting getters on rvalues would be a problem. It would
              | prevent you from doing this in general:
              | 
              |     myReader.getFoo().getBar()
             | 
             | Here, `myReader` is already a view type; ownership of the
             | backing buffer lives elsewhere. `getFoo()` returns a reader
             | for some sub-struct, and `getBar()` returns a member of
             | that struct. If we say getters are not permitted to be
             | called on rvalues, this expression is illegal, but there's
             | no actual problem with it and in practice we write code
             | like this all the time.
        
       ___________________________________________________________________
       (page generated 2023-12-17 23:01 UTC)