[HN Gopher] Wrong by Default
___________________________________________________________________
Wrong by Default
Author : nalgeon
Score : 64 points
Date : 2022-05-14 18:40 UTC (2 days ago)
(HTM) web link (kevincox.ca)
(TXT) w3m dump (kevincox.ca)
| the_gipsy wrote:
| > Even simply passing references to multiple places creates
| ambiguity on who is responsible for cleaning it up.
|
| Learning rust, realizing how value ownership (borrow checker
| stuff) overlaps with conceptual ownership has been enlightening.
| dataangel wrote:
| I've been so annoyed with people falsely equating defer and RAII
| that it didn't occur to me this is yet another way it's inferior!
|
| The thing that usually comes to mind for me is that defer only
| works for local scope. RAII is more general because RAII objects
| can be put inside other objects that have different lifetimes. I
| can have an OpenFile inside a struct inside a reference counted
| smart pointer which will make it so that when the last user using
| that file disconnects the file is automatically closed. Defer
| can't do that.
| masklinn wrote:
| Indeed `defer` quickly gets real complicated when you need to
| return an object but you also may need to clean it up e.g.
| f, err := os.Open("file.go") if err != nil {
| return nil, err } data := make([]byte,
| 100) count, err := f.Read(data) if err != nil {
| return nil, err } _, err = f.Seek(0,
| 0) if err != nil { return nil, err
| } return f, nil
|
| You can't just `defer f.Close()` after the first condition
| (regardless of flushing / write loss issues, we'll assume read-
| only) because you want to return it. You need to either:
|
| * defer a callback and use a flag
|
| * or specifically _not_ use defer, remember to close the file
| on every path, and hope your linter supports that pattern
|
| RAII? RAII just does the right hing there (disregarding the
| same flushing issue on writes). In case of error you drop the
| file and that closes it. In case of no-error you return the
| file object, that automatically makes it the responsibility of
| the caller.
| sirwhinesalot wrote:
| Some languages (like D and Zig) distinguish between "error
| defer" and "success defer", which solves the issue above.
|
| Not as automatic as RAII, but better than the single defer
| (which sucks).
|
| EDIT: My personal preference is actually something like
| Python's "with" or C#'s "using", since they rely on
| "destructors" rather than explicit calls to close. But they
| have the exact same issue as simple defer. If you ever need
| to store or return the thing, you need to handle it manually
| instead.
|
| In my own usage I almost never keep stuff open for long or
| store it in complex data structures where this is an issue in
| practice, simple "using" is almost always enough. Rust and
| C++ need RAII because they also manage memory like any other
| resource, otherwise it would get incredibly painful.
|
| (Zig uses arena and pool allocators to manage memory, if you
| had to manage all memory with defers I think people would go
| nuts)
| svnpenn wrote:
| > You can't just `defer f.Close()` after the first condition
| (regardless of flushing / write loss issues, we'll assume
| read-only) because you want to return it
|
| You would never include Close in this function, period. In
| this example above, it would always be the caller closing the
| file: package main import "os"
| func hello() (*os.File, error) { return
| os.Open("file.go") } func main() {
| file, err := hello() if err != nil {
| panic(err) } defer file.Close()
| }
| tialaramex wrote:
| Now though this extra work (the thing I got has to have
| deferred work done when I don't need it any more) is given
| to your caller.
|
| For a File Handle this might feel natural, but what about
| if I hand you a... 3D printer? a Bank Account? a Youtube
| Video? a Package Delivery? Which of these need some sort of
| "deferred close" action ? You probably need to read the
| docs and arrange to close or not close things accordingly
| right?
|
| You're going to be Wrong by Default again.
| geraneum wrote:
| The problem is, the programmer (caller) forgets and that's
| the whole point of having a more or less automated way of
| releasing/closing/etc.
|
| Apart from that, it's bad design IMO (leaky abstraction).
| jjnoakes wrote:
| The example above has an operation that might fail between
| the Open and the return. Leaving that out of course makes
| the example useless, but if you keep that part in, you
| can't use go's simple defer pattern to help you close the
| file if that intermediate operation fails.
| TheDong wrote:
| RAII is superior, but there's also another go pattern for
| that which reads better than your two alternatives sometimes:
| // You _must_ 'defer closeFunc()' (with closeFunc being the
| second return value) after this function, // even if
| this function returns an error. // You do not need to
| call File.Close() on the returned file, only closeFunc()
| func openFileAndPeakHeader(path string) (*os.File, func()
| error, error) { closeF := func() error { return
| nil } f, err := os.Open(path) if err
| != nil { return nil, closeF, err
| } closeF = f.Close // ...
| if _, err := f.Read(data); err != nil {
| return nil, closeF, err } return f,
| closeF, nil }
|
| Unfortunately, it relies on someone reading docs for them to
| determine that this isn't an Either<(File, func), error>, but
| rather an Either<File, (func, error)>, but that's how go
| works anyways. You have to read doc comments to know what
| types an 'error' might be, you have to read docs to know if
| return values are Either or not (i.e. file.Read returns n,
| err, and both are meaningful, not just one of them).
|
| It's not good, but I think it's better in some cases than the
| options you presented.
| gilbetron wrote:
| It's just as easy to not write a destructor as it is to not write
| a defer.
| CodeIsTheEnd wrote:
| But you only need to write the destructor once.
| masklinn wrote:
| Except the defer stuff is _in addition to_ the "writing the
| destructor" part.
|
| When you have defer, _you still need to write code you 'll be
| deferring_. That's what the destructor is, no more. `defer` is
| strictly _additional_ overhead.
|
| Furthermore:
|
| 1. If you're just composing in destructible objects the
| compiler does the work for you, if I create a struct which
| contains a File, i don't have to do anything, rustc will make
| my struct implicitly Drop and release the `File`. If you do
| that in Go... you have to remember to write a deferrable method
| which'll close the underlying file.
|
| 2. You're only defining a type once, and you're thinking about
| its invariants _right there_ , it's a lot easier to remember
| those invariants when creating the type than every time you're
| using an instance.
|
| Unless you have linear structural subtyping and the compiler
| forces you to do it. Then it's fine.
| [deleted]
| adrianmonk wrote:
| Not the main point here, but I think you could almost fix the
| defer approach with types. Instead of: File f =
| File.open(...); defer f;
|
| You could have: PleaseDeferMe<File> p =
| File.open(...); File f = defer p;
|
| If you want to actually access that file you've created, your
| only path is through using defer.
|
| One case this doesn't solve is creating an object but never using
| it. When writing new code, you probably won't make that error,
| but when ripping out old code you might miss a piece you should
| have deleted. If the compiler warns about unused variables (for
| every possible code path), it could catch that too.
|
| (This could be generalized a bit. Call it CleanMeUp<T>, have a
| CleanMeUp.takeResponsibility() method that returns T, and maybe
| defer isn't the only thing that can call takeResponsibility().)
| shhsshs wrote:
| Java's `throws` keyword comes to mind. It forces you to handle
| exceptions or explicitly declare that you expect the exception
| to be handled by a method higher up the call stack.
|
| Your strongly-typed suggestion seems better. But you'd have to
| somehow allow for cases where a function opens a file, does
| something with it, and also expects the file to be closed by a
| higher function. Perhaps something like:
| PleaseDeferMe<File> BeginWrite() {
| PleaseDeferMe<File> p = File.open(...); // unsafe!
| Make sure to return p to the calling function so this gets
| called with "defer" at some point File f =
| unwrapDefer p; f.Write(some header) return
| p }
| david2ndaccount wrote:
| I can't believe that people still use opening/closing files as
| examples of RAII when that is one of the cases that RAII _can't_
| handle correctly. A call to close is the last chance the
| operating system has to signal that an error occurred when
| writing. You can't easily signal errors in destructors. In C++,
| you basically can't throw an exception and there is no return
| value from a destructor. Rust suffers the same kind of problem.
| The only way to properly close a file is to use regular old
| explicit control flow with if statements. Then you can handle the
| errors in the normal way.
| deathanatos wrote:
| Sure, but defer still fairs _worse_ than RAII does even in this
| example. In this particular example, yes, you 'll need to call
| flush yourself and handle the error yourself: which you'd
| _still_ need in a GC 'd lang/lang w/ "defer", in addition to
| remembering to close the file (which the RAII example _will_
| have your back on).
| jjnoakes wrote:
| There is a right way to do this using RAII, and it works for
| files and for any other resource that might fail on close: you
| simply explicitly do the close yourself, but only along the
| success/happy path. That way you get the best of both worlds:
| automatic clean-up (ignoring errors) along error paths, and
| proper close-and-check-for-errors along the happy path.
|
| Of course it'd be even nicer if the implicit RAII close could
| somehow add its error to the error you are currently
| propagating, but that's beside the point for your example.
| notriddle wrote:
| > The only way to properly close a file is to use regular old
| explicit control flow with if statements. Then you can handle
| the errors in the normal way.
|
| Other than crashing the program, what could you possibly do?
|
| This is what portable software has to work with [1]:
| If close() is interrupted by a signal that is to be caught, it
| shall return -1 with errno set to [EINTR] and the state of
| fildes is unspecified. If an I/O error occurred while reading
| from or writing to the file system during close(), it may
| return -1 with errno set to [EIO]; if this error is returned,
| the state of fildes is unspecified.
|
| And it's not just the standard being dumb. Here's what Linux's
| manual page says about it [2]: Retrying
| the close() after a failure return is the wrong thing to
| do, since this may cause a reused file descriptor from another
| thread to be closed. This can occur because the Linux kernel
| always releases the file descriptor early in the close
| operation, freeing it for reuse; the steps that may
| return an error, such as flushing data to the
| filesystem or device, occur only later in the close
| operation. Many other implementations similarly
| always close the file descriptor (except in the case
| of EBADF, meaning that the file descriptor was
| invalid) even if they subsequently report an error
| on return from close(). POSIX.1 is currently silent on this
| point, but there are plans to mandate this behavior in the next
| major release of the standard. A careful
| programmer who wants to know about I/O errors may
| precede close() with a call to fsync(2). The
| EINTR error is a somewhat special case. Regarding the EINTR
| error, POSIX.1-2008 says: If close() is
| interrupted by a signal that is to be caught,
| it shall return -1 with errno set to EINTR and the
| state of fildes is unspecified. This permits
| the behavior that occurs on Linux and many other
| implementations, where, as with other errors that may be
| reported by close(), the file descriptor is
| guaranteed to be closed. However, it also permits
| another possibility: that the implementation returns
| an EINTR error and keeps the file descriptor open.
| (According to its documentation, HP-UX's close()
| does this.) The caller must then once more use close()
| to close the file descriptor, to avoid file descriptor leaks.
| This divergence in implementation behaviors provides a
| difficult hurdle for portable applications, since on
| many implementations, close() must not be called
| again after an EINTR error, and on at least one,
| close() must be called again.
|
| And, just to finish it off, Windows is completely silent on
| this matter. It mentions EBADF, and that's it [3]. The
| CloseHandle() function, which POSIX close() (a.k.a. _close())
| uses under the hood, also doesn't seem to say anything about
| whether CloseHandle() failing on a file leaves the handle
| opened, closed, or in limbo.
|
| And since all of these documentations reserve the right to not
| actually flush the file when it's closed, any warnings or
| errors are going to be "best-effort" anyway. If you really want
| to make sure the file actually gets written, call fsync().
|
| [1]:
| https://pubs.opengroup.org/onlinepubs/9699919799/functions/c...
|
| [2]: https://www.man7.org/linux/man-pages/man2/close.2.html
|
| [3]: https://docs.microsoft.com/en-us/cpp/c-runtime-
| library/refer...
| metadat wrote:
| > Retrying the close() after a failure return is the wrong
| thing to do, since this may cause a reused file descriptor
| from another thread to be closed
|
| Imagine how tricky it could be to have to debug this or try
| to reproduce it? Quite the nasty subtle buggy boundary case,
| especially in Golang where cleanup deferrment is encouraged
| and touted in lots of example code.
|
| It's easy to imagine a segment of naive language users
| completely unknowingly risking creating environments where
| such a bug can thrive.
|
| It goes like this: first deferring the call to close after
| opening a file and not receiving an error, then calling close
| again at the end of the happy path "just to be sure" or
| because "now I'm done with it". And then blundering along
| their merry way, believing there is nothing wrong with this
| and "go has my back".
|
| (1) Common File API abstractions that taste more or less like
| OOPcombined with (2) defer-capable languages (like Go) where
| this is the "best practice": This class of bug should be
| considered common and expected.
|
| The tricky part lies in the gap between being newbie-friendly
| vs. correct and safe. Language makers can't have it both
| ways.
|
| Maybe static code analysis could detect it and refuse to
| compile, as it should be considered a foot-gun type of error.
|
| p.s. notriddle: thank you for inlining complete-ish relevant
| detailed information in your post. It's nice of you.
| richardwhiuk wrote:
| Handling an error from close is a bad way to handle the error,
| because the file is already closed, so you are pretty much
| stuffed.
|
| You want to call fsync() instead, and grab the error there,
| which will prevent you from being able to get an error from
| close() - see "Dealing with error returns from close()" in
| https://www.man7.org/linux/man-pages/man2/close.2.html
|
| If you get it on close, you have ~no idea which operation
| failed.
|
| Obviously doing it on the operation allows you to wrap it up in
| the handling of the operation and allows you to do RAII (or
| DIRR as it probably should be known).
|
| (The EBADF option is nonsense in a properly developed
| application, and should cause a panic. The EINTR position is
| just a mess.)
| richardwhiuk wrote:
| Of course, other flow control options are available which
| also prevent this mess - e.g: int failed =
| with_file("foo", |file|{ write(file, "bar");
| }); fn with_file(char* foo, lambda: Fn(File)){
| let f = open(foo); if !f { return f; } let
| err = lamda(f); let cerr = close(f); if !err
| { return err; } return cerr;
| }
| metadat wrote:
| Is this Rust or a different language?
| anyfoo wrote:
| It depends on what you are working on, and then even "not wrong
| by default" might be... wrong.
|
| For low level code, destructors/RAII are much harder to notice
| than a very explicit defer. Even the "kernel C" style "goto
| done;" to a unified exit point (sometimes multiple labels that
| fall into each other to clean up according to how far you got)
| wins out by explicitly telling you what will happen.
|
| If you are working on projects where you often spend much more
| time reading code than subsequently writing code, and if the
| thing you are working on is not a high level application where
| the meat is in some business logic, then anything that reduces
| readability in the sense of "I can see everything that is going
| on" is very bad.
|
| In low level code, the potential for icky problems is also much
| higher: You don't want to find out that some stray destructor did
| something that should not have happened in e.g. an interrupt
| context, or while the current thread's memory management is in an
| intermediate state. That is _not_ correct by default, it is
| "wrong and you didn't even tell me".
|
| In high level application software development, e.g. CRUD SaaS or
| a smartphone app, the opposite may be the case, and decluttering
| might well win out.
| ComputerGuru wrote:
| This actually brings me to one of the few (current?) limitations
| of rust that drive me nuts! In all the other modern languages I
| work with, my big projects end up with a utility RAII class that
| wraps a closure to execute when the scope expires. It's great for
| cleaning up logical operations as opposed to what is
| traditionally done by RAII destructors in oop languages.
|
| Eg if I'm creating a temporary file or directory in a function to
| handle some IO logic, I'll create a RAII wrapper around a closure
| that deletes that folder when the scope expires - this might be a
| one-off thing so instead of creating an actual RAII
| "TemporaryFolder" object that wraps around a path, creates the
| folder in its constructor, and deletes the folder and its
| contents in the destructor (and have to reference the inner path
| instead of the object itself when needing it, etc), I can just
| stick a closure to delete the folder in my wrapper and call it a
| day.
|
| You can't really do that in rust because you'd have to give
| ownership of the file/folder/whatever to the closure - meaning
| you can't use it yourself afterwards! Technically the compiler
| just has to guarantee that the object persists to the end of
| scope (isn't destroyed or moved out earlier) and then there would
| be no borrow conflicts (since it's only run after everything else
| has finished and can't be used for anything else in the RAII
| wrapper) but it can't really know that, so you're disallowed from
| sharing it between your actual code and the cleanup closure.
| masklinn wrote:
| > You can't really do that in rust because you'd have to give
| ownership of the file/folder/whatever to the closure - meaning
| you can't use it yourself afterwards!
|
| The way to do this is pretty much what C++ has in unique_ptr(p,
| deleter): have the utility RAII class own both the object and
| the closure, dereference to the wrapped object, and on drop
| pass the wrapped object to the closure.
|
| Altho
|
| > Eg if I'm creating a temporary file or directory in a
| function to handle some IO logic, I'll create a RAII wrapper
| around a closure that deletes that folder when the scope
| expires
|
| There are crates for lots of those situations.
|
| https://crates.io/crates/tempfile for instance.
| ComputerGuru wrote:
| > have the utility RAII class own both the object and the
| closure
|
| Yeah, but the ergonomics suck. Especially for random cleanup
| tasks that aren't necessarily one-object, one-task.
|
| The temp file thing was just an example, actual use case was
| more domain-specific.
| tomjakubowski wrote:
| If it's only a temp file (as opposed to a dir) and it's
| Linux, you may want to use O_TMPFILE (possibly O_EXCL too)
| when open()ing the file instead of the RAII wrapper. That
| ensures the OS will delete the file even if the program is
| killed.
| notriddle wrote:
| https://docs.rs/tempfile/3.3.0/src/tempfile/file/imp/unix.r
| s...
| revskill wrote:
| This concept is almost the same as
|
| trx = new DbTransaction();
|
| ...
|
| trx.commit() or trx.rollback();
|
| It's easy to be missed, but it's hard to detect without manually
| remembering where to put it in the correct place.
| deathanatos wrote:
| It's the same concept.
|
| RAII can (and I have seen DB libraries that _do_ ) model a DB
| transaction as a resource. In the cases that I know of, they
| take the safe default of rolling the transaction back if the
| code fails to commit it. But the transaction is never leaked
| nor is the connection left in some unspecified state.
| car_analogy wrote:
| > But the problem with most garbage collectors is that they only
| clean up memory!
|
| Is this an inherent limitation of garbage collectors, or could
| they be made to clean up arbitrary resources?
| fancyfredbot wrote:
| Actually he later goes on to say the garbage collector can
| clean other resources by using the finalise method. The problem
| is that it doesn't necessarily clean them up at the right time.
| The garbage collector isn't triggered by having too many open
| file descriptors, it runs when you are low on memory. In theory
| it could look at other resources too, but for certain resources
| you don't want to wait at all.
| arcticbull wrote:
| There's also generally no guarantee that finalizers ever run,
| period.
| Sesse__ wrote:
| It is obvious if you think of what a garbage collector is; it's
| a simulation of having infinite memory. So objects in a GC
| world live forever, and thus are never destroyed. (Of course,
| behind the scenes, the garbage collector will _deallocate_
| them, but only when your program can no longer care because it
| doesn't have any way of reaching them.)
|
| Of course, a garbage collector can be made to run finalizers,
| but then you either have to do a pure refcounting GC, at least
| for those objects (which is highly suboptimal for speed), or
| live with unpredictable finalization (which is incompatible
| with using RAII for e.g. locks, as the article points out).
| duped wrote:
| The problem with RAII is that it is semantically coupled to
| destruction in most (all?) of the languages it is implemented in,
| which is tied to lexical scope.
|
| In GC'd languages where lifetime is not meaningfully coupled to
| scope, RAII cannot work the way you expect it to. The most
| trivial way to handle it is to couple to finalizers, which is
| tricky for two reasons: firstly it is not deterministic _when_
| (or if, in case of an exception /fatal error) the finalizer runs,
| and secondly it means the GC must keep track of user defined
| finalizers to execute on collection (what if they trigger an
| error? What if they take a long time? What if they block?).
|
| The solution is to find some way to tie RAII to lifetimes, which
| may require exit analysis and can be extremely difficult, if not
| impossible, to implement in the language. The "least bad" way to
| deal with the problem is defer-like semantics.
|
| I'm not a huge fan of defer, but it's important to understand
| _why_ it exists. It 's hardly "wrong by default" and more
| "understandably different."
| ComputerGuru wrote:
| This is only technically true, in practice GC languages have
| workarounds. C# is almost entirely GC but has the IDisposable
| pattern and using statements to make up for this.
| kevincox wrote:
| > which is tied to lexical scope
|
| This isn't really true. It is kinda true in C++ but fairly well
| fixed with moves. In Rust it isn't true at all because moves
| are the default so any time you return a value or put it in a
| data structure the lifetime is no longer lexically bound.
|
| > It's hardly "wrong by default" and more "understandably
| different."
|
| I think it is both. What I mean by "wrong by default" is that
| if you write f = open(name)
|
| The code is now wrong. This is the default behaviour. To make
| the code corred you need to add a second bit of code (a close
| call). Defer is a powerful mechanism to do that, but it is a
| tool to solve the core wrongness.
| masklinn wrote:
| Note that they said the destruction is tied to lexical scope,
| in general, not necessarily the lexical scope of creation.
|
| Their point is mostly that once you "drop" an object in a
| GC'd language, it will be collected at some later time
| independent of lexical scope.
|
| Which I don't think is much of an issue, technically the
| language could "drop" (close) such objects lexically and
| leave memory reclamation for later. Not a huge issue.
|
| The bigger issue is that GC-based objects have a strong
| tendency to share, a lot. That makes RAII way more
| challenging, when using the objects directly for starters,
| but even more so when composing RAII objects and non-RAII
| objects.
| marcosdumay wrote:
| On both C++ and Rust the destruction is tied to the value's
| lifetime. On the two languages that everybody remember when
| they talk about RAII, the destruction is not tied to
| lexical scope.
|
| (Although, on C++ that requires extra code for your class
| to behave correctly. The resource managing class is wrong
| by default, but the code that uses it is not.)
| tialaramex wrote:
| In Rust what you wrote is just how it works because it's
| how _move_ was defined in the language, but I think in
| C++ at most you could say that this is how people
| conceptually try to arrange for things to happen, and at
| that point we 're back to "Wrong by Default".
|
| All the C++ compiler actually gives you is destruction at
| end of lexical scope, and then it's _your job_ as
| programmer to do what people actually wanted instead.
|
| Historically what you do in C++ is copy things _a lot_ ,
| or work with pointers. Modern C++ offers a move, but for
| backwards compatibility reasons it had to preserve that
| behaviour where the destruction happens precisely when
| you leave scope, so, move "hollows out" objects, leaving
| something which can be destroyed later.
|
| Thus it's your job, when implementing a C++ type, to
| ensure that the "real" work happens whenever a programmer
| would expect, leaving just some shell to be properly
| "destroyed" later when scope ends. Whereas for Rust you
| just implement Drop correctly and the compiler arranges
| for it to happen promptly.
| MauranKilom wrote:
| > All the C++ compiler actually gives you is destruction
| at end of lexical scope, and then it's your job as
| programmer to do what people actually wanted instead.
|
| `new` and `delete` are built into the language (and smart
| pointers into the standard library). The statement that
| object lifetime is tied to lexical scope is plainly
| untrue.
|
| > Thus it's your job, when implementing a C++ type, to
| ensure that the "real" work happens whenever a programmer
| would expect
|
| The auto-generated (`= default`) copy and move
| constructors do the right thing by default if you compose
| your type from other (correct) types. Yes, in the rare
| case that you write a type that has to implement RAII,
| the compiler can do the wrong thing by default ("copy
| just copies the pointer"), but then you write a wrapper
| class _once_ and you 're done. If that's even necessary
| at all, since unique_ptr can have custom deleters. (You
| wouldn't violate Single Responsibility by messing with
| RAII in a class that also does other things, right?)
___________________________________________________________________
(page generated 2022-05-16 23:00 UTC)