[HN Gopher] Don't defer Close() on writable files (2017)
___________________________________________________________________
Don't defer Close() on writable files (2017)
Author : misonic
Score : 227 points
Date : 2024-09-08 09:36 UTC (2 days ago)
(HTM) web link (www.joeshaw.org)
(TXT) w3m dump (www.joeshaw.org)
| K0nserv wrote:
| Rust has the same problem. Files are closed in `Drop` when the
| value goes out of scope, but all errors are silently ignored. To
| solve this there's `sync_all`[0].
|
| Generally, relying on defer in Go or Drop in Rust for anything
| that can fail seems like an anti-pattern to me.
|
| 0: https://doc.rust-
| lang.org/std/fs/struct.File.html#method.syn...
| kzrdude wrote:
| An ownership consuming close(self) would make sense, but has
| not been added, there must be some good reason for that?
| masklinn wrote:
| That nobody has gone through the effort of collating its
| requirements and writing an RFC after
| https://github.com/rust-lang/rfcs/pull/770 was closed (back
| in 2015).
|
| I assume a big issue is that this is full of edge cases up
| the ass, and the value is somewhat limited in the sense that
| if you know you want durable writes you'll sync() and know
| you're fucked if you get an error, but close() does not
| guarantee a sync to disk, as the linux man page indicates:
|
| > A successful close does not guarantee that the data has
| been successfully saved to disk, as the kernel uses the
| buffer cache to defer writes.
|
| So you'd need a "close", and a "close_sync", and possibly
| also a "close_datasync" (if you're ok with discarding
| metadata). And one could argue at that point `close` has
| essentially no value beyond hopefully getting rid of the fd /
| handle, and drop already does a fine job of that.
| Arch485 wrote:
| IIRC the Rust book talks about files automatically being
| closed when dropped, and how that's better than having a
| close method. That's probably why it's not a separate method,
| even though it suppresses errors.
| tux3 wrote:
| Don't call close() twice if you're not using pidfds, this is
| racy. The fd could be reused in between the two calls. You'll
| risk closing random things a frustratingly small fraction of the
| time, creating very hard bugs for yourself.
| silon42 wrote:
| Surely File.close will clear the fd on success...
| tux3 wrote:
| You're right, looks like it does!
| MartinodF wrote:
| Yeah but is it safe for concurrent use?
| masklinn wrote:
| If you're performing concurrent unprotected closing of an
| fd, you're deep into the wild lands of nonsense already.
| lionkor wrote:
| (pretty much) only channels are safe for concurrent use in
| go
| adrianmsmith wrote:
| Surely this would all go away if Go had an exception handling
| mechanism like most mainstream languages do?
|
| You'd just concentrate on the "happy path", you'd close the file,
| there'd be nothing to forget or write blog posts about because
| the exception would be propagated, without needing to write any
| lines of code.
| iainmerrick wrote:
| You're not wrong, but that ship sailed away a dozen years ago.
| delusional wrote:
| If you've never seen somebody type 'catch (Exception e) {
| logger.log("should never happen", e);}' then sure. In the real
| world people will often explicitly ignore the error, even when
| they are confronted with it.
| devjab wrote:
| Explicit error handling is a choice and implicit error handling
| through exceptions is not necessarily a feature.
|
| Both have advantages and disadvantages, I'd say the more
| "modern" approach actually the opposite to what you state here,
| and is in my opinion the way to Go (pun int intended), though
| it's also how Haskell does it. You'll find the same philosophy
| in Rust, Zig, Swift and others which all build on the previous
| decades of throwing exceptions and how terribly that scales in
| terms of maintainability. Even in the "old world" like with
| Java you have Kotlin which does both.
| saurik wrote:
| I feel like you might be claiming that Haskell does something
| like Go does, but it actually doesn't: it supports monads,
| and so uses a monad to hide the error semantics entirely,
| providing exception-like syntax with automatic propagation.
|
| (I might misunderstand your use of "though", though? It could
| be that you were just noting in passing how Haskell disagrees
| with all of these supposedly-"modern" languages, and instead
| leaned into the sane happy path semantics, thanks to monads.)
|
| (edit: to be clear, though... I do not think exceptions solve
| this. I wrote a comment elsewhere on this thread about the
| semantics issue, but a few other people also wrote similar
| things while I was trying to type my overly-verbose reply
| ;P.)
| foldr wrote:
| It's worth noting that using Either style monadic error
| handling in IO code in Haskell is arguably an antipattern,
| as the Haskell runtime has its own built in exception
| handling (which actually works in a pretty conventional
| way). Some more info on this under 'Exceptions best
| practices ' here:
| https://tech.fpcomplete.com/haskell/tutorial/exceptions/
| xg15 wrote:
| > _Explicit error handling is a choice_
|
| Yeah, it is - a bad choice IMO.
|
| I know the "if err != nil" pattern is spoken up as some sort
| of cultural idiosyncrasy of Go, similar to the whitespace
| formatting in python. But so far, I haven't seen any actual
| data (or even arguments) why it is superior to exceptions, or
| which inherent problems of exceptions it solves. (The
| classical example of "it makes control flow more obvious and
| makes it easier to ensure that mandatory finalizers are not
| skipped" was just disproven by this very article)
|
| So if there is more substantial criticism against exceptions
| than just FUD, I'd like to know it.
| randomdata wrote:
| Well, let's assume Go did commonly make use of exception
| handlers for error cases: defer func() {
| if r := recover(); r != nil {
| fmt.Println("Failed to write file", r) }
| }() f := os.Create("file") defer
| f.Close() io.WriteString(f, "Hello, World!")
|
| Cool. You've solved one problem layer, perhaps. But if you
| look closely you'll notice that code still has a bug!
|
| So clearly exception handlers aren't enough. There might be
| some good ideas in using exception handling, but if you
| strictly limit its use to errors you end up half-assing it.
| Why not go all the way and find a solution that works in
| all cases?
| matt_craig wrote:
| Is it that the defer() that recovers is called _before_
| the defer() that closes (and potentially fails)?
| randomdata wrote:
| No. No rearranging of the code would fix the problem.
| matt_craig wrote:
| Hmm. Checked the docs, looks like os.Create returns *File
| and err, but the sample doesn't check that error.
| randomdata wrote:
| os.Create, per the topic of discussion, is said to throw
| an error using the exception handling mechanism rather
| than return error. The exception handler checks the
| error.
|
| If you focus on errors, you're going to never get it. The
| whole idea here is that the problem is bigger than
| errors.
| xg15 wrote:
| Maybe I'm too dumb too see the bug here.
|
| Do you mean the issue that if both WriteString() _and_
| Close() throw an exception, the one from WriteString()
| will be swallowed?
|
| That used to be a problem, but has been solved in more
| modern implementations using suppressed exception
| tracking.
| randomdata wrote:
| Ha. I suppose you could argue that is also a bug, but not
| the one I was thinking of. Let's say this hypothetical Go
| where error over exception handling is the norm has no
| such issue.
|
| Think more carefully about what we're actually trying to
| solve. It is not just about errors.
| ghosty141 wrote:
| > I haven't seen any actual data (or even arguments) why it
| is superior to exceptions, or which inherent problems of
| exceptions it solves
|
| It's faster. Doing all error handling via exception is not
| viable if you want speed.
|
| Exceptions work well for errors in the sense that these
| rarely happen. But using them for general "this is the bad
| outcome" of an operation that can happen in the hot-path is
| problematic. For example pythons "next()" function which
| raises an exception if the end of the iterator is reached,
| if C++ did this it just wouldn't be used.
|
| So in my opinion exceptions are nice if you want ease of
| use, and explicit error handling (look at rust with their
| question mark operation which makes it pretty easy to do)
| is the way to go for performance.
|
| One advantage of explicit error handling like rust does it
| is: it forces you to handle the error path like other
| business logic. Again, if you write a short script, this is
| annoying and doesn't get you much but if the application is
| very important then doing so is a good thing since it
| forces you to think about where which errors can happen and
| how you handle them. With exceptions its very easy to
| completely forget that a can even happen and thus they get
| ignored and suddenly the program crashes with a unreadable
| error message.
| saurik wrote:
| This is merely an implementation tradeoff: I could
| _trivially_ modify a compiler to support exceptions that
| were as fast as returning error sum types without any
| syntax changes, but it comes at the cost of making all
| non-failing code slower--as you have to litter the
| runtime with checks that exceptions elide--so we don 't
| generally do it... but like, if you make a new language
| and don't go with the _syntax_ of exceptions because you
| like the _implementation_ of sum types, I feel you have
| misunderstood the problem space.
|
| (As for "handling" errors, you should only have a few
| places in the entire codebase which do that... littering
| the entire codebase with opportunities to feel like you
| might could handle an error seems like a mistake as it
| just encourages more handling.)
| gregjor wrote:
| You interpret Go using a different pattern for error
| handling as a criticism of or challenge to exceptions.
| Consider that languages can take different approaches,
| without having to get into "best" or "worse." Every
| approach to error handling has advantages and shortcomings.
|
| The Go designers have explained and talked about this
| decision many times. Some people don't like it and maybe
| choose a different language. No one pushes FUD regarding
| exceptions. I get the impression that if you choose a
| Toyota you would think that your neighbor buying a Honda
| poses a criticism you have to address.
|
| Discussed before many times on HN, such as:
|
| https://news.ycombinator.com/item?id=4159641
|
| Original article:
|
| https://commandcenter.blogspot.com/2012/06/less-is-
| exponenti...
| kaba0 wrote:
| > Every approach to error handling has advantages and
| shortcomings
|
| If under 'every approach' you explicitly exclude go's
| terrible errno syntax sugar, and include exceptions and
| sum types, then yeah. There is zero advantage to go's
| error handling compared to the proper sum typed
| solutions.
| kevindamm wrote:
| Sibling comments have good points but what made it click
| for me was the book "Exceptional C++" where all the
| possible sources of exceptions are pointed out -- it's
| usually twice as many as you would first guess by looking
| at the code, sometimes made more complicated by type-
| converting constructors and operator overloading and RAII.
|
| Then consider that if any code you're calling is not
| exception-safe it makes the task of writing your code to be
| exception safe that much harder.
|
| Then add on top of that the question of responsibility --
| do you handle the exception close to where it gets thrown
| or farther up the call stack? Handling it too close to the
| issue may be missing some context and result in a lot of
| the same exception handling across many areas of the
| codebase. But handling it too far up misses context, too,
| and often leaves the program in an uncertain state where
| it's not clear whether the show must go on or if it's time
| to shut it down. It's not uncommon to see a mix of
| exceptions and returned error values to try and find a
| happy middle ground there.
|
| Java tried to paper over some of these problems with fewer
| gotchas (more explicit separation between resource
| allocation and initialization) and compiler checks that
| exceptions are in the method signature and are always
| handled _somewhere_ up the call stack but this often
| results in handling of very vague exception types at a high
| level. Some developers, uncertain what the right way to
| handle an exception is, and not wanting to crash the
| program, will just silently drop the exception instead!
| Admittedly, you can do this in go-style and C-style error
| handling, too, but at least then it's not as far removed
| from the source of an error because it's annoying to have
| to keep passing an error response through so many function
| signatures.
|
| I used to use exceptions a lot in C++ and Java. This
| changed when I started working at a place where a lot of
| C++ is used but without exceptions. It was ostensibly about
| runtime costs but when seniors were pressed on the issue it
| was clear there were a lot of these other reasons stated
| above, and ultimately about readability of the code (error
| handling being close to error source) and a philosophy of
| failing quickly when assertions fail (because keeping calm
| and carrying on leads to programs running that should have
| died before they could make more of a mess).
|
| I know it's unpopular but, in light of some real problems
| with exceptions, I actually prefer the way Go makes you do
| it. It often encourages doing all the setup in one (or a
| few) function scopes, and it becomes pretty evident from a
| function's signature whether you should expect something to
| go wrong or not. Would I prefer it for a game-dev scripting
| language? no! But for a system language it is better IMHO
| devjab wrote:
| I prefer it because of location. The error handling happens
| where the error occur. It makes large systems much more
| maintainable in an enterprise setting.
|
| It's not like exception handling and throwing around things
| to have them caught later is inherently bad. It's just a
| different philosophy, one that I don't personally like
| anymore. It's down the same alley as things like OOP, SOLID
| or DRY. Things which have good concepts that way too often
| leads to code bases which are incredibly annoying to work
| with. Maybe not for small systems with short life times,
| but for systems where you're going to be the 100th person
| working on something that's been running for 30 years it's
| just nice to not have to play detective. I'd like to put a
| little disclaimer in here, because that isn't inherently a
| consequence of exception handling or any of the other
| concepts but it's just what happens when people work on
| code on a Thursday afternoon after a tough week. The
| simpler less abstract things are made, the easier it'll be
| to unravel, and simple error handling is dealing with the
| errors exactly where they occur.
|
| As others point out, it's not without its disadvantages.
| It's just that in my experience, those are better
| disadvantages than the disadvantages of implicit error
| handling.
| gwd wrote:
| > But so far, I haven't seen any actual data (or even
| arguments)
|
| OK, here's an argument.
|
| - In order to write resilient software, programs must
| handle not only the "happy path" when things succeed, but
| the path where things might fail.
|
| - Thus it is important for developers to 1) be aware of
| which operations may fail fail, and b) think about what the
| program should do in that case.
|
| - Exceptions make it easier for the programmer to forget
| that something might fail, and to avoid thinking about what
| to do if it does fail.
|
| - Go's error handling idiom makes it clear that an
| operation might fail, and prompts programmers to think
| about what to do in that case. (They may of course choose
| not to think about it, but at least they made a conscious
| choice at some level.)
|
| Thus Go's error handling idiom nudges developers towards
| more resilient software than exception-based workflows.
|
| Or to put it differently: Programming systems which may
| fail simply _is_ ugly: there are an exponential number of
| ways a system may fail, and each one must be handled
| correctly. Exceptions hide this ugliness, but by doing so
| make it more likely that there will be cases not handled
| correctly. By exposing this ugliness, Go makes it more
| likely that most cases will be handled correctly.
| kaba0 wrote:
| > In order to write resilient software, programs must
| handle not only the "happy path" when things succeed, but
| the path where things might fail
|
| And exceptions let you handle error conditions without
| making the actual business logic harder to read, with as
| little or much specificity as required.
|
| > Thus it is important for developers to 1) be aware of
| which operations may fail fail, and b) think about what
| the program should do in that case
|
| Checked exceptions/effect types exist, being explicit or
| implicit in function signatures is not a fundamental
| property of exceptions.
|
| And what is clearer in terms of error handling -- if err
| being every third line, with questionable handling logic,
| e.g. just printing or swallowing stuff (or _gestures at
| the article_ ), and definite human error from repetition
| ---- or a well-defined block with proper scoping, without
| which the error case does the only reasonable thing --
| automatically bubbles up, making it possible to handle
| higher up. There is often no immediate action that can be
| done in certain _exceptional_ situations, e.g. your
| ordinary function that writes a file can't do anything
| about a full disc. The best it can do is to yell, so that
| the action that called it somewhere can do some evasive
| action, e.g. re-trying /notifying the user/etc.
|
| > Exceptions make it easier for the programmer to forget
| that something might fail, and to avoid thinking about
| what to do if it does fail.
|
| Disagree. If anything, something not being in a try-catch
| block says that it will be handled higher up (or checked
| exceptions making it part of the signature), and when
| it's surrounded by it, I know what is the happy path, and
| unhappy path immediately, without it being crossed over
| (usually badly), as it would happen with if errs.
|
| > Go's error handling idiom makes it clear that an
| operation might fail
|
| What about the case when it both returns a value and an
| error?
|
| > and prompts programmers to think about what to do in
| that case
|
| Blindly if erring and printing out a random string is not
| error handling. That's just noise, and a terrible trap
| for yourself, having to grep for useless error codes
| later on.
| pjmlp wrote:
| The modern approach of Rust and Swift is basically checked
| exceptions done with a different painting.
| kaba0 wrote:
| Go is doing the worst thing possible. It is neither
| expressive enough for proper sum types, nor does it have
| expressions (that are analogous to sum types with good
| defaults and syntactic sugar).
|
| It is literally C's shitty `errno` with syntactic sugar.
| randomdata wrote:
| Go does indeed has an exception handling system like most other
| languages. Errors and exceptions are very different things,
| though.
|
| Of course, nothing stops you from building your own file
| handing package that overloads exception handlers to deal with
| errors. If it gains traction then it would prove the stdlib
| should consider a v2 API.
|
| But that you already haven't done so is telling...
| peterashford wrote:
| You mean panics? They don't work across go routines so
| they're limited and hardly exception handling like most other
| languages
| randomdata wrote:
| Which languages see exception handlers work across go
| routines?
|
| Anyway, most don't. There is no difference from exception
| handlers in most other languages.
|
| The syntax is a little different. Is that where you've
| become confused?
| rerdavies wrote:
| Languages in which try/catch/throw work across coroutine
| boundaries: Java, Javascript, and it's a standard feature
| in C++ coroutine libraries (and completely supported by
| the core C++ coroutine engine). So in my limited personal
| experience, among the languages that I am familiar with
| ... all of them except go.
|
| Are there actually ANY languages other than go that have
| coroutines, and try/catch/throw mechanisms, where you
| cannot throw across a coroutine boundary?
|
| And why would exception handlers NOT work across
| coroutine boundaries, other than laziness on the part of
| implementers?
| randomdata wrote:
| _> Languages in which try /catch/throw work across
| coroutine boundaries_
|
| You wha...? The question was about _goroutines_ , not
| coroutines.
|
| Besides, you'll notice that exception handlers cross
| coroutine boundaries in Go just fine. Your random tangent
| isn't even correct. Where did you dream up this idea to
| the contrary? I know coroutines are still new to Go, only
| officially landing in the latest release (experimentally
| in 1.22), but you'd think that would also mean their
| behaviour is fresh in your memory.
|
| I'll take your avoidance of the original question to mean
| that no other language does it either.
| xg15 wrote:
| "You can always make your own" is missing the point. Of
| course it's useless if a single library implements error
| handling differently than the rest of the language. The
| question is why the language does have this kind of manual
| error handling as a standard in the first kind.
|
| > _If it gains traction then it would prove the stdlib should
| consider a v2 API._
|
| Some library that behaves completely differently from the
| rest of the language and breaks all interop with the rest of
| the ecosystem will have a hard time gaining traction, no
| matter if the way the library does it is objectively better
| or not.
| randomdata wrote:
| _> The question is why the language does have this kind of
| manual error handling as a standard in the first kind._
|
| Probably for the same reason Rust does, and why it suffers
| much the same problem:
|
| 1. It is what was in vogue in the 2010s.
|
| 2. More importantly, the problem isn't limited to errors.
| What have you gained treating errors as some hyper special
| case when they aren't any different than any other value?
|
| I think we agree that we can do better, but seeing errors
| as special doesn't get you there. We need something that
| understands the all-encompassing problem.
|
| So, failing that understanding, if you're going to do
| something that sucks, you may as well choose the least-
| sucky option, surely? Exception handling brings a horrible
| developer experience. To the point that in languages where
| errors over exception handling semantics are the norm, you
| will find that most developers simply give up on error
| handling entirely because it is so painful.
|
| _> Some library that behaves completely differently from
| the rest of the language and breaks all interop with the
| rest of the ecosystem will have a hard time gaining
| traction_
|
| I'm not sure history agrees. Ruby was also of the return
| values over exception handling mind before Rails came
| along. Rails pushed exception handlers for errors and
| developers went for it. Provide an API people actually want
| to use, and they'll use it. What was common before is
| inconsequential.
|
| I expect what you are really saying is that exception
| handling wouldn't actually improve this example case even
| in the best case, and in the worst case developers would
| end up giving up on error handling leaving such a package
| to be a net negative to a codebase.
| rerdavies wrote:
| There's a self-selection problem though. People who prefer
| error handling through exceptions are just not going to use
| go. Period. Will not use it. Been there done that. No no no.
|
| So it is telling. But I think what it actually tells is that
| people would have done it just use another language instead.
| randomdata wrote:
| As they should. Scripting tasks are, indeed, best performed
| in scripting languages.
| tonyedgecombe wrote:
| Most of the other options don't include native code
| compilation though. I suppose Swift is there but doesn't
| make much sense outside of the Apple ecosystem.
| cyco130 wrote:
| This problem isn't solved with exceptions either. The problem
| is that finalizers (C++ destructors, Java's `finally` blocks,
| Go's `defer` etc.) shouldn't fail but `close()` _can_ fail.
| Therefore, for 100% correctness, `close()` calls should be
| handled explicitly and not left to finalizers.
|
| Finalizers shouldn't fail because they might be executed while
| another exception is already in flight. Three languages have
| three different behaviors when that happens but in my opinion
| they all do the wrong thing:
|
| In C++, if you throw from a destructor while another exception
| is in flight, the program will be terminated. In Java, throwing
| from a `finally` block will "forget" the original exception. In
| Go (according to this article, I'm not familiar with it), error
| from `defer` will be ignored. None of these are ideal.
| pansa2 wrote:
| > _in my opinion they all do the wrong thing_
|
| What would be the right thing? Combining the original
| exception and the error from `close` into some kind of
| `MultipleError`?
| saurik wrote:
| Personally I do not believe the math of these two monads
| allows for any better solutions (and I do not believe
| multierror is correct ;P)... I am thereby also very curious
| what they think the correct thing to do here is.
| cyco130 wrote:
| That's probably the best option I think. I've heard Ada
| does that (but don't quote me on that). If you can access
| the original errors from the `MultipleError` object, at
| least you can tell the user what exactly went wrong.
|
| I don't thing there's one true right thing(tm) though.
| That's why explicit handling is necessary: The compiler
| doesn't have enough context to handle it for you. The
| programmer needs to decide what's the right way to handle
| it.
| Almondsetat wrote:
| What about Java's try-with-resources?
| cyco130 wrote:
| My Java knowledge is at least 10 years out of date so I'm
| not familiar with try-with-resources. But from what I can
| gather from Google, it looks like they now provide a way to
| access the suppressed exceptions, which is probably a step
| in the right direction.
| shawnz wrote:
| The problem is handled with exceptions plus Java's try-with-
| resources or C#'s using statements or Python's context
| managers though, right?
|
| Furthermore in Java since version 7 you can actually see both
| exceptions with the suppressed exceptions pattern.
| cyco130 wrote:
| See my reply to the sibling comment about try-with-
| resources. I'm not familiar with any of these mechanisms
| but giving access to suppressed exceptions is probably a
| step in the right direction.
| layer8 wrote:
| Java with try-with-resources does the correct thing: It
| attaches the new exception as a secondary exception to the
| currently in-flight exception.
|
| Since function calls form a tree, exceptions must form a tree
| as well.
|
| Doing this automatically is also one of the killer arguments
| for exceptions over error codes, IMO.
| dwattttt wrote:
| How automatic it is depends on the language, but error
| "objects" (rather than codes) can do this too. It is pretty
| great.
| layer8 wrote:
| Yes, with enough syntactic sugar it can become
| equivalent. Exceptions can be viewed as sum types with
| special syntactic sugar in conjunction with the regular
| return types, and can in principle be implemented as
| such.
|
| When I say "exceptions", I mean the source-level
| semantics, not how it's implemented behind the scenes.
| kokada wrote:
| Go also has `errors.Join` since 1.20, and one of its uses
| is exactly this one: to be used in deferred close that can
| possible raise errors so the errors can stack each other:
| https://stackoverflow.com/a/78013962.
|
| > Doing this automatically is also one of the killer
| arguments for exceptions over error codes, IMO.
|
| Definitely doing this automatically is better than relying
| on the programmer to do this manually, but I wouldn't say
| this is the "killer" argument for exceptions over errors
| codes, because this doesn't add anything new to the
| argument of exceptions vs explicit error handling.
| neonsunset wrote:
| It might fail, but usually doesn't. In C#, to handle an e.g.
| file open failure, you can just write try {
| using var file = File.OpenWrite("test.txt");
| file.Write("Hello, World!"u8); } catch
| (IOException e) { Console.WriteLine(e.Message);
| }
|
| The exception handler is an enclosing scope for both file
| open and dispose (flush and close) operations. You can also
| hoist file variable to an outer scope to, for example, decide
| what to do if dispose throws for some reason. In practical
| terms, this is as much of an edge case as it gets, but can be
| expressed in a fairly straightforward way.
| lostmsu wrote:
| I don't think this does the right thing. You are catching
| exception if Close fails and nothing else. The problem
| thought is what to do when after file is opened something
| fails first, and then Close also fails.
| neonsunset wrote:
| I'm catching any exception that occurs within the scope
| of try block.
|
| This includes an exception when trying to open or write
| to the file. If I fail to write to a file, it will try to
| dispose the stream, which flushes it and closes the file
| handle. If disposing the file handle itself fails, which
| should never happen, the exception will occur in the
| finally block, which this exception handler catches too.
| If you need to disambiguate and handle each case
| differently, which is rarely needed, you can order try-
| catch-finally blocks differently with explicit dispose
| and different nesting. This, again, is not a practical
| scenario and most user code will just `using file =
| File.OpenWrite` it and let the exception bubble up.
| lostmsu wrote:
| Yes, and this ignoring of the original exception is the
| core of the problem discussed. If you are willing to lose
| supposedly written data, your approach is golden.
| neonsunset wrote:
| As said in the previous comment, you can place a variable
| in an outside scope and assign to it from within an try-
| catch block to handle an open file stream in a particular
| way upon failure. You can simply write more code to
| disambiguate and specifically handle errors from separate
| parts of the execution flow. In any case closing a file
| handle should never fail, but if it does - there are
| tools to deal with this.
| lostmsu wrote:
| Well the point it: the problem is not as simple in C# as
| your initial snippet suggests.
| kaba0 wrote:
| With exceptions you don't silently ignore error and go on
| as if nothing happened.
|
| Just simply not explicitly handling every single possible
| error is the correct choice in many scenarios - in which
| case it bubbles up to a general error handler, e.g.
| telling the user that something bad happened here and
| here.
| ekimekim wrote:
| Python handles this case by raising the new error but
| including a reference to the original error. By default, the
| formatted error shows both: >>> mylist = []
| >>> try: ... first = mylist[0] ...
| finally: ... inverse_length = 1.0 / len(mylist) #
| imagine this was something more complex ...
| Traceback (most recent call last): File "<stdin>",
| line 2, in <module> IndexError: list index out of
| range During handling of the above
| exception, another exception occurred:
| Traceback (most recent call last): File "<stdin>",
| line 4, in <module> ZeroDivisionError: float division
| by zero
| immibis wrote:
| Java also does this.
| louthy wrote:
| > Surely this would all go away if Go had an exception handling
| mechanism like most mainstream languages do?
|
| Or monads, but that might be a step too far for the Go world
| considering the push-back against generics. When you have
| declarative error handling (a good thing imho) then monads
| really are the bees knees.
| thiht wrote:
| Exceptions are a terrible error handling mechanism. You have no
| idea what throws and what doesn't, it's impossible to write
| defensive code that makes sense with exceptions. Errors as
| value is the only sane way to deal with errors. Granted, Go
| does it pretty badly but it's still infinitely better than
| exceptions.
| adrianmsmith wrote:
| > You have no idea what throws and what doesn't
|
| The answer here is that everything throws.
|
| Any code can have a Null/Nil dereference error, any code can
| use an array and generate an out-of-bounds exception, etc.
| skitter wrote:
| When there's a null pointer dereference, that's a bug in
| the program; when closing a file fails, that's something
| external the program must handle.
| DarkNova6 wrote:
| > You have no idea what throws and what doesn't
|
| You do in Java. It's called Checked Exceptions. It's a
| binding API contract and communicates this well not only when
| writing the code, but also when reviewing it.
| masklinn wrote:
| There are more wrinkles with this:
|
| - if you are _creating_ a file, to ensure full synchronisation
| you also need to fsync the parent directory, otherwise the file
| can be fsynced but the update to the directory lost
|
| - if sync fails, you can not assume anything about the file,
| whether on-disk _or in memory_ , critically one understanding
| which got dubbed "fsyncgate" and lead to many RDBMS having to be
| updated is that you can not portably _retry_ fsync after failure:
| the earlier error may have invalidated the IO buffers _but_ IO
| errors may not be sticky, so a later fsync will have nothing to
| write and report success
| dist-epoch wrote:
| So if I do fopen/fwrite/fsync/fclose, that is not enough? That
| is crazy, I think 90% of apps don't fsync the parent directory.
| Also, how many levels of parents do you need to fsync?
| masklinn wrote:
| > So if I do fopen/fwrite/fsync/fclose, that is not enough?
|
| That is my understanding.
|
| > Also, how many levels of parents do you need to fsync?
|
| Only one, at least if you didn't create the parent directory
| (if you did then you might have to fsync its parent,
| recursively). The fsync on the parent directory ensures the
| dir entry for your new file is flushed to disk.
| magicalhippo wrote:
| I've never heard about this in my years of programming. I
| just tried to read through the Win32 documentation, as I've
| done several times over the years, and it mentions a lot of
| edge cases but not this that I could see.
|
| Is this some Linux/Unix specific thing? Am I blind?
| masklinn wrote:
| I am talking about posix semantics yes, I have no idea
| how things work on windows.
| magicalhippo wrote:
| Phew. I've primarily used Windows. Not that any of the
| posix programs I've been exposed to have done the dir
| sync though.
|
| For cross-platform stuff I've mainly used Boost, which I
| assumed handled such details.
| guappa wrote:
| I'm sure it does not.
|
| Also these things are needed very very rarely (which is
| why few even know about the issue) and are not good for
| performance and battery life.
| account42 wrote:
| More specifically, these things are for trying to improve
| the behavior on unclean system shutdown (e.g. power loss)
| which is inherently chaotic and unless all parts (most
| critically the disk and its controller) are well behaved
| you don't have any real guarantees anyway.
|
| Windows also doesn't guarantee that data is written to
| disk by the time WriteFile/CloseHandle returns, the
| Windows version of fsync is FlushFileBuffers.
| red_admiral wrote:
| There's the old saying that on Windows*, files have
| names; in POSIX, names have files. I think that's what
| makes the difference here.
|
| * technically it's the filesystem as much as the OS that
| is relevant here.
| pixl97 wrote:
| * a very limited set of names in relation to what's
| possible on other operating systems.
|
| If your processing files from other systems on NTFS
| you'll very likely have rename said files in an
| application and store an index of the names.
| red_admiral wrote:
| As far as I know, it's specific to the combination of
| certain POSIX-ish OS and file systems, like linux/Ext3. I
| have no clue what BSD does here, or whether ReiserFS is
| different.
|
| Windows/NTFS is a different world, there are still edge
| cases that can go wrong but I don't think this particular
| one is a problem because FAT/NTFS is not inode-based.
|
| I imagine if you looked at the SQLite source code you'd
| see different edge-case-handling code for different OSes.
| pixl97 wrote:
| Ya NTFS file naming and long path is a wreck. Unzipping
| files from mac/linux is a easy way to end up with missing
| data. Applications quite often break on long file paths
| especially Microsofts own stuff like powershell.
| cryptonector wrote:
| NTFS has inodes.
|
| The thing about Windows is that because the file open
| operation (`CreateFile*()`) by default prevents renames
| of the file, Windows apps have come to not depend so much
| on file renaming, which makes one of the biggest
| filesystem power failure hazards less of an issue on
| Windows. But not being able to rename files over others
| as easily as in POSIX really sucks. And this doesn't
| completely absolve Windows app devs of having to think
| about power failure recovery! "POSIX semantics" is often
| short-hand for shortcomings of POSIX that happen to also
| be there in the WIN32 APIs, such as the lack of a
| filesystem write barrier, file stat-like system calls
| that mix different kinds of metadata (which sucks for
| distributed filesystems protocols), and so on. And yes,
| you can open files in Windows such that rename-over is
| allowed, so you still have this problem.
| jiggawatts wrote:
| Crazy is the right term. File system APIs in general have too
| many sharp edges and need a ground-up rethink.
|
| Consider S3-like protocols: these recognise that 99% of the
| time applications just want "create file with given contents"
| or "read back what they've previously written."
|
| The edge cases should be off the beaten path, not in your way
| tripping you up to when you want the simple scenario.
| lionkor wrote:
| It sounds more like you are asking for an abstraction
| lmz wrote:
| The file system is already an abstraction. I think they
| are asking if it's the right one.
| Cthulhu_ wrote:
| Aren't the edge cases features? An abstraction (or
| different API, sure) is in order to prevent footguns.
| However, this abstraction should not force fsyncs for
| example, due to the performance impact mentioned. It puts
| the choice of guaranteed writes vs performance to the
| developer.
| aseipp wrote:
| > Aren't the edge cases features?
|
| What features do you have in mind?
|
| > It puts the choice of guaranteed writes vs performance
| to the developer.
|
| Yes, and it's a completely false choice. This entire
| point of this thread is that fsync is an incredibly
| difficult API to use in a way that gets you the
| guarantees you need ("don't lose the writes to this
| file"). And that the consistency guarantees of specific
| filesystems, VFS, POSIX, and their interactions are not
| easy to understand even for the experienced -- and it can
| be catastrophic to get wrong.
|
| It isn't actually a choice between "Speed vs
| correctness". That's a nice fairy tale where people get
| to pretend they know what they're up against and everyone
| has full information. Most programmers aren't going to
| have the attention to get this right, even good ones. So
| then it's just "99.9% chance you fucked it up and it's
| wrong" and then your users are recovering data from
| backups.
| josephg wrote:
| A better abstraction, designed from the ground up,
| wouldn't force fsyncs to work.
|
| For example, write groups or barriers (like memory
| barriers) would be wonderful. Or a transaction api, or io
| completion ports like on windows.
|
| In a database (and any other software designed for
| resiliency), you want the file contents to transition
| cleanly from state A to B to C, with no chance to end up
| in some intermediate state in the case of power loss. And
| you want to be notified when the data has durably
| written. It's unnecessarily difficult to write code that
| does that on top of POSIX in an efficient way. Most code
| that interacts with files is either slow, wrong or both.
| All because the api is bad.
| TickleSteve wrote:
| Is the filesystem the correct abstraction? For most
| applications, a database-like API is more appropriate,
| hence SQLite.
| cryptonector wrote:
| The filesystem is very easy to use for simple things by
| comparison to a DB, and it's more accessible from the
| shell. But you're right, the filesystem is _very_
| difficult to use in a power failure safe way. SQLite3 has
| great power failure recovery testing, so the advice to
| use SQLite3 for any but the simplest things is pretty
| good.
|
| It'd be very nice to get some sort of async filesystem
| write barrier API. Something like `int fbarrier(int fd)`
| such that all writes anywhere in the filesystem will be
| sync'ed later when you `fsync()` that fd.
|
| It would also be very nice to have an async
| `sync()`/`fsync()`. That may sound oxymoronic, but it's
| not. An async `sync()`/`fsync()` would schedule and even
| start the sync and then provide a completion notice so
| the application can do other work while the sync happens
| in the background. One can do sync operations in worker
| threads and then report completion, but it'd be nice to
| have this be a first class operation. Really, every
| system call that does "I/O" or is or can be slow should
| be / have been designed to be async-capable.
| formerly_proven wrote:
| One of the advantages of using a database for files is that
| it's relatively more likely that these platform-dependent
| considerations were indeed considered and you don't have to
| moonlight as an DBMS engineer while writing your application.
| EasyMark wrote:
| One reason Iike use sqlite so much in my personal projects,
| sometimes just using it to store various config data bits
| within row entries of the most generic table ever, works
| well with small projects, probably not so much with large
| scaled ones.
| red_admiral wrote:
| Hence the saying, "SQLite competes with fopen()".
| layer8 wrote:
| Yes. File system implementations never really thought this
| through, and hence here we are.
| everforward wrote:
| That should be only for creating files, and maybe updating
| their metadata (not sure about that one).
|
| The confusion stems from people thinking that files and
| directories are more different than they are. Both are
| inodes, and both are basically containers for data. File
| inodes are containers for actual data, while directory inodes
| are containers for other inodes.
|
| All inodes need to be fsynced when you write to them. For
| files this is obviously when you write data to them. For
| directories, this is any time you change the inodes they
| contain, since you're effectively writing data to them.
|
| You only need to sync the direct parent because the
| containers aren't transitive; the grandparent directory only
| stores a reference to the parent directory, not to the files
| within the parent directory. It's basically a graph.
| xdavidliu wrote:
| > It's basically a graph.
|
| do you mean it's a basically a tree? Because if it were
| just a graph, you could still have edges from the
| grandparent to the grandchild in addition to the one from
| the former to the child
| red_admiral wrote:
| This ... depends. In normal POSIX land, hard links break
| the tree structure, for one, so you get a DAG but not a
| tree. I think some file systems do enforce tree
| structure, though - hard links are not supported
| everywhere.
|
| It used to be possible ages ago to hard link to
| directories, which meant that you could have actual
| cycles and a recursive tree-walking algorithm would never
| terminated. (As far as I know you can still do this by
| editing the disk manually, although I think fsck will
| make a fuss if it detects this.)
|
| You can still, with the right syscalls and drivers, do
| something like hard links on NTFS (I think they're
| technically called mount points but it's not the same
| thing as POSIX ones). I'm not sure if you can still make
| directory cycles, and you're probably a bad person if you
| do.
| account42 wrote:
| Logically, ".." is a hard link to the parent directory.
| This may or may not actually be the case in the
| filesystem on the disk.
| monocasa wrote:
| > , hard links break the tree structure, for one, so you
| get a DAG but not a tree.
|
| More than DAGs, but instead pretty arbitrary graphs since
| you can express cycles with hard links.
| RenThraysk wrote:
| In the case of atomic file writing, the rename() doesn't
| cause a fsync of the parent directory?
| holowoodman wrote:
| Depends on the file system. In NFS, yes. In others, maybe
| not.
| altfredd wrote:
| > rename() doesn't cause a fsync of the parent directory?
|
| It does not.
|
| At best it will schedule a journal commit asynchronously
| (I recall that ext4 maintainer complained about adding
| this "workaround for buggy user code" on lkml). If you
| want to receive an IO error when renaming fails, make
| sure to call fsync() yourself.
| guappa wrote:
| > while directory inodes are containers for other inodes.
|
| Uh? No.
|
| You can imagine them as a list of name,inode tuples.
| eyelidlessness wrote:
| I'm not well versed enough on the subject to know if
| there's important nuance here that I'm missing... but
| that doesn't sound like a contradiction to me. That's
| effectively a map with O(n) access, right? I think I
| would more generally refer to that as a "collection", but
| it certainly _contains inodes_ as you describe it.
| flotzam wrote:
| IMO it's better to think of a directory as "containing"
| only the _named references to_ inodes. Other directories
| on the same filesystem may contain other named references
| to the same inodes ( "hard links").
|
| The inodes themselves are more like free floating
| anonymous objects independent of any particular
| directory, and might not have a named reference at all
| (O_TMPFILE; or the named reference was deleted but a file
| descriptor is still open; or orphaned inodes due to
| filesystem corruption or someone forgetting to fsync the
| directory after creating a file as masklinn pointed out -
| e2fsck will link these in /lost+found). This is also why
| chmod appears to affect other hard links in different
| directories: Because it actually modifies the inode (file
| permissions are recorded in _that_ ), not the named
| reference.
| eyelidlessness wrote:
| This definitely seems like a more meaningful distinction
| to me. It's also closer to the mental model I had coming
| into the discussion, FWIW.
|
| And I think it makes the collection/container terminology
| distinction sharper too. Depending on context, I think
| it's usually _reasonable_ (if imprecise) to describe a
| bucket of references or pointers to things as a
| collection of those things. But I don't think it makes as
| much sense to call it a container of those things, except
| in a really abstract sense.
| anon-3988 wrote:
| Is anything ever enough aside from powering down the PC and
| then reading it back again to make sure its there?
| kccqzy wrote:
| You don't need even to bother with fsync unless you are
| developing a database or use the file system like a database.
|
| There's a reason Apple made fsync useless and introduced
| F_FULLSYNC. They know developers have an incentive to
| overestimate the importance of their own files at the
| detriment to system responsiveness and power draw.
| account42 wrote:
| Agreed, things like Firefox constantly fsyncing its profile
| database is absurd.
| o11c wrote:
| The problem is, Apple's fsync doesn't even introduce any
| useful semantics. All it says is "write this eventually".
|
| What a lot of people really need is just a "provide
| ordering, don't leave me with inconsistent data" operation.
| If you lose power "after" a write but before the fsync, for
| any non-networked application that's no different than
| losing power before the write, as long as the filesystem
| doesn't introduce chaos.
| Verdex wrote:
| Great another case of simple thing everyone knows is simple
| but turns out to be horrifyingly complicated and heavily
| system dependent, but only on occasion so you can get 80% of
| the way through your career before encountering the gaps in
| your knowledge.
|
| I guess I'll add it to the list.
|
| Of course on the other hand, I was already thinking that I
| should just use SQLite for all my file handling needs. This
| little nugget makes me think that this was the correct
| intuition. [Queue horrifying revelations w.r.t. SQLite here.]
| garrettr_ wrote:
| w.r.t SQLite, the only horrifying revelation I've had is
| that it allows NULLs in composite primary keys, which I've
| seen lead to some nasty bugs in practice.
| cryptonector wrote:
| Yes, you have to declare PRIMARY KEY columns as NOT NULL.
| There's lots of little caveats like this about SQLite3.
| So what.
| cryptonector wrote:
| You need to `fflush()` too, before the `fsync()`.
| praptak wrote:
| > otherwise the file can be fsynced but the update to the
| directory lost
|
| It also goes the other way - the update to the directory can be
| fsynced but the file lost. This can break the "create temp
| file, write, close, rename to current" scenario (when the
| intention is to replace file contents atomically).
|
| POSIX doesn't guarantee the order in which data hits the disk,
| so the above scenario can become "create temp file, write
| [contents still in memory only], rename to current [written to
| disk], power failure".
|
| I believe there was a bug where this scenario had worked for a
| long time then one filesystem (ext4?) pushed closer to what is
| admissible under POSIX (non-obvious reorders of physical
| writes) and people started getting random data corruption in
| programs which used write & rename.
| eatonphil wrote:
| Can anyone provide links to learn more about these
| misconceptions and the bugs they caused?
|
| Edit: talking about filesystem misconceptions, not fsyncgate.
| sgarland wrote:
| EDIT: realized you were asking about the ext4 bugs, not the
| parent comment of that one. Oh well, keeping this up anyway
| for others.
|
| Postgres has a whole wiki page [0] about it, it's quite a
| read. They also link a [1] MySQL commit to fix the same
| issue.
|
| [0]: https://wiki.postgresql.org/wiki/Fsync_Errors
|
| [1]: https://github.com/mysql/mysql-
| server/commit/8590c8e12a3374e...
| praptak wrote:
| I have found a blog post from 2009 about this issue[1]. I
| think the "recent Ubuntu bug [that] has gotten slashdotted"
| is [2], so I think the most technical discussion about this
| would be [2].
|
| I also found an interesting "scar tissue" from that bug in
| the current ext4 docs[0]:
|
| _" If auto_da_alloc is enabled, ext4 will detect the
| replace-via-rename and replace-via-truncate patterns and
| force that any delayed allocation blocks are allocated such
| that at the next journal commit, in the default
| data=ordered mode, the data blocks of the new file are
| forced to disk before the rename() operation is committed.
| This provides roughly the same level of guarantees as ext3,
| and avoids the "zero-length" problem that can happen when a
| system crashes before the delayed allocation blocks are
| forced to disk."_
|
| [0]https://docs.kernel.org/admin-guide/ext4.html
|
| [1] https://thunk.org/tytso/blog/2009/03/12/delayed-
| allocation-a...
|
| [2]
| https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781
| nextaccountic wrote:
| > It also goes the other way - the update to the directory
| can be fsynced but the file lost. This can break the "create
| temp file, write, close, rename to current" scenario (when
| the intention is to replace file contents atomically).
|
| > POSIX doesn't guarantee the order in which data hits the
| disk, so the above scenario can become "create temp file,
| write [contents still in memory only], rename to current
| [written to disk], power failure".
|
| Wait.. is there a way to do this correctly? At which points
| is fsync warranted, and how many fsyncs do we need for the
| whole "write file then mv it on top of current" to not lose
| data on power failure?
| paulsutter wrote:
| Sqlite is much better than raw files to keep data intact on
| power failures, and be sure to study the options carefully
|
| If you truly need to use files you can take other steps
| such as mv the old file to .bck before mv the new file, but
| I really think you want sqlite
| account42 wrote:
| SQLite doesn't do any magic other than fsync. Using it to
| deal with power failures is nonsense.
| vlovich123 wrote:
| Well generally OP is correct that SQLite will more likely
| have written fsync code correctly and be fairly robust
| against power failures due to the WAL. Additionally, it's
| not doing any renaming which you would need with plain
| files.
|
| So while it doesn't do any other magic, it's more likely
| to handle power failures correctly.
| cryptonector wrote:
| `fsync()` is not necessarily enough if your on-disk
| format is complex because you still need to write
| recovery code. Of course, SQLite3's on-disk format is
| very complex, and so it requires correspondingly more
| complex power failure recovery code. But SQLite3 has an
| excellent test suite, and in particular they have an
| excellent power failure recovery test suite.
|
| Using SQLite3 to avoid power failure issues is pretty
| good advice. I don't see why GP is getting downvoted.
| nine_k wrote:
| But the authors of SQLite have studied and correctly
| implemented the working fsync logic, so you don't have
| to. It *may* be a correct approach if it's your
| implementation detail, and.nobody else expects that file
| you're replacing.
| Too wrote:
| I think the idea was to store data in the db, instead of
| scattered over multiple files and directories. Then you
| only have to worry about one file to fsync (Two if using
| a wal).
| Dylan16807 wrote:
| > SQLite doesn't do any magic other than fsync.
|
| True.
|
| > Using it to deal with power failures is nonsense.
|
| Using a very well designed library for your use case is
| not nonsense.
| vlovich123 wrote:
| Write, fsync, rename then fsync directory if you need an
| ordering guarantee that the rename is a transaction
| barrier.
|
| Of course, the fun part is that the filesystem can't really
| guarantee fsync behavior if drives lie about it which many
| consumer drives do for benchmark reasons. Fun, no?
| bornfreddy wrote:
| It would also be a fun experiment to write, fsync, cut
| power. Rinse and repeat. On differential drives. It would
| quickly show which drives are lying about syncing.
| weinzierl wrote:
| _" if you are creating a file, to ensure full synchronisation
| you also need to fsync the parent directory, otherwise the file
| can be fsynced but the update to the directory lost."_
|
| And if you need this in Java you still have resort to ugly
| hacks.
|
| https://github.com/apache/lucene/issues/7231
| jryan49 wrote:
| This says the bug is fixed?
| https://bugs.openjdk.org/browse/JDK-8066915
| weinzierl wrote:
| The point is that there is no official way to fsync a
| directory in Java and that everyone is relying on an
| unintentional side effect of an unrelated function to
| accomplish it. The link I supplied is about the fact that
| the side effect briefly disappeared in Java 9 until enough
| people complained.
|
| We're still living in xkcd 1172 land with this, have been
| for a decade or who knows how long.
| jryan49 wrote:
| Java does "abstract" the operating system away from you
| and in systems programming with java you can end up with
| leaky abstractions.
| procflora wrote:
| More on fsyncgate: https://lwn.net/Articles/752063/
| Brian_K_White wrote:
| "listen to what close() says"
|
| and
|
| "don't believe what close() says"
|
| are two different things.
|
| The article is only (initially) talking about the first, and
| that is valid.
|
| The second is just second-guessing the OS and hardware
| environment, and is invalid.
|
| Here's the rule to figure out if you need fsync() or not: "If
| you think you might need fsync(), you don't." ;)
|
| There are almost no cases where you should worry about the
| underlying layers. Basically if you aren't writing the
| filesystem itself, then you shouldn't be calling fsync().
|
| You have to check what open()/close() etc said, but if close()
| said it worked, then it worked. You're done. The fact that
| lightning might have struck the drive just exactly then is not
| your problem and not something you should try to do anything
| about.
|
| Unreliable networks, busses, batteries, etc none of that
| changes this. Those things all already have their own layers
| with their own responsibilities to be doing all the necessary
| testing and verifying and retrying before they return a success
| code to you.
|
| There is open(...,O_SYNC) and mount -o sync (udev rules) for
| the case of a camera or thumb drive connect by usb etc.
|
| It's not merely that you don't have to, it's that it's actively
| wrong to. fsync() is just joggling someone else's elbow while
| they're trying to do their job, and if it seems to solve some
| problem, that actually just exposes that you have some logic or
| order of operations problem and you aren't doing your own job.
|
| "trust the other layer" or "trust the api contract" is
| unrelated to and does not conflict with "Be forgiving in your
| inputs and strict in your outputs.".
|
| It just means:
|
| Do: Check the returned error from, say, malloc().
|
| Don't: Get a success from malloc() and then go try to do things
| to prove that malloc() actually did work.
|
| That would be insane and impossible because it would have to
| apply equally to everything, including every single keyword or
| function you would use as part of the verification. How do you
| know when you so much as set a value to variable that it
| actually got set? If you printf the variable to prove it, how
| do you know printf didn't lie?
|
| The logic for trusting close() is no different from the logic
| for trusting malloc().
|
| Our responsibility is just to stay within the bounds of defined
| behavior and not make any assumptions about anything that isn't
| promised.
| trashburger wrote:
| Arguably, one should call `flush()` on the file first. Resource
| deallocation must always succeed; otherwise a lot of invariants
| break. This is why Zig's close method[0] ignores errors (with the
| exception of `EBADF`).
|
| [0]:
| https://github.com/ziglang/zig/blob/fb0028a0d7b43a2a5dd05f07...
| AndyKelley wrote:
| Note that the "unreachable" there is equivalent to assert(error
| != EBADF), so really it's not even an exception, it's just
| helpful to crash there when debugging if you get that error.
| Important to understand that EBADF is not a catchable error
| because the kernel may have already reused that file descriptor
| for something else, in which case you wouldn't get EBADF, you
| would close an unrelated file descriptor.
| nickcw wrote:
| Here is my favorite solution to this problem //
| CheckClose is a utility function used to check the return from
| // Close in a defer statement. func CheckClose(c io.Closer,
| err *error) { cerr := c.Close() if
| *err == nil { *err = cerr }
| }
|
| Use like this - you must name the error return
| func whatever() (err error) { f, err := os.Open(blah)
| // ... defer CheckClose(f, &err) // ...
| }
|
| This closes the file and if there wasn't an existing error,
| writes the error from f.Close() in there.
| Belphemur wrote:
| I still question why defer doesn't support doing exactly that.
|
| After all it's like the go language provide us with a cleanup
| function that in 99% of the time shouldn't be used unless we
| manually wrap what it's calling to properly handle error.
|
| In the end, what's the point of defer ?
| randomdata wrote:
| _> I still question why defer doesn 't support doing exactly
| that._
|
| When would it ever be useful? You'd soon start to hate life
| if you actually tried using the above function in anything
| beyond a toy application.
|
| _> 99% of the time shouldn 't be used_
|
| 1. 99% of the time it is fine to use without further
| consideration. Even if there are errors, they don't matter.
| The example from the parent comment is a perfect case in
| point. Who cares if Close fails? It doesn't affect you in any
| way.
|
| 2. 0.999% of the time if you have a function that combines an
| operation that might fail in a manner you need to deal with
| along with cleanup it will be designed to allow being called
| more than once, allowing you, the caller, to separate the
| operation and cleanup phases in your code.
|
| 3. 0.001% you might have to be careful about its use if a
| package has an ill-conceived API. If you can, fix the API.
| The chances of you encountering this is slim, though,
| especially if you don't randomly import packages written by a
| high school student writing code for the first time ever.
| bryancoxwell wrote:
| The whole point of this post is that an error returned from
| file.Close DOES matter
| foldr wrote:
| I think from a Go point of view, the lesson to be drawn
| from that is "don't defer a function call if you need to
| check its error value", rather than "defer needs to
| support checking of function return values".
|
| In the example at hand, it really makes more sense to
| call Close() as soon as possible after the file is
| written. It's more of an issue with the underlying OS
| file API making error checking difficult.
|
| In 99% of cases, the solution to this problem will be to
| use a WriteFile function that opens, writes and closes
| the file and does all the error handling for you.
| randomdata wrote:
| _> the lesson to be drawn from that is "don't defer a
| function call if you need to check its error value"_
|
| Isn't the lesson here: If you must have a Close method
| that might fail in your API, ensure it can safely be
| called multiple times?
|
| As long as that is true, you can approach it like you
| would any other API that has resources that might need to
| be cleaned up. f, _ := os.Create(...)
| defer f.Close() // perform writes and whatever
| else if err := f.Close(); err != nil {
| // recover from failure }
|
| (os.File supports this, expectedly)
|
| _> the solution to this problem will be to use a
| WriteFile function_
|
| If it were the solution you'd already be using
| os.WriteFile. It has a time and place, but often it is
| not suitable. Notably because it requires the entire file
| contents to be first stored in memory, which can become
| problematic.
|
| Certainly you could write a custom WriteFile function
| that is tuned to your specific requirements, but now
| you're back to needing to be familiar with the
| intricacies of a lower-level API in order to facilitate
| that.
| foldr wrote:
| Sure, that's an alternative, although it means there will
| be some code paths where the error returned by f.Close()
| becomes the error returned by the entire function and
| others where it is ignored (though you could easily log
| it). That might be fine, but you also might want to
| handle all the cases explicitly and return a combined
| error in a case where, say, a non-file-related operation
| fails and then the file also fails to close.
| randomdata wrote:
| _> becomes the error returned by the entire function_
|
| If you find the error returned by f.Close to be
| significant, are you sure returning again it is the right
| course of action? Most likely you want to do something
| more meaningful with that state, like retrying the write
| with an alternate storage device.
|
| Returning the error is giving up, and giving up just
| because a file didn't close does not make for a very
| robust system. Not all programs need to be robust,
| necessarily, but Go is definitely geared towards building
| systems that are intended to be robust.
| randomdata wrote:
| You seem confused. The article is about writing a file
| where it does matter, but the comment example, which is
| what we're talking about, only reads a file. If close
| fails after read, who gives a shit? What difference is it
| going to make? All your read operations are complete
| already. Close isn't going to trigger a time machine that
| goes back and time and undos the reads you've performed.
| It is entirely inconsequential.
| Cthulhu_ wrote:
| Only if you can safely assume the OS, file system, or std
| lib cleans up any open file handles that failed to close;
| I'm 99% sure this is the case in 99% of cases, but there
| may be edge cases (very specific filesystems or
| hardware?) where it does matter? I don't know.
| randomdata wrote:
| You can't safely assume that, but what are you going to
| do about it when it does fail? There is nothing you can
| do. There isn't usually a CloseClose function to use when
| Close fails. If Close fails, that's it. You're otherwise
| out of luck.
|
| Certainly, in the write failure case where there is write
| failure you'd want to try writing to something else
| (ideally), notify someone that an operation didn't happen
| (as a last resort), or something to that effect in order
| to recover.
|
| But in this case there is no need to try again and nobody
| really cares. Everything you needed the resources for is
| already successfully completed. If there is failure when
| releasing those resources, so what? There is nothing you
| can do about it.
| jrockway wrote:
| > If close fails after read, who gives a shit?
|
| ulimit -n
|
| You ignore errors on close, and one morning you wake up
| with your app in CrashLoopBackoff with the final log
| message "too many files". How do you start debugging
| this?
|
| Compare the process to the case where you do log errors,
| and your log is full of "close /mnt/some-terrible-fuse-
| filesystem/scratch.txt: input/output error". Still
| baffling of course, but you have some idea where to go
| next.
| randomdata wrote:
| To start, you need to figure out why Kubernetes isn't
| retaining your stack trace/related metadata when the app
| crashes. That is the most pressing bug. Which is probably
| best left to the k9s team. You outsourced that aspect of
| the business of good reason, no doubt.
|
| After they've fixed what they need to fix you need to use
| the information now being retained to narrow down why
| your app is crashing at all. Failing to open a file is
| expected behaviour. It should not be crashing.
|
| Then maybe you can get around to looking at the close
| issue. But it's the least of your concerns. You've got
| _way bigger_ problems to tackle first.
| jrockway wrote:
| The app crashes because "too many files" includes the fd
| accept(2) wants to allocate so your app can respond to
| the health check.
| randomdata wrote:
| A file not able to opened is expected, always! accept is
| no exception here. Your application should not be
| crashing.
|
| If I recall, Kubernetes performs health checks over HTTP,
| so presumably your application is using the standard
| library's https server to provide that? If so, accept is
| full abstracted away. So, if that's crashing, that's a
| bug in Go.
|
| Is that for you to debug, or is it best passed on to the
| Go team?
| saurik wrote:
| Which of course isn't really a "solution", as it results in
| ambiguous error semantics; this is especially the case with
| Go's defer, as it pushes the code to the end of the entire
| function, not merely some intermediate relevant scope... the
| result is that a file used near the beginning of a function
| might fail to close but that error will not be realized until
| after something else in the function fails, after the point of
| no return on the close.
|
| Really, this is kind of a fundamental limitation of automatic
| allocation semantics, which is effectively a monad that is
| being stacked with the error propagation monad in a confusing
| manner that means you "should" only defer operations which
| don't have errors.
|
| Even in languages with exceptions, such as C++ and Java, they
| had to wrangle with this problem and failed to solve it: in C++
| 11 or 17 or whatever, deconstructors are now by default nothrow
| in an attempt to prevent this kind of mistake.
|
| ...but then, what does one do with close?! In some sense, the
| entire concept of close _must not fail_ , and yet it exists in
| a world where we don't really believe in anything that can't
| fail, as we like moving around failure semantics.
|
| FWIW, Linus has suggested that the kernel should largely accept
| that application developers don't ever check the return value
| of close... but has also stated that developers should sync the
| file first if they care _and also check close_ (at least, to be
| maximally correct).
|
| But like, what does one do then? How do you ever recover from
| this? I actually think you can't, if you are in a cleanup
| operation... not without breaking your error regime. I think--
| and this is also where the article eventually goes in the
| updates (though without noting you can add your own
| boolean)--you should therefore both explicitly close (and/or
| maybe flush/sync) the file after writing to it _and_ have a
| "if I didn't close this, close it" in your cleanup handler;
| critically, the (explicit) former checks for errors, while the
| (implicit) latter doesn't.
| d0mine wrote:
| The solution is that there are multiple solutions that are
| suitable in different situations.
|
| If you need "strict" requirements then it is likely
| impossible but you can get close enough if you use sqlite. Or
| more lightweight atomic/thread-safe/fault-tolerant file
| library. You could rollout your own: it easy to start, and
| continue until the error rate is tolerable for your
| application (though it may take more dev time).
|
| If you don't need db-like strict guarantees. Just write your
| app knowing that it may fail (data may be lost, corrupted).
| It may be ok in a lot of cases.
| formerly_proven wrote:
| > FWIW, Linus has suggested that the kernel should largely
| accept that application developers don't ever check the
| return value of close... but has also stated that developers
| should sync the file first if they care and also check close
| (at least, to be maximally correct).
|
| Close just behaves very differently depending on the actual
| filesystem, too. Usually it's very fast because it doesn't do
| much of anything, but e.g. on NFS close will actually wait
| for writeback to the server to complete (due to close-to-open
| semantics).
| everforward wrote:
| I think some of those issues with errors could be worked
| around. Wrapping errors provides a way to return a "cleanup
| failed" error that also includes the root cause of being a
| failed write. Likewise, there are packages for handling
| multi-errors.
|
| I think the reality is that most of us push anything where
| the return status of Close would be important into the
| database, specifically because it handles semantics like this
| and simultaneous writes for us. It's like half the selling
| point of SQLite; you could write JSON documents and handle
| all the edge cases yourself, or just jam it in SQLite and
| quite worrying about Close and simultaneous writes and all
| that junk.
| saurik wrote:
| I mean, we can cause the same kind of problem there, as
| people might try to write a scope exit handler to commit a
| transaction. You'd then run into the same issue, and so
| "commit transaction" isn't a thing which should ever be in
| such a construct. Of course, deleting the objects for the
| transaction / closing the database connection / etc. would
| be fine to ignore errors from (and hopefully
| wouldn't/shouldn't fail anyway) and so those can and should
| be automated: you defer close but manually call and check a
| commit.
|
| Once you accept this reality, the file case is the same:
| putting the sync and/or first critical close inline is
| equivalent work. The issue is that you simply can't -- no
| matter what the mechanism is -- slip back and forth between
| your scope maintenance and your error handling monads,
| resulting in needing cleanup operations where failure is
| not an option; and, so, you either must not care about the
| error in the context of the call or must do something even
| more drastic like terminate the entire program for
| violating semantics.
|
| FWIW, I do appreciate that people are less likely to make
| that kind of mistake when working with a database, as
| people largely get that you should even try to commit a
| transaction if the code in it had failed somehow.
| Additionally, I appreciate that if you have a very tight
| scope -- which Go makes hard, but can still be pulled off
| -- the "close and throw an error if and only if we don't
| have an error right now" strategy is not at all horrible...
| it just isn't a "solution" to the underlying issue without
| an understanding of why.
|
| Put elsewise, I think it is useful to appreciate that there
| is more of a universal theoretical / math reason why this
| is awkward and why it kind of needs to be built in a
| specific way, and that this issue transcends the syntax or
| even the implementation details of how you are trying to
| manage errors: at the end of the end of the day, all of
| these techniques people discuss are in some sense
| equivalent, and, at best, most of these workarounds at
| offer are ways to incorrectly model the problem due to some
| systems giving you too much rope.
| everforward wrote:
| I don't think that holds for database transactions
| because of the semantics of rollbacks. Trying to do a
| rollback after committing is effectively a no-op (I
| believe it returns an error, but doesn't actually change
| the DB), so you can defer a rollback and only call commit
| on the happy path.
|
| I don't believe people generally care about the error
| context on the rollback, which makes it safe to defer
| into a context that can't interact with the error
| handling monads. Rollbacks shouldn't generally fail, even
| if they do there's basically nothing you can do about it,
| and there are few differences between a successful and
| failed rollback beyond resources on the DB server until
| the connection is closed.
|
| The Commit is the portion that contains the context
| people care about in their errors, and that is still
| safely in a context where it can interact with error
| handling.
|
| I believe files can get similar atomicity, but it
| requires doing IO in strange ways. E.g. updating a file
| isn't atomic, but mv'ing one is. So you can copy the file
| you want to update into /tmp, update the copy, and then
| mv the copy to the original file (commit is mv'ing it,
| rollback is rm'ing it or just ignoring it).
|
| Database transactions aren't atomic and do have the same
| issue if they reference external resources, though. E.g.
| if you have a database that stores an index of S3 files,
| transactions won't save you from writing a file to S3 but
| then failing to write a record for it into the database.
| That does muddle the error handling again.
| randomdata wrote:
| >>> f, err := os.Open(blah) //
| ... defer CheckClose(f, &err)
|
| What knowledge do you hope to gain of f.Close fails here?
| red_admiral wrote:
| If I understand the OP correctly, if Close() fails then you
| can't trust that the data was written, even if the previous
| Write() succeeded.
| randomdata wrote:
| That's not it. You can't write to a file handle returned by
| os.Open.
|
| https://pkg.go.dev/os#Open func Open(name
| string) (*File, error) Open opens the named file
| for reading. If successful, methods on the returned file
| can be used for reading; the associated file descriptor has
| mode O_RDONLY. If there is an error, it will be of type
| *PathError.
| Cthulhu_ wrote:
| Exactly that; for critical operations like e.g. a database,
| if a write fails you've got corrupted data and you have a
| Major Issue.
|
| That said, I'm not sure how they would handle a file close
| failure, wouldn't the file be corrupted anyway because some
| of the bits may have been written? Then again, at least you
| can raise the alarms if Close fails, because silent
| failures are worse than failures.
| randomdata wrote:
| _> if a write fails._
|
| We're talking about a read-only case. os.Open returns a
| read-only file handle. If you try writing to it, you'll
| get an error already at that point. If close fails, who
| cares?
|
| _> I 'm not sure how they would handle a file close
| failure_
|
| Ideally there is some kind of failover you can resort to,
| but if there is no other option at very least you will
| want to notify a human that what they thought was written
| isn't actually. But when only reading, you don't need to
| fall back to anything - all the reads were successful -
| and what is it to the human? What they thought was
| supposed to happen did!
| sa46 wrote:
| I use a similar pattern, cribbed from one of the Go databases
| that aspired never to ignore errors. The difference is that
| errs.Capture preserves existing errors and formats the error
| with a message--important if generalized to handle any error
| function. package errs //
| Capture runs errFunc and assigns the error, if any, to *errPtr.
| Preserves the // original error by wrapping with
| errors.Join if the errFunc err is non-nil. func
| Capture(errPtr *error, errFunc func() error, msg string) {
| err := errFunc() if err == nil {
| return } *errPtr = errors.Join(*errPtr,
| fmt.Errorf("%s: %w", msg, err)) }
|
| I conventionally use mErr to distinguish from err.
| func doThing() (_ string, mErr error) { f, err :=
| os.Open("foo") if err != nil {
| return "", fmt.Errorf("open file: %w", err) }
| // Use the file... defer errs.Capture(&mErr,
| f.Close, "close file") return "", nil
| }
| joeshaw wrote:
| I caution against this approach, as you are not really dealing
| with the error when it occurs. If the work you do after the
| defer has other side effects, you may have just gotten your
| application into an inconsistent state and it's very hard to
| see in code why this might be.
|
| `defer` is really not well-suited for error handling, its
| benefit is mainly in resource cleanup where failure is
| impossible or doesn't matter. (This makes it fine for `Close`
| on read-only file I/O operations, and not so great for writes.)
| blcknight wrote:
| > If the work you do after the defer has other side effects
|
| Defer is by definition the last work you do in a function,
| there won't be more work except by the caller who will get
| the error returned to them.
|
| If you are structuring a function that writes a file, and
| then does something with it, defer isn't appropriate, since
| you should close it before you do any more work.
| Cthulhu_ wrote:
| It's possible to have multiple defers in a function though
| (so you have multiple "last work in a function"; nowhere is
| it dictated that a function should only have one operation
| that needs to clean something up at the end. Think for
| example copying one file to another.
| Thorrez wrote:
| > If the work you do after the defer has other side effects,
| you may have just gotten your application into an
| inconsistent state and it's very hard to see in code why this
| might be.
|
| Can you give an example case of how this could happen?
| joeshaw wrote:
| This is a contrived example, but imagine a situation where
| I have a file I want to write on disk and then have a
| reference to it in a database. If I have a flow like:
| func UpdateUser(user *User) (err error) { f,
| err := os.Create("/some/file.txt") if err !=
| nil { return err }
| defer CheckClose(f, &err) if _, err :=
| f.Write(somedata); err != nil { return err
| } if err := db.UpdateUser(user,
| "/some/file.txt"); err != nil { return err
| } return }
|
| This function might have updated the user in the database
| with a new file despite the fact that `CheckClose` (defined
| up-thread) does check to see if the `Close` failed and
| returned an error. The calling code won't have known this
| has happened.
|
| The core problem is that the error checking is not done
| soon enough, either because Go programmers are conditioned
| to `defer f.Close()` from nearly all example code -- most
| of it demonstrating reads, not writes -- or because they
| are handling the error, but only in a deferred function,
| not earlier.
|
| A more correct way to do this would be:
| func UpdateUser(user *User) error { f, err :=
| os.Create("/some/file.txt") if err != nil {
| return err } defer f.Close()
| if _, err := f.Write(somedata); err != nil {
| return err } if err :=
| f.Sync(); err != nil { return err
| } if err := f.Close(); err != nil {
| return err } if err :=
| db.UpdateUser(user, "/some/file.txt"); err != nil {
| return err } }
|
| `Sync()` flushes the data to disk, and `Close()` gives a
| "last-chance" opportunity to return an error. The `defer
| f.Close()` exists as a way to ensure resource cleanup if an
| error occurs before the explicit `f.Close()` toward the end
| of the function. As I mentioned in an update to the post,
| double `Close()` is fine.
| red_admiral wrote:
| I wouldn't mind the language adding some syntax for this, based
| on the java try-with-resource but using error values rather
| than exceptions. Something like f, err =
| with(os.Open(thing)) { // do stuff }
| // at this point err has the same semantics as your CheckClose
| Cthulhu_ wrote:
| I believe your example doesn't introduce any new syntax
| though. Go is highly resistant to adding syntax (and that's a
| good thing, it keeps the tooling fast and barrier to entry
| low).
| pansa2 wrote:
| AFAIK Python/C# use a similar approach to Go - instead of
| `defer`, they have `using`/`with` statements. Go's `defer` seems
| more flexible though - it can execute custom code each time it's
| used, whereas `using`/`with` always call `__exit__`/`Dispose`.
|
| How does the Python/C# approach compare to Go's in this
| situation? How are errors in `__exit__`/`Dispose` handled?
| d0mine wrote:
| > Exceptions that occur during execution of this method will
| replace any exception that occurred in the body of the with
| statement
|
| https://docs.python.org/3/library/stdtypes.html#contextmanag...
| pansa2 wrote:
| Thanks!
| Too wrote:
| From the application point of view, exceptions are more natural
| to catch, so if the close inside the exit-function throws, the
| nearest catch block will not be far up the stack. Compared to
| go, where you are much more unlikely to have a recover block,
| because it is considered such a rare case.
| cccbbbaaa wrote:
| The snippet in the first update is very wrong. The manpage for
| Linux's implementation of close() explicitly says that it should
| not be called again if it fails. Apparently, this is the same
| under FreeBSD.
| klodolph wrote:
| The implementation of `os.File.Close()` in Go clears the file
| descriptor.
|
| https://pkg.go.dev/os#File.Close
|
| "Close will return an error if it has already been called."
|
| An os.File is a data structure containing a file descriptor and
| some other fields. It is safe to call Close() multiple times,
| because it will only call the underlying syscall close() once.
| weinzierl wrote:
| Here is a related question that has been on my mind for a while,
| but I have yet to find a good answer for:
|
| If I write to file on a reasonably recent Linux and a sane file
| system like ext or zfs, at which point do I have the guarantee
| that when I read the same file back, that it is consistent and
| complete?
|
| Do I really need to fsync or is Linux smart enough to give me
| back the buffer cache? Does it make a difference if reader and
| and writer are in the same thread or same process?
| fanf2 wrote:
| Normally, immediately when write(2) returns.
|
| It's more complicated if the computer shut down in between,
| depending on how clean the shutdown was.
| karmakaze wrote:
| TL;DR - defer Close ignores the error from Close, so don't.
| klodolph wrote:
| Eh, the real answer is close twice (as in the article update):
| func f(filename string) error { fp, err :=
| os.Create(filename) if err != nil { return
| err } defer fp.Close() if _, err :=
| fmt.Fprintln(fp, "Hello, world!"); err != nil {
| return err } return fp.Close() // safe to call
| multiple times }
|
| The .Close() method is safe to call multiple times. This
| behavior is documented on the os.File
| jagrsw wrote:
| It might be well implementation-dependent.
|
| On Linux, the close() system call rarely fails unless you
| provide an invalid file descriptor. L.Torvalds has stated that
| on Linux, close() immediately removes the file descriptor from
| the process, regardless of the underlying implementation's
| success or failure. Any errors related to the actual closing of
| the file are handled within the kernel and won't affect user-
| space programs. I know that go Close is not posix/linux close,
| but in majority of cases it'll boil down to it.
|
| To quote: Retrying the close() after a failure
| return is the wrong thing to do, since this may cause a
| reused file descriptor from another thread to be closed.
| This can occur because the Linux kernel always releases
| the file descriptor early in the close operation, freeing
| it for reuse; the steps that may return an error, such as
| flushing data to the filesystem or device, occur only later in
| the close operation. Many other implementations
| similarly always close the file descriptor (except in the
| case of EBADF, meaning that the file descriptor was
| invalid) even if they subsequently report an error on
| return from close(). POSIX.1 is currently silent on this
| point, but there are plans to mandate this behavior in the next
| major release of the standard.
| Ferret7446 wrote:
| More generally, don't ignore errors that you need to handle. This
| problem isn't exclusive to this very specific case on Go.
| im3w1l wrote:
| Didn't I see this thread the other day including comments?
| Investigating, Algolia search shows this thread as being posted 2
| days ago, and the memorable comments too.
| im3w1l wrote:
| @dang did the timestamps get messed up?
| fragmede wrote:
| https://news.ycombinator.com/item?id=26998308
| im3w1l wrote:
| Why would the comments have changed timestamps though?
| jsnell wrote:
| The posting timestamp is temporarily adjusted, such that
| people don't complain about a week day old submission
| being on the front page.
|
| The comment timestamps are temporarily adjusted to be
| consistent with the adjusted posting times, such that
| readers aren't confused by the comments predating the
| apparent posting time.
|
| The timestamps will revert back to their original values
| in a few days.
|
| What I don't understand is why this went into the second
| chance pool if the original submission made it to the
| front page and got >100 points.
| tome wrote:
| I noticed this before:
| https://news.ycombinator.com/item?id=40049170
| Agingcoder wrote:
| I dislike the multiple close pattern - I was bitten by this
| behavior years ago, when the second close() ended up closing
| another file which had been opened between the first and second
| close ( I think they were actually sockets ). It was a bona fide
| bug on my side , but it made for unpleasant memories, and a
| general distrust of such idioms on my side unless there's a
| language wide guarantee somewhere in the picture.
| jsndnnd wrote:
| I don't understand how that could happen, since the original
| file handle would have been invalidated.
|
| Which operating system did you experience this under and was it
| the operating system, your Libc or what else in the stack which
| caused this?
| dwattttt wrote:
| The scenario is that after the original file handle is
| closed, a new open occurs and the file is assigned the same
| handle value.
|
| Then code acting on the stale handle of the first file closes
| it, and accidentally closes the new file instead.
| dlock17 wrote:
| In the stdlib Go code for file.Close, it doesn't actually do
| the syscall after the first call to Close, so there's your
| language guarantee.
|
| That is a scary sounding error though.
| snatchpiesinger wrote:
| "Commit pending writes" and "discard file handle" should ideally
| be separate operations, with the former potentially returning
| errors and the latter being infallible. "Discard file handle"
| should be called from defer/destructiors/other language RAII
| constructs, while "commit pending writes" should be called
| explicitly in the program's control flow, with its error
| appropriately handled or passed up the stack.
|
| Whether if it's allowed to write again after a "commit pending
| writes" operation or not is a separate design decision. If not,
| then in some languages it can be expressed as an ownership taking
| operation that still allows for handling the error.
| SPBS wrote:
| I think anything other than deferring Close() and then calling
| Close() again explicitly is overengineering. Like anything that
| requires creating a cleanup function, capturing a named return
| value, requires contorting in unnatural ways to handle a very
| common scenario. Just... defer Close() the soonest you can (after
| checking for errors), then call Close() again at the end. Most
| sane providers of Close() should handle being called multiple
| times (I know os.File does, as well as sql.Tx ).
| rollulus wrote:
| Can someone tell me what's going on here? This very post
| including the comments appeared on HN two days as well. I thought
| I was getting crazy but Google confirms.
| Izkata wrote:
| If a post doesn't get much attention, there's a small chance of
| it getting a boost a few days later. If this happens, the
| timestamps are all adjusted to make it look like it was just
| posted.
|
| More details: https://news.ycombinator.com/item?id=26998308
| iudqnolq wrote:
| The end of the article has my favorite solution.
|
| This is also how I'd solve it in Rust, except the defer would be
| implicit. func doSomething() error {
| f, err := os.Create("foo") if err != nil { return err
| } defer func(){ _ = f.Close() }() if
| _, err := f.Write([]byte("bar"); err != nil { return err }
| return f.Close() }
| klodolph wrote:
| Yes, note you'll also see: defer f.Close()
|
| Instead of, defer func(){ _ = f.Close() }()
|
| I think this is likely a code style difference due to working
| with a linter that alarms on discarded error returns, but I'm
| not sure. Both options have the same behavior, unless you
| reassign f (defer f.Close() will use the original value, defer
| func() ... will use the current value).
| iudqnolq wrote:
| Yup, I wrote it that way because I have a linter for unused
| error returns with a handful of exceptions (like closing http
| request bodies).
| klodolph wrote:
| I think I would add an exception for this too, because it
| crops up a lot.
| iudqnolq wrote:
| I can't decide because in this pattern you ignore once
| and check once and I like the lint for the check. Ideally
| the linter could recognize this pattern. Even better
| would be if the linter could catch when you never close,
| I've made that mistake a few times.
| spease wrote:
| Isn't what the article suggests (with defer and then WriteString)
| technically a race condition? Is there no way that the closer can
| get called before WriteString executes?
| assbuttbuttass wrote:
| defer gets executed at the end of the current function
| spease wrote:
| Thanks! Not a Go programmer, so I saw the parentheses after
| the cloaure def and assume it translated to "execute this on
| a goroutine in the background immediately"
| tomohawk wrote:
| This is the way. func doSomething() (err
| error) { var f *os.File f, err =
| os.Create("foo") if err != nil {
| return } defer func(){
| if nil != f { f.Close() }
| }() _, err = f.Write([]byte("bar")
| if err != nil { return }
| err = f.Close() f = nil return
| }
|
| EDIT: fixed the bug
| nivyed wrote:
| The article suggests using a named return value `err` to allow
| the return value of `Close` to be be propagated - unless doing so
| would overwrite an earlier error: defer func()
| { cerr := f.Close() if err == nil {
| err = cerr } }()
|
| Wouldn't it be better to use `errors.Join` in this scenario? Then
| if both `err` and `cerr` are non-nil, the function will return
| both errors (and if both are `nil`, it will return `nil`):
| defer func() { cerr := f.Close() err =
| errors.Join(err, cerr) }()
| jfindley wrote:
| This is from 2017, errors.Join did not exist at the time. But
| yes, today you'd do it differently.
| throwaway173920 wrote:
| IMO the formatting of the error string returned by errors.Join
| is atrociously opinionated and not very logging-friendly - it
| adds a newline between each error message. I know I'm not the
| only one that has this opinion
| Matl wrote:
| Wouldn't you use slog if you want logging friendly?
| kjksf wrote:
| It's a trivial function: https://cs.opensource.google/go/go/+
| /refs/tags/go1.23.1:src/...
|
| You could write your own errorsJoin() and change Error()
| method to suit your needs.
|
| But really in this particular scenario you would be better
| served by something like: func
| errorsConcat(err1 error, err2 error) error { if
| (err1) { return err1; }
| return err2; }
|
| And then do: err = errorsConcat(err, f.Close())
|
| In the scenario described in this article, errors.Join()
| would most often reduce to that (in terms of what Error()
| string would produce).
| admax88qqq wrote:
| Go error handling is brutal.
|
| I miss exceptions
| Matl wrote:
| There was [1].
|
| 1 - https://go.googlesource.com/proposal/+/master/design/go2d
| raf...
| Cthulhu_ wrote:
| Exceptions aren't exceptional though; they are too expensive
| for not-exceptional errors, like failing writes.
|
| That said, a language feature where you can throw lightweight
| error values without generating a stack trace etc might be a
| middle ground. But it won't land in Go, given the discussion
| about alternative error handling some years ago.
|
| Anyway, in practice it's not that bad. A write can go wrong,
| you as a developer should write code that handles that
| situation. Exceptions lead a developer to miss errors, or to
| not handle them in a finegrained manner - e.g. generic "catch
| all, log error, maybe" style error handling.
| admax88qqq wrote:
| Exceptions are slow in some languages based upon how they
| are implemented. I'm not convinced that is fundamental to
| exceptions and rather a choice of how they were
| implemented. In Java exceptions arn't actually that slow,
| most of the cost is just allocating the exception object
| (and allocations in Java are fast).
|
| > Exceptions lead a developer to miss errors, or to not
| handle them in a finegrained manner - e.g. generic "catch
| all, log error, maybe" style error handling.
|
| I don't see how Go error handling makes people handle
| things any more explicitly than exceptions. Most people
| just `if err != nil { return err }`, which to be honest is
| the _correct_ logic in many cases, and it's pretty easy to
| forget to check if err and keep on trucking. At least with
| exceptions if you don't catch it your thread terminates
| making unhandled exceptions
|
| Exception bubbling means its easier to catch the error at
| the level that makes sense, and because they are real
| objects type checking is easier as opposed to the
| performance of `errors.Is()` which is surprisingly slow.
| joeshaw wrote:
| OP here. Another commenter pointed out that `errors.Join`
| didn't exist when I wrote this, but I wouldn't have changed my
| guidance if it had.
|
| The core issue here is that you want to deal with errors as
| soon as possible. The async nature of writes to files makes
| this more challenging. Part of the purpose of this post was to
| inform people that you don't necessarily always see write
| errors at write-time, sometimes you see them at close-time and
| should handle them there. (And sometimes you don't see them at
| all without fsync.)
|
| Even if you do handle errors in your deferred close, you may be
| taking other actions that you'd rather not have until the file
| i/o is completed, and this could leave your program in an
| inconsistent state. Side effects in Go are common, so this is a
| practical concern and it is hard to spot and debug.
| zabzonk wrote:
| how, when and why can close() fail? and what can you do about it
| if it does?
| defrost wrote:
| Fails if (say) OS can't write out pending cache and confirm
| data written to device.
|
| Causes include memory failure, drive cable melted, network
| cable pulled, etc.
|
| What to do?
|
| How important is the data being written? Is the only copy of
| just aquired data from a $10 million day geophysical survey?
| How much time and resources can you spend on work arounds,
| multiple copies, alternative storage paths, etc.
|
| In aquisition you flush often, worst case lose a minute rather
| than a day.
|
| In, say, seismic quisition, you might aquire audio data from
| microphone array and multi track raw audio to SEGY tape banks
| AND split raw data to thermal plotter AND processing WHERE RAW
| DATA -> (digitally to DAT AND hard drives) and through
| processing WHERE COOKED DATA -> digital storage.
|
| In processing pipelines a failed write() or close() isn't so
| bad, you flag that it happened and you can try to repipe the
| raw data to get a savable second result.
|
| Ultimately you want human operator control on what and when to
| do something - it's a hardware problem or resource starvation
| at the root.
| russdill wrote:
| Handling it this way in a user process is insane and
| essentially cargo culting. If your data is that valuable, you
| have redundant systems.
| joeshaw wrote:
| OP here. I appreciate the comments I've read here, and it might
| inspire a new blog post: "Defer is for resource cleanup, not
| error handling."
| corytheboyd wrote:
| I'm a bit new to Golang, but not good programming practices.
| Isn't ignoring the error value returned by a function a very bad
| practice in general? Regardless of if the function call returning
| it is used in defer? Not just for file write operations?
| Woansdei wrote:
| Sometimes there is nothing you can do when there is an error,
| in that case there is no point in adding several layers of
| error forwarding until you ignore it somewhere higher up.
| corytheboyd wrote:
| Is... NOT ignoring errors just not an option? I don't get it.
| If you propagate errors up but not all the way to being
| handled, haven't you failed in a very simple, easy to fix
| way? Should you have a linter catching these things?
| deergomoo wrote:
| In this case the issue is that defer is a very good way to
| ensure you don't forget to close the file in any branches,
| but a bad way to return values (you have to set the value
| of a named return variable, which is one of Go's odder
| features).
|
| > Should you have a linter catching these things?
|
| JetBrains' GoLand will in fact warn you of this. If the
| error truly is immaterial you can instead do
|
| defer func() { _ = f.Close() }()
|
| which is verbose but explicit in its intent to ignore the
| error.
| corytheboyd wrote:
| > [...] you have to set the value of a named return
| variable
|
| Ahhhh okay I see it now. I definitely prefer to not use
| that feature as well, and I'm surprised it's even there
| given how well the rest of the language adheres to "only
| one way to do things". Doubly agree that it's a strange
| "hack" for forwarding the deferred return value... oof
|
| > JetBrains' GoLand will in fact warn you of this
|
| Heh yeah that's what prompted me to ask, as I noticed
| (and very much appreciated) these hints. 100% agree with
| the verbose-but-explicit example you gave, and do that
| myself.
| jrockway wrote:
| Ignoring errors is generally a poor practice. You don't have to
| stop your program on errors, but you should at least log some
| percentage of them so that when the failure cascades, you have
| some idea where the failure started.
| yyyfb wrote:
| Boggles my mind that after more than 60 years of computer
| science, we still design tools (programming languages) where the
| simplest tasks are full of gotchas and footguns. This is a great
| example.
| erikaww wrote:
| Funny thing is that there is a near footgun with this go: if
| you defer and set a non named return in a defer, like cErr,
| that won't actually set that variable. Not sure what actually
| happens in that case but godbolt would tell you. In that case,
| the error would get swallowed
| account42 wrote:
| It's hardly a footgun. Close may be able to report some
| additional errors with getting the data on persistent storage
| but it won't report all of them anyway. For most applications,
| ignoring the return of close is perfectly fine in practice.
| cflewis wrote:
| Agreed. Just log it and move on. The code _probably_ wrote
| what it needed to even if it didn't close. If truly cared
| that you got everything out correctly, you'd need to do more
| work than a blind `defer Close()` anyway and you'd never have
| written the code like this.
| akira2501 wrote:
| > the simplest tasks
|
| The tasks seem simple from 30,000 feet up in the air. Once you
| get down into the dirt you realize there's absolutely nothing
| simple about what you're proposing.
|
| A filesystem is a giant shared data structure with several
| contractual requirements and zero guarantees. That people think
| a programming language could "solve" this is what is boggling
| to me.
| sedatk wrote:
| Windows doesn't have this gotcha, FWIW.
| Veserv wrote:
| The problem is that the filesystem primitives are garbage, so
| it is impossible to make something safe and reasonably
| performant. This is not a case of "speed at all costs" where
| huge footguns are added for marginal performance, this is
| avoiding 10x and up slowdowns that would be required to be safe
| due to the anemic primitives. If the filesystem had better
| primitives/APIs, like barriers and proper asynchronous
| completion, it would be trivial to design tools that are safe
| and performant. But, without them it is like trying to build a
| skyscraper out of mud and toothpicks.
| nextaccountic wrote:
| This is also why, in Rust, relying on drop to close a file (which
| ironically is the poster child for RAII) is a bad pattern.
| Closing a file can raise errors but you can't reasonably treat
| errors on drop.
|
| What we really need is a way to handle effects in drop; one way
| to achieve that is to have the option to return Result in a drop,
| and if you do this then you need to handle errors at every point
| you drop such a variable, or the code won't compile. (This also
| solves the async drop issue: you would be forced to await the
| drop handling, or the code wouldn't compile)
| surajrmal wrote:
| I actually really like this idea. Is there somewhere I can read
| more about it?
| masklinn wrote:
| > This is also why, in Rust, relying on drop to close a file
| (which ironically is the poster child for RAII) is a bad
| pattern.
|
| Is it though? It ensures the fd is closed which is what you
| want, and if you have some form of unwinding in the language
| you can't really ask for more. And aborts are, if anything,
| worse.
|
| It also works perfectly well for reading, there's no value to
| close errors then.
|
| > Closing a file can raise errors but you can't reasonably
| treat errors on drop.
|
| It's mostly useless anyway, since close does not guarantee that
| the data has been durably saved. If you want to know that, you
| need to sync the file, and in that case errors on close are
| mostly a waste of time:
|
| - if you've opened the file for reading you don't care (errors
| are not actionable, since you can't retry closing on error)
|
| - if you've flushed a write, you don't care (for the same
| reason as above)
|
| The one case where it matters is if you care but missed it, in
| which case we'd need a bunch of different things:
|
| - a version of must_use for implicit drops
|
| - a consuming flush-and-close method on writeable files
|
| - a separate type for readable and writeable files in order to
| hook both, and a suite of functions to convert back and forth
| because even if rust had the subtyping for you don't want to
| move from write to read without either flushing or explicitly
| opting out of it as you're moving into a "implicit drop is
| normal" regime
|
| > one way to achieve that is to have the option to return
| Result in a drop, and if you do this then you need to handle
| errors at every point you drop such a variable
|
| That is nonsensical, the entire point of drop is that it's a
| hook into default / implicit behaviour. How do you "handle
| errors" when a drop is called during a panic? It also doesn't
| make sense from the simple consideration that you can get drops
| in completely drop-unaware code.
|
| Consuming methods is what you're looking for.
| bqmjjx0kac wrote:
| You can kind of achieve this at runtime like so
| struct Foo { bool dirty, } impl
| Foo { fn clean_up(&mut self) { // ...
| self.dirty = false; } } impl
| Drop for Foo { fn drop(&mut self) { if
| self.dirty { panic!("Foo was not cleaned up
| before its lifetime ended"); } }
| } fn bar(foo: &mut foo) {
| foo.clean_up(); }
| almostdeadguy wrote:
| Confused why this is displayed as being 9 hours old when I
| remember seeing it on the front page a couple days ago. Search
| also seems to confirm this:
| https://hn.algolia.com/?dateRange=pastWeek&page=0&prefix=fal...
| russdill wrote:
| On production systems, it may often be better to completely
| ignore the problem at this level. On modern hardware if a disk is
| throwing an io error on write you are having a bad day. I can
| almost guarantee that while you might happily modify your code so
| it properly returns the error, there almost certainly aren't test
| cases for ensuring such situations are handled "correctly",
| especially since the error will almost certainly not occur in
| isolation.
|
| It may often be better to handle the issue as a system failure
| with fanotify. https://docs.kernel.org/admin-guide/filesystem-
| monitoring.ht...
| cryptonector wrote:
| Please no. Just handle errors from `close()` when you're
| writing files.
|
| > On modern hardware if a disk is throwing an io error on write
| you are having a bad day.
|
| And how would you know you're having a bad day if apps ignore
| those errors?
| russdill wrote:
| I'm arguing that the proper thing to do here is to kill the
| process along with whatever else is using the block device.
|
| Whether you handle the error or immediately or if you allow
| the error to occur after a defer, you still are almost
| certainly not handling it properly and are taking a speed hit
| for your troubles.
| Too wrote:
| Absolutely not. Imagine a text editor failing when I hit
| save. Do you really want that to crash the application? No.
| You want a fallback to ctrl+A ctrl+C, paste into a email
| and send to yourself. Giving the user a choice to save on a
| different drive is also a possibility, maybe your thumb
| drive just got a nudge and lost connection for a second.
|
| This is really all normal use cases under normal
| conditions.
| cryptonector wrote:
| Using signals for this really sucks. Anyways, you don't get
| to do this now because of backwards-compatibility. Just
| handle the error.
| nitwit005 wrote:
| A write failing is totally normal. Unplug an external hard
| disk.
| cryptonector wrote:
| You don't want to just `fsync()`, but also flush whatever is
| buffered and _then_ `fsync()`.
|
| Another thing is that if you're holding an flock on that file,
| it's nice that closing it will drop the lock. Generally you want
| to drop the lock as soon as you're done writing to the file.
| Deferring the dropping of the lock to the deferred close might
| cause the lock to be held longer than needed and hold back other
| processes/threads -- this doesn't really apply in TFA's example
| since you're returning when done writing, but in other cases it
| could be a problem.
|
| Do not risk closing twice if the interface does not specifically
| allow it! Doing so is a real good way to create hard-to-find
| corruption bugs. In general if you see `EBADF` when closing any
| fd values other than `-1`, that's a really good clue that there
| is a serious bug in that code.
| meling wrote:
| The following from the first update:
|
| if err := f.Close(); err != nil { return err } return nil
|
| Is equivalent to
|
| return f.Close()
| w10-1 wrote:
| Java's Closeable.close() has declared multiple invocations as
| safe for 20 years, when it first became an interface.
| jrockway wrote:
| I defer these AND check the errors.
| https://pkg.go.dev/go.uber.org/multierr#hdr-Deferred_Functio...
| has a nice API.
|
| I wrote my own version for $WORK package (as our policy is that
| all errors must be wrapped with a message), used like:
| func foo() (retErr error) { x := thing.Open(name)
| defer errors.Close(&retErr, x, "close thing %v", name)
| ... }
|
| You can steal the code from here:
| https://github.com/pachyderm/pachyderm/blob/master/src/inter...
|
| Finally, errcheck can detect errors-ignored-on-defer now, though
| not when run with golangci-lint for whatever reason. I switched
| to nogo a while ago and it found all the missed closes in our
| codebase, which were straightforward to fix. (Yes, I wish the
| linter auto-ignored cases where files opened for read had their
| errors ignored. It doesn't, so we just check them.)
|
| Multi-errors are nice.
| sedatk wrote:
| Never thought that `Close` could fail, as `CloseHandle` on
| Windows never fails. I wonder how .NET behaves in the same
| scenario on Linux.
| slaymaker1907 wrote:
| Best practice IMO would be to wrap the closable thing in a
| wrapper object that handles the case where Close is called
| multiple times and then defer close that one as well as closing
| at the end of the function. Another idea would be to define a
| small lambda that either returns the input value or returns error
| if Close returns an error if you want to use early returns.
___________________________________________________________________
(page generated 2024-09-10 23:01 UTC)