[HN Gopher] Context Control in Go
___________________________________________________________________
Context Control in Go
Author : todsacerdoti
Score : 148 points
Date : 2024-02-08 08:40 UTC (1 days ago)
(HTM) web link (zenhorace.dev)
(TXT) w3m dump (zenhorace.dev)
| j1elo wrote:
| This touches tangentially on a very interesting idea apart from
| contexts, at least for me as I've been recently learning Go:
|
| > _it's an anti-pattern for libraries to start their own
| goroutines. Best practices dictate that you should perform your
| work synchronously and let the caller decide if they want it to
| be asynchronous._
|
| This is something I had not discovered yet, probably because it's
| just a "common knowledge" thing that doesn't get explained in the
| Go tutorials, but regardless seems like a good idea in general.
|
| It also coincidentally threads well with something I read
| yesterday: _The bane of my existence: Supporting both async and
| sync code in Rust_ [1]. While not being well versed at all in
| Rust, I constantly had precisely this same thought: why not make
| a synchronous library by default, then let the application choose
| whether it wants to use it as-is, or to put an async runtime on
| top of it?
|
| It only makes sense to me, and I would apply this best practice
| from Go if I was trying to make a Rust library. Especially given
| that in Rust there is no "official" standard async runtime, so I
| believe that authors ought to not assume which runtime end users
| should be forced to depend on.
|
| [1]: https://news.ycombinator.com/item?id=39061839
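A minimal sketch of the quoted advice, assuming a hypothetical library function `CountWords` and a caller-side helper `countAsync` (neither name comes from the article): the library stays synchronous, and the caller decides whether to wrap it in a goroutine.

```go
package main

import (
	"fmt"
	"strings"
)

// CountWords is a hypothetical library function. It does its work
// synchronously and starts no goroutines of its own.
func CountWords(text string) int {
	return len(strings.Fields(text))
}

// countAsync is written by the caller, not the library: it runs the
// synchronous function in a goroutine and delivers the result on a channel.
func countAsync(text string) <-chan int {
	out := make(chan int, 1) // buffered so the goroutine never blocks on send
	go func() {
		out <- CountWords(text)
	}()
	return out
}

func main() {
	// Synchronous use: just call it.
	fmt.Println(CountWords("context control in go"))

	// Asynchronous use: the caller opts in and collects the result later.
	result := countAsync("let the caller decide")
	fmt.Println(<-result)
}
```

Going the other way (turning an API that only hands back a channel or promise into a plain blocking call) is possible too, but the caller then inherits whatever goroutine lifetime the library picked.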
| jeffbee wrote:
| I agree with not forcing an execution model on your callers, if
| you can avoid it. I also try to extend the rule to be as
| flexible as possible for callers. For example, in C++, I don't
| like to see a function that returns a container. I prefer a
| function that accepts an output iterator as a parameter.
| Nullabillity wrote:
| There are two different questions here:
|
| 1. Should the code be async-aware? That is, should it be able
| to yield to other tasks?
|
| 2. Should the code launch background tasks?
|
| 1 is a cross-cutting concern, in a cooperative multitasking
| environment, anything that is unable to yield is, well,
| preventing other tasks from running.
|
| In Go, 1 is always true, and implemented by the runtime itself.
| In Rust, this means using async I/O, like preferring tokio over
| std.
|
| 2 is what the advice is about. The Rust equivalent here would
| be tokio::spawn.
|
| Writing async-independent Rust code requires you to tackle 1,
| maybe by writing a trait for all the I/O actions that you
| require from your executor, which could return blocking futures
| if you select to run synchronously.
|
| That said, blocking also prevents you from using async for
| structured concurrency in the library implementation, which may
| or may not be a big deal for your use-case.
| disintegrator wrote:
| I think it's totally fine for libraries to spin up goroutines
| internally. My interpretation is that the library's public
| interface should appear to be synchronous. As a contrived
| example:
|
|     package main
|
|     import (
|         "context"
|
|         "example.com/spider"
|     )
|
|     func main() {
|         ctx, cancel := context.WithCancel(context.Background())
|         defer cancel()
|         _, _ = spider.Crawl(ctx, "https://en.wikipedia.org")
|     }
|
| In this case `Crawl` is a blocking call and, under the hood, it
| may well spin up a pool of goroutines to crawl a site. It's
| also really nice that context is available to tie the lifetimes
| of the main program (goroutine) to child goroutines without
| coloring functions (like with async-await).
|
| I used to work with the Go pubsub client
| (https://pkg.go.dev/cloud.google.com/go/pubsub) a lot and that
| has a whole bunch of scheduling and batching functionality
| handled in goroutines and on the outside you're calling
| `topic.Publish(ctx, &pubsub.Message{Data: []byte("payload")})`.
| xyzzy_plugh wrote:
| If you're willing to plumb through all the knobs imaginable
| then this is a fine approach. But if spider.Crawl just ran
| unbounded or with a fixed bound it could trivially become a
| huge headache.
|
| There are many patterns in Go that are preferable to just
| starting a bunch of concurrent work.
|
| For example the pubsub client has options to disable batching
| and limit in flight connections.
| disintegrator wrote:
| Indeed, it's down to the library API to give you the right
| knobs to control for concurrency. It says more about the
| quality of the library if it mismanages goroutines than it
| does about whether or not libraries should use goroutines
| at all.
| twic wrote:
| > There are many patterns in Go that are preferable to just
| starting a bunch of concurrent work.
|
| I have not worked in Go for several years, but i remember
| that when i did, more experienced people told me that this
| was exactly what you were supposed to do in Go. That where
| in another language you might set up a queue and a
| threadpool and so on, in Go, you should just spawn a load
| of goroutines, and let the runtime sort it out.
|
| Is this no longer the canonical approach?
| jerf wrote:
| It's kind of an "I know it when I see it" situation. To sit
| down and try to write a rigidly-specified set of rules on
| when it is and is not OK for a library to spin goroutines
| would be very difficult.
|
| Yet the basic principle isn't that hard: Your code should
| generally be what is considered to be "sync", and it is up to
| the user to decide if they want that to be "async" by using a
| goroutine themselves.
|
| This rule is primarily for libraries that try to be "helpful"
| by, say, decoding an image unconditionally in a goroutine or
| something and providing a "promise" of some sort you can read
| the results from. Don't do that. If a Go programmer wants a
| "promise"-like behavior, any Go code can be so converted by
| an end-user at any time and the best thing the library can do
| in that case is just stay out of the way of the already-ever-
| present features that allow you to do that.
|
| But on the flip side, I expect a library implementing a
| parallel map to have its own goroutines. As a parallel map
| user, I basically don't want to see them or have to think
| about them. At best, maybe the library has some knobs I can
| tune, but I don't want to be managing them. That would defeat
| the entire purpose of such a library. A deliberately
| recursive and parallel crawling library, documented as such
| and where that feature is its major utility, fits into
| this category. By calling ".Crawl" I am clearly asking for
| this functionality explicitly, by the nature of the contract
| of the library. Which is also a good use case for structured
| concurrency, which Go does not explicitly implement into its
| language but still makes for an easier and safer library than
| the alternative.
| skybrian wrote:
| It seems like it would be better if the concurrency were
| pluggable somehow. Maybe Crawl takes some kind of worker-
| starting interface, with a suitable default implementation?
|
| Then the job of the crawler is to find new units of work,
| not to schedule them. In theory it could be done single-
| threaded by pulling work from a queue.
| jerf wrote:
| That would be an inner platform:
| https://en.wikipedia.org/wiki/Inner-platform_effect
|
| The Go scheduler is already taking units of work called
| goroutines and scheduling them. It's no big deal to ask
| the crawling system to have some limit on how many
| goroutines it'll use, the patterns for that are well-
| established, and also necessary because it's not all
| about the goroutines in this case. Crawling needs
| controls to limit how many requests/sec it makes to a
| given server, how deeply to recurse, what kind of
| recursion, etc. anyhow so it's not like it particularly
| sticks out to also have a concurrency parameter.
| skybrian wrote:
| Fair enough. But I'm not sure a wrapper around starting a
| goroutine counts as an inner platform, because it's not
| doing much work, and it's not really work that the Go SDK
| does. Choosing when to start goroutines and how many to
| start is an application concern.
|
| Depending on how it's done, it might be a decent way to
| structure a crawler?
| javcasas wrote:
| You know what happens when your library spins a goroutine and
| that goroutine crashes? Your program crashes, and you don't
| have the chance of putting any recovery on it.
|
| Your library with buggy goroutines takes down the whole
| program, and there is nothing you can do to fix it.
| VonGallifrey wrote:
| Isn't that the same if you make a completely synchronous
| library without any goroutine? If your library panics then
| the whole program crashes.
| skybrian wrote:
| It's not the same because for any single goroutine, you
| can catch the panic at top level. But it only works if
| you wrote the top-level code for each goroutine. (For a
| completely synchronous program, you wrote the main
| function.)
|
| If all the work is done with one function call, it might
| be pretty similar to a program crash, except that you can
| log or restart in main, and you could use it as part of a
| larger program that does other stuff too.
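The point about catching a panic "at top level" can be sketched as follows. `safeGo` is a hypothetical helper name; the key fact is that `recover` only has an effect inside a deferred function in the goroutine that panicked, which is why it helps only when you wrote that goroutine's entry point.

```go
package main

import (
	"fmt"
)

// safeGo runs fn in a new goroutine and converts a panic into an error
// sent on the returned channel. The deferred recover lives in the
// goroutine's own top-level function; a recover in the caller's goroutine
// would not see the panic.
func safeGo(fn func()) <-chan error {
	done := make(chan error, 1)
	go func() {
		defer func() {
			if r := recover(); r != nil {
				done <- fmt.Errorf("recovered: %v", r)
				return
			}
			done <- nil
		}()
		fn()
	}()
	return done
}

func main() {
	err := <-safeGo(func() { panic("boom") })
	fmt.Println(err) // recovered: boom

	fmt.Println(<-safeGo(func() {})) // <nil>
}
```

If a library starts the goroutine itself and does not install such a handler, the caller has no place to put one.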
| throwaway894345 wrote:
| I think it's fine for libraries to provide a toplevel
| `Crawl()` method which manages the goroutines as a
| convenience, but these libraries should expose the more
| parameterized methods as well so callers can have more fine-
| grained control.
| randomdata wrote:
| _> This is something I had not discovered yet_
|
| No doubt because the Go team disagrees. They have been
| abundantly clear that, from their point of view, goroutines
| should be used and even used as part of the public API _when it
| makes sense_.
|
| That said, it is still probably really good advice for
| newcomers who won't have a good understanding for the cases
| where it does make sense, and especially because goroutines are
| the shiny thing every newcomer wants to play with and will try
| to find uses that don't make sense just to use it. As a rule,
| you don't want to force the caller into things they might not
| want. In the vast majority of cases, a synchronous API is what
| you will want to give them as it is the most versatile.
|
| And, really, that's something that applies generally. For
| example, in the case of the common (T, error) pattern, you will
| want to ensure T is always useful even when there is an error.
| The caller may not care about the error condition. That's not
| for you, the library author, to decide. The fewer assumptions
| you can make about the caller the better.
| tsimionescu wrote:
| > For example, in the case of the common (T, error) pattern,
| you will want to ensure T is always useful even when there is
| an error. The caller may not care about the error condition.
|
| This applies to maybe 0.1% of functions. The overwhelming
| majority of functions, in the Go stdlib as well as real
| projects, that return (T, error) return an empty meaningless
| value for the error cases.
| randomdata wrote:
| Not my experience. Some _very_ early code did not recognize
| this, but since then pretty much everyone has come to agree
| that values should always be useful. If you are writing a
| function today, there is no reason to not observe this.
|
| In practice, that typically means returning the zero value.
| To which idioms suggest that zero values should be useful.
| Rob Pike's Go Proverb[1] even states: "Make the zero value
| useful." Most commonly when returning (T, error) that zero
| value is nil. In Go, nil is a useful value!
|
| If the caller wants to observe the error state, great. But
| it is needlessly limiting if you force it upon them. That
| is not for the library author to decide.
|
| [1] https://go-proverbs.github.io
| thowdfasdf23411 wrote:
| Practically speaking, the (T, error) pattern is pervasive
| because there isn't any other alternative. Go simply
| lacks sum types.
|
| > Not my experience.
|
| What experience do you speak of? My 5,000+ hours in
| kubernetes and terraform space tells me Rob Pike's views
| are fan fiction at best.
| randomdata wrote:
| Let's be real, Kubernetes is a Java project with code
| that just happens to share some resemblance to Go syntax.
| It's also one of the oldest projects using Go, long
| predating the "make zero values useful" proverb, so it is
| not surprising that it doesn't follow the idioms
| recognized today. Idioms cannot be conceived in advance.
| They emerge from actual use after finding out what works
| and what doesn't.
|
| What new code being written today is violating that
| pattern?
| eweise wrote:
| I usually return zero values just because it's easy, not
| because it's useful. I don't expect the caller to use the
| return value if err !=nil and haven't heard anything to
| the contrary on my team. If Go were a more powerful
| language, we would be returning Either[A,B] not multiple
| return values, which would guarantee that you rely on one
| or the other, not some weird in-between case.
| randomdata wrote:
| _> I don't expect the caller to use the return value if
| err !=nil and haven't heard anything to the contrary on
| my team._
|
| Yet you admit to following the advice for error,
| returning the zero value for err and making it useful
| when you do. If you don't have a meaningful error state,
| why not just return junk? Clearly you recognize the value
| of making the return values useful, always. Why make
| exceptions?
| eweise wrote:
| if I need to return some person,error how do I return
| junk for the person? I just return person{}, error. I
| guess I could fill person out with a bunch of silly
| values but why would I do that work? If there was some
| easier way to make a person and it was filled with junk,
| I wouldn't hesitate to use it because the caller would
| never use the value.
| randomdata wrote:
| Logically, in that case you would return nil, just like
| you do for error. There is no person to return. nil is
| how Go signifies the absence of something. nil is useful,
| as proven by error. Why make exceptions?
|
| It's funny how people forget how to write software as
| soon as the word error shows up. I don't get it.
| ongy wrote:
| Because nil panics on member accessors... It's the
| opposite of what you claim to be the standard in go.
|
| Thanks for demonstrating that you forget how to write
| software around errors.
| randomdata wrote:
| What are you returning for error in its "junk" state,
| then? Clearly not nil, else by your assertion your code
| will panic. error has member accessors you will call -
| Error() if nothing else.
|
| Methinks you've not thought this through. What's it about
| the word error that trips up programmers like this?
| ongy wrote:
| On top of the issue with nil not being a useful value for
| most types
|
| Nil requires pointer values. I.e. it's impossible to know
| whether something is a pointer to allow for nil, or
| because a copy would be prohibitively expensive and
| therefore references are used, or even because it's into
| a mutable structure.
|
| Go's overlapping of implicit nullability and by value/by
| reference marker make it entirely useless to build
| information into APIs / necessarily promotes the value
| into a different type to use.
| eweise wrote:
| That's not what my team of gofers decided. Apparently
| zero is better than nil. I know how to write code but Go
| is its own thing. I mean the idea of returning multiple
| values is totally goofy in itself.
| randomdata wrote:
| _> Apparently zero is better than nil._
|
| Zero often is better than nil. Consider something like
| atoi. If it fails, 0 is often exactly what you want. No
| need to care about any error state. Although the error
| state is there if your situation is different. The caller
| gets to choose.
|
| But for something like a person that doesn't exist, nil
| is almost assuredly the appropriate representation. You
| didn't end up with an empty person, or a made up person,
| you ended up with no person. nil is how the absence of
| something is represented. Same reason you return nil when
| there is no error.
|
| There seems to be no disagreement that nil is the proper
| return value for cases where there is no error. Why would
| no person be different?
|
| _> I mean the idea of returning multiple values is
| totally goofy in itself._
|
| It is, but then again so is accepting multiple inputs.
| Neither is mathematically sound, but they have proven
| useful in practice.
| eweise wrote:
| yeah I get it. you're talking common sense but I'm coding
| in Go.
| randomdata wrote:
| Yes, the biggest mistake Go made was introducing the
| error keyword.
|
| It should have used banana. If it were (T, banana),
| nobody would have trouble with these concepts. There's
| just something about the word error that causes
| programmers to lose their mind for some reason.
| ongy wrote:
| No. And GP explicitly said they don't tend to make it
| useful, but only do it when it's easy.
|
| Making the return value of e.g. a database handle always
| "useful" is a ridiculously dangerous idea that can lead
| to application bugs further down the route because some
| list/get returned an empty value to continue the pattern
| of "useful" empty values.
|
| The main reason there is ever a useful error next to a
| non-nil err is because go doesn't have a useful way to
| not do it.
| thowdfasdf23411 wrote:
| > What new code being written today is violating that
| pattern?
|
| You are putting the burden of proof on me now? How
| unfair. You didn't bring any. Go to CNCF and pick
| anything written in Go.
|
| > Let's be real, Kubernetes is a Java project
|
| Let's be real, Rob Pike is the flat earther of PLT. Sum
| types are Rob Pike's Foucault pendulum.
| randomdata wrote:
| _> You are putting the burden of proof on me now?_
|
| No. I don't give a shit about what you do. Where did you
| dream up this idea?
|
| _> Let's be real, Rob Pike is the flat earther of PLT._
|
| No doubt, but when using the programming language of flat
| earthers, one has to accept that the particular world is,
| indeed, flat.
|
| But the advice is undeniably sound. There is no
| programming language where you should leave someone
| hanging with junk values. You might avoid junk in other
| languages using some other means (e.g. sum types), but it
| is to be avoided all the same.
| tsimionescu wrote:
| Zero and nil values are almost always bogus. The Go
| language itself doesn't even respect that proverb: the
| zero value of a map is not a useful map.
|
| There are some rare cases where a 0 value is actually
| meaningful in some way. But even for types where it is
| fully functional like integers, it's often not meaningful
| in the specific context it is used.
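The map claim is easy to demonstrate: the zero value of a map is nil, which can be read and measured but panics on assignment. `tryWrite` is a hypothetical helper that reports whether the write panicked.

```go
package main

import "fmt"

// tryWrite reports whether assigning into m panicked.
func tryWrite(m map[string]int) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	m["k"] = 1
	return false
}

func main() {
	var m map[string]int // zero value: a nil map

	fmt.Println(len(m), m["missing"])       // 0 0 (reads are fine)
	fmt.Println(tryWrite(m))                // true (assignment to a nil map panics)
	fmt.Println(tryWrite(map[string]int{})) // false (an initialized map accepts writes)
}
```

So the nil map is half useful: fine as an empty read-only map, not as something you can fill in.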
| deergomoo wrote:
| One problem I've found as a newcomer to Go (and I'm
| perfectly willing to accept that I just haven't developed
| the right "language mindset" yet) is that the zero value
| can be problematic--particularly for scalar types--
| because it's often a perfectly valid value in a model
| where you need a way to indicate an invalid value.
|
| Obviously if there is a possibility of invalidity, you
| would expect the caller to check the error, but the fact
| that I always have to return _something_ as the callee,
| and always have to make sure I'm not accidentally using
| the value in error conditions as the caller, is just
| asking for mistakes to me.
|
| I appreciate that it's not the path Go has chosen to
| tread, but I find Result<T, Error> to be so much more of
| a foolproof pattern than (T, error), especially
| considering prevention-of-foot-shooting is an established
| Go design goal.
|
| (Equally obviously you could use a pointer and return
| nil, but I find that muddles the semantics, because there
| are multiple reasons you might opt to use pointers
| besides the ability to express "no value".)
| Kamq wrote:
| If the zero value is valid, I usually just use a pointer
| to the scalar type in question
| randomdata wrote:
| Given (T, error), what do you return for error when no
| error occurred? When error is "invalid"? The caller is,
| no doubt, expecting you to be consistent, so the answer
| for T no doubt lies therein.
|
| There is nothing special about errors.
| Joker_vD wrote:
| I've seen just two APIs that returned non-nil/non-default
| T (representing the partially completed work) with a non-
| nil error, and those were a constant source of bugs and
| errors. I've changed those to always return dummy empty
| T, and even though the retries now hurt performance more
| (they could not re-use partial completed result), it was
| a much more straight-forward code.
| acaloiar wrote:
| > it's an anti-pattern for libraries to start their own
| goroutines. Best practices dictate that you should perform your
| work synchronously and let the caller decide if they want it to
| be asynchronous.
|
| This line reminded me that we all need to beware of the
| "anti-pattern" police. Some developers use this term effectively by
| explaining precisely why something is an "anti-pattern".
|
| But more often it's used as a way to shut down conversation and
| any actual critical thinking. There's a lot of nuance behind
| what makes something an "anti-pattern", and simply declaring
| something an anti-pattern isn't enough. "This is an anti-
| pattern and this is why ..." is enough. But I still avoid using
| the term regardless.
|
| FYI I'm not saying the author is the anti-pattern police, but
| it does sound like they've found themselves on the police's
| radar.
| throwaway894345 wrote:
| >> it's an anti-pattern for libraries to start their own
| goroutines
|
| The specific argument here is that by writing sync functions,
| the library is more abstract because the caller can decide
| whether to run the function sync or async. I agree with this,
| but there are lots of areas where we could issue guidance to
| make libraries more abstract.
|
| For example, instead of a library function which returns a
| pointer to allocated memory (e.g., `NewFoo() *Foo`) we should
| write functions which take pointers to memory and the caller
| can figure out whether to allocate them on the stack or the
| heap (e.g., `NewFoo(out *Foo)`). I'm not advocating for this as
| a general rule of thumb because writing that kind of code in Go
| would not be very ergonomic, but there's a lot of performance-
| sensitive code even in the standard library that is written
| that way.
|
| Another example would be 'inversion of control', wherein
| functions take interface parameters and callers decide what
| implementation to pass in.
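The two constructor shapes described above might look like this. `Foo`, its fields, and `InitFoo` are hypothetical names for illustration. One caveat: in Go the compiler's escape analysis has the final say on stack versus heap, so the out-parameter form gives the caller more influence rather than a guarantee.

```go
package main

import "fmt"

// Foo is a hypothetical type used only for illustration.
type Foo struct {
	Name  string
	Count int
}

// NewFoo allocates the value itself and hands back a pointer, so the
// library chooses how the value is created.
func NewFoo(name string) *Foo {
	return &Foo{Name: name, Count: 1}
}

// InitFoo fills in memory supplied by the caller, so the caller chooses
// where the value lives (a local variable, a struct field, a slice element).
func InitFoo(out *Foo, name string) {
	out.Name = name
	out.Count = 1
}

func main() {
	p := NewFoo("a")

	var f Foo // caller-owned storage
	InitFoo(&f, "b")

	fmt.Println(p.Name, f.Name, f.Count) // a b 1
}
```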
| divan wrote:
| It might also come from the common sense of using the CSP model
| (goroutines and messages, basically). Being asynchronous or
| synchronous is not the property of the function; it's the
| caller's decision. So every function is "synchronous" and it's
| up to the caller to decide when to spin it into the background
| and what for.
|
| As others commented, it doesn't preclude you from using
| goroutines in libraries if you _really need to_. It is just
| important to remember that it's not obvious to the caller.
|
| In light of this "common sense", async/await concurrency
| concept doesn't make any sense. Why would function dictate how
| exactly it should be called by a caller? Is "watching TV" an
| async or sync action? Depends on the caller - whether they put
| all their attention into this action or doing it "in the
| background" while performing other tasks. It's not an inherent
| property of the "watching TV" function. I have no idea why so
| many people think that async/await is a good idea for
| expressing concurrent systems.
| NorthTheRock wrote:
| I've always seen this as the exact opposite view - from go's
| concurrency model, every function is "synchronous" so the
| caller is not given a choice, if they want to run it
| asynchronously they have to create a new thread, then if they
| care about the result deal with inter-thread communication.
|
| With async/await, you're explicitly giving control to the
| caller to decide, you can await this promise now and have the
| thread treat it as synchronous, you can spawn a new task to
| run it in the background, or you can join it with other
| promises where you don't care about order and await for the
| results of the group.
| divan wrote:
| Interesting. I see two different aspects here:
|
| 1) Mental model. My claim comes from the firm belief that
| the more code is aligned with how we think, the easier it
| is to reason about the code. I naturally think about
| actions as they are not async or sync by nature - rather,
| it's me who's in charge of how the action is going to be
| executed (back to my "watching TV" example). Human
| attention here serves as an analogy to utilizing the
| logical CPU core during runtime.
|
| 2) Performance consideration. What you described indeed can
| work, too, but it comes at a cost. With Go, yes, you have
| to handle async results yourself (if you care about
| results), but you now understand the price of this and can
| make better judgments of the code and complexity and have
| better performance overall.
| JohnCClarke wrote:
| TBH: Context feels like a wart to me. It works, but it's not
| elegant. Golang has an aversion to thread^H^H^H^H^H^Hgoroutine
| local storage. Instead it provides this kludgey experience.
|
| I really feel that Golang V.2 should invent a better, native, way
| of controlling threads of execution.
| gray_-_wolf wrote:
| I always thought that context should just be a goroutine local
| thing always available, automatically inherited when `go` is
| executed (obviously with the option to set an explicit one).
| mariusor wrote:
| I think Jonathan Blow's programming language, Jai, has an
| implicit context available to any function. However in Jai
| the context has a lot of implicit functionality, off the top
| of my head at least logging plumbing, and allocator plumbing.
| swdunlop wrote:
| I think of contexts as being Go's answer to dynamic variables
| in earlier languages, like Lisp, and less like thread local
| storage (like Pthreads). Much like how Go works with errors,
| explicit is favored over implicit -- being able to see the
| context pass through, and whether a function expects a
| context, tells you a lot about the function you are about to
| call.
|
| If a function does not take a context, you know it probably
| cannot be interrupted, just like when a function does not
| return an error, you know it should not fail. In my work
| projects, this is also a cue that the function does not do
| any logging since we always carry a zerolog.Logger in our
| contexts enriched with trace information about the request
| and handler.
|
| This also makes life easier for me as a reviewer -- I can see
| the context passing, I can spot when there is a bad pattern,
| like retaining a context, or failing to handle an error. It
| does not require me to maintain a detailed mental map of
| which functions employ dynamic variables or can throw
| exceptions.
| valenterry wrote:
| The only way to do that would be to introduce a concept that is
| orthogonal to functions (and their signatures) / errors, and
| that would mean an incredible increase in complexity. I doubt
| the Go authors will do that.
| bborud wrote:
| Hmm, I think you think of context differently from me. For me
| context is something you use to manage execution. I mostly use
| it to notify different parts of the program that "you can stop
| what you are doing now". For instance if you are processing a
| request and the client went away. Or you have run out of time.
|
| You're talking about context as a way to distribute data? I do
| that as well. For instance to provide auth/session data to
| requests, but that's usually just limited to one path in my
| software that does this. (I agree it is clumsy, but not because
| it is the "wrong" thing to do, but rather the API feels a bit
| dodgy).
|
| If you are talking about something like thread-local storage,
| that's really a very different thing from both the control
| aspect of context and the request data aspect.
|
| What extra functionality do you want for goroutine control and
| why do you think Go needs it?
| d0gsg0w00f wrote:
| What annoys me about Context is there's no way to tell if
| it's honored by the callee.
|
| And when I'm accepting Context I'm annoyed at having to write
| handlers for it all through the stack having no idea how/if
| people will use it.
| bborud wrote:
| > What annoys me about Context is there's no way to tell if
| it's honored by the callee.
|
| On the occasions where I've needed that I've used a
| WaitGroup and done wg.Add(1) at the point where I start
| goroutines and then have a defer wg.Done() as the first
| thing in the goroutine. I don't think the functionality
| belongs in Context. And if you put it there, you'd just end
| up complicating things.
|
| > And when I'm accepting Context I'm annoyed at having to
| write handlers for it all through the stack having no idea
| how/if people will use it.
|
| How would you propose you do it instead?
| tsimionescu wrote:
| > On the occasions where I've needed that I've used a
| WaitGroup and done wg.Add(1) at the point where I start
| goroutines and then have a defer wg.Done() as the first
| thing in the goroutine. I don't think the functionality
| belongs in Context. And if you put it there, you'd just
| end up complicating things.
|
| I'm not sure this is the same thing. The point of Context
| is to propagate cancelations or timeouts across multiple
| layers of your app and libraries, it's not supposed to be
| useful for directly started goroutines.
| bborud wrote:
| Perhaps not, but if you have no idea what ought to be
| cancelled how would it help you to know that something
| has been cancelled?
|
| What changes would you make to Context?
| rad_gruchalski wrote:
| > What annoys me about Context is there's no way to tell if
| it's honored by the callee.
|
| Everybody has to be decent enough to do their part.
|
| > And when I'm accepting Context I'm annoyed at having to
| write handlers for it all through the stack having no idea
| how/if people will use it.
|
| Thank you for doing yours :)
| cle wrote:
| I really disagree with this. A function taking a Context is a
| really important signal to me about the semantics of that
| function. I also much prefer being able to see context values
| explicitly passed around, instead of values that magically
| appear out of the ether, without a clear code path to find out
| where they came from, what goroutine it's bound to, where that
| goroutine came from and its lifecycle, etc.
| eweise wrote:
| The context is just a big bag of stuff. you don't know what's
| really in it. Ends up almost any method that needs something
| from the big bag ends up having a context parameter, but you
| don't know why that method needs it.
| cle wrote:
| Separating these concerns (cancellation vs. bag-of-request-
| scoped stuff) might make sense. I'm specifically talking
| about the cancellation side of contexts. I don't think
| there's a good answer to this other problem of whether
| those two concerns should be combined into one mechanism,
| the options that I know about all have mixed tradeoffs.
|
| I still think an explicit bag-of-stuff is better than an
| implicit one though.
| eweise wrote:
| Believe me I've tried to write Go code that doesn't
| follow the Go conventions and can't get my PRs approved
| even with tiny differences. So the idea that I could
| separate these two concerns might be a good one, but in
| practice it would be impossible.
| skybrian wrote:
| Right, you don't know what's in it, but you know you need
| to forward it if you start a goroutine.
| eweise wrote:
| I also need to forward it to non-goroutines because there is
| a logger in the context and almost every function wants
| to use the logger. So essentially, we have to almost
| always pass context as the first argument to any function
| unless it's some trivial private function.
| badrequest wrote:
| Treating a context value as a bag to fetch data out of is
| the first mistake. They should only ever be used to control
| things like deadlines and whether or not a part of a
| function executes. IMHO they should disallow attaching
| values to a context.
| leoqa wrote:
| Are we missing a thread pool/executor like abstraction for
| workers? If we really want callers to be able to control
| concurrency primitives deeper down the stack, we should
| coalesce on an executor paradigm that the library can use as
| its work queue.
| skybrian wrote:
| An alternative would be a language that has structured
| concurrency built in. [1]
|
| The rules around goroutines and context seem to point in the
| direction of structured concurrency. For example, if any
| goroutines started in a function get cleaned up before return
| then that's following the rules of structured concurrency.
|
| Thread-local storage is bad because it's implicit and causes
| bugs when used with concurrency; if you farm out some work to
| another goroutine, it will break.
|
| [1] https://en.wikipedia.org/wiki/Structured_concurrency
| zupa-hu wrote:
| > Thread-local storage is bad because it's implicit and
| causes bugs when used with concurrency; if you farm out some
| work to another goroutine, it will break.
|
| Can you elaborate on why being implicit is bad and how it
| causes bugs?
|
| I understand that shared data (via pointers) may cause race
| conditions and other unexpected behavior, so let's say we
| require that the thread-local storage can only store values
| (with value semantics).
|
| If you could point out any issues with that, I'd greatly
| appreciate it.
| skybrian wrote:
| It's been a while, but the underlying issue is that threads
| aren't always one-to-one with server requests. You can have
| a request where some work is handled by multiple threads.
| Or, a single thread can do some work for multiple requests.
|
| So one possible bug is that you have a function that
| implicitly depends on thread-local storage, and then you
| move some work to another thread and call the function
| there, and it doesn't work because its dependencies aren't
| there. You need to manually set up the thread-local storage
| of each new thread.
|
| Another bug is that if a thread does work on multiple
| requests (say, a task queue), some thread-local storage
| could leak data from a different request.
|
| In larger systems, it might even be worse: one request can
| be farmed out to multiple _servers_ and then you need to
| pass the context along over the network when doing rpc.
| This only works for serializable data, but things like
| deadlines can be propagated, and a request id that ties it
| all together is useful for logging.
|
| "Which request am I working on" is something that's
| transient and often doesn't map directly to OS-level
| objects. (Although it does map one-to-one in simple cases.)
| liampulles wrote:
| I think it splits people the same way that err values split
| people, which is that Go makes more things values in service of
| making the control flow plain.
| kevincox wrote:
| This has a lot of interesting accidental rebuttals to some
| "features" of go.
|
| > If you're not in an entry-point function and you need to call a
| function that takes a context, your function should accept a
| context and pass that along.
|
| This is in contrast to the fact that Go's cheap threading means
| that you don't need to colour your functions with async or not
| async. But this quote says that you sort of have to do this
| with context or no context.
|
| It isn't quite as bad as you can "skip steps" such as passing a
| context to a callback directly rather than needing the function
| that calls the callback to support contexts. But still in general
| your functions do have colour if you want to use contexts
| properly.
|
| > most seasoned Go devs would leap out of their seats to tell you
| it's an anti-pattern for libraries to start their own goroutines.
|
| If goroutines are so cheap then why not let the library spawn
| them. As long as the interface doesn't reveal if they are being
| used or not it shouldn't matter.
| icholy wrote:
| > This is in contrast to the fact that Go's cheap threading
| means that you don't need to colour your functions with async
| or not async. But this quote says that you sort of have to do
| this with context or no context.
|
| This doesn't have the usual problems associated with colored
| functions though (ie, calling async functions from non-async
| functions or vice-versa). If you don't need cancellation, pass
| a `context.Background()` and you're done.
|
| > If goroutines are so cheap then why not let the library spawn
| them. As long as the interface doesn't reveal if they are being
| used or not it shouldn't matter.
|
| Agreed. Providing a synchronous API is what's important.
| mariusor wrote:
| > you don't need to colour your functions with async or not
| async.
|
| Context is useful for synchronous functions also, it has
| nothing to do with async.
| jerf wrote:
| You can't extend the concept of "coloring" a function to all
| possible environment and parameters a function needs to
| execute. That's not because that's a useless concept; I
| actually find it a very important concept to be thinking about
| and I often explicitly think in terms of trying to minimize the
| size of such things in my code. But you can't extend the
| "coloring" concept that far because you've stretched it all out
| of shape at that point, into an entirely different concept. All
| code in all languages everywhere is going to have state that
| flows through some combination of function parameters through
| the program and have certain requirements for that state
| without which the functions (methods/whatever) will not run.
|
| Coloration is a very particular very strong instance of such
| things that is so strong it causes its own special effects and
| imposes very special constraints on the code. Generally if you
| need to call something that wants a context but you don't have
| one, you just pass in the trivially-obtained
| "context.Background()" and move on. Nowhere near the level of
| blockage as a color issue.
|
| "If goroutines are so cheap then why not let the library spawn
| them."
|
| It's not about cost, it's about software engineering, and it's
| a particular antipattern you may not know about if you're not
| in the community. As I said in another post, many libraries
| "helpfully" spawn goroutines to do a thing and offer a promise-
| like interface to the results. This is the core antipattern
| being referred to, which I've seen quite a lot. The resulting
| API is complexified relative to simply having a function that
| takes parameters and returns results. If you write such a
| complex API, an end-user of that API can't uncomplexify it.
| However, if you write the simple, normal function, an end-user
| of your API who _does_ want that additional functionality can
| trivially add it, and moreover, they can add it in whatever
| other combination of things they may want, e.g., perhaps your
| library is part of a three-step pipeline you choose to run in
| its own goroutine, or some other complex threading setup you
| need. It is better for a library to provide the simple
| "synchronous" API than to try to guess and possibly even as a
| result forestall the real setup you need.
|
| It isn't a hard-and-fast rule that libraries must never spawn
| goroutines, it's a particular set of antipatterns being
| referred to.
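|
| A rough sketch of that argument, with hypothetical names (`Sum`
| stands in for the library function, `sumAsync` for what a
| caller might write). The library exposes only the plain
| function; the promise-like wrapper is a few lines on the
| caller's side:

```go
package main

import "fmt"

// Sum is the plain synchronous API: parameters in, result out.
func Sum(nums []int) int {
	total := 0
	for _, n := range nums {
		total += n
	}
	return total
}

// sumAsync is what a caller can write themselves if they want a
// promise-like result; the library does not need to provide it.
func sumAsync(nums []int) <-chan int {
	result := make(chan int, 1) // buffered so the goroutine never blocks
	go func() {
		result <- Sum(nums)
	}()
	return result
}

func main() {
	pending := sumAsync([]int{1, 2, 3})
	// The caller is free to do other work here before collecting the result.
	fmt.Println(<-pending)
}
```

| Going the other way, from a channel-returning API back to a
| plain call, is the part the caller cannot do cleanly.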
| the_gipsy wrote:
| > You can't extend the concept of "coloring" a function to
| all possible environment and parameters a function needs to
| execute.
|
| It's not the particular abstraction, it's the concept of two
| different colors: functions that take or don't take
| context.Context. Ultimately we do have two colors. Seasoned
| go devs will have refactored some to the other by drilling
| through `ctx` or removing it and know exactly what "color"
| is.
|
| > many libraries "helpfully" spawn goroutines to do a thing
| and offer a promise-like interface to the results. This is
| the core antipattern being referred to, which I've seen quite
| a lot. The resulting API is complexified relative to simply
| having a function that takes parameters and returns results.
| If you write such a complex API, an end-user of that API
| can't uncomplexify it. However, if you write the simple,
| normal function, an end-user of your API who does want that
| additional functionality can trivially add it, and moreover,
| they can add it in whatever other combination of things they
| may want
|
| Do you not read that as "two colors"?
| jerf wrote:
| "Ultimately we do have two colors."
|
| If you consider this a color, we don't have two colors. We
| have millions.
|
| As I said, it is not that such a concept would be useless;
| I use it all the time. But it's not "color" any more.
|
| "Do you not read that as "two colors"?"
|
| That is in response to a completely different question
| about the cheapness of goroutines, not coloration.
|
| Coloration is much stronger than you seem to be
| understanding. It is not "oh, this function requires that
| parameter and that one does not, so they must be different
| colors". It is that you _can't_ correctly run an async
| function from within a synchronous one and vice versa in
| the languages in which these are considered completely
| different things (which does not include Go). While
| conversion is ultimately possible, it is expensive and
| high-consequence. If you have a hard time seeing that
| because, say, a sync function can "simply" bring up an
| async execution engine for its async calls and an async
| function can "simply" spawn an entire OS thread to run sync
| code and collect the results through a promise, consider
| nesting such an approach arbitrarily deeply as a deep call
| stack alternates between async and sync calls, which is
| 100% realistic. It becomes more clear how high-consequence
| this is if you remember that such programming language
| constructs must be able to compose essentially arbitrarily
| deeply.
|
| Context parameters can be satisfied by no-context functions
| simply by passing "context.Background()", and the result is
| low-consequence. The code does not come apart at the seams,
| the code using contexts simply ends up not having any data
| come from the background (empty) context and the background
| context will never generate a cancellation event, which is
| apparently what the caller wants since they are asking for
| that more-or-less explicitly. If that is not what the
| caller wants, "ctx, cancelF :=
| context.WithTimeout(context.Background(), time.Second)" is
| also trivially available, correct, and low-consequence. If
| you define coloration down to this level, you completely
| lose the entire point of the original essay, which is the
| high-effort, high-consequence effects of bridging sync and
| async code. Defining colors down to "This function takes a
| file pointer, thus it is 'file pointer colored'" is
| profoundly missing the entire point. The idea of function
| color is useful precisely because it is limited to only
| certain high-cost conversions, not spread so thin as to
| cover literally every function parameter. Contexts aren't
| very special; it is literally easier to come up with one if
| you don't have one ("context.Background()") than it is to
| come up with an integer, in which you must actually pick
| one. It isn't anywhere near special enough to justify being
| called a "color" any more than a file pointer, or a
| database connection, or any of hundreds of other resource
| types, most of which impose more constraints on the code
| than contexts.
| the_gipsy wrote:
| You are the one trying to include "all things like file
| pointers" as colors, as a strawman. But nobody is trying
| to say that.
|
| Context is tacked on to goroutines to control async
| stuff, unlike e.g. filepointers or database-handles.
|
| Using context.Background() is fine and great for tests or
| e.g. some CLI program. But consider deeply nesting
| functions that all do context.Background().
|
| Go just smudges everything as gray and calls it a day.
|
| Edit: what I'm trying to say is that go has not
| _completely_ solved the colored function problem, which I
| think is what you're implying.
| Thaxll wrote:
| Goroutines being cheap is not the point; the point is whether
| libraries should expose a blocking or non-blocking API, who
| should create the goroutines, etc.
| memset wrote:
| Question about contexts in general: is there a way to "guarantee"
| that contexts are used correctly by whoever is consuming my
| context?
|
| For example, if I call http.NewRequestWithContext(), how do I, as
| the caller, know that http is doing the "right thing" with that
| value, rather than ignoring it?
|
| In the OP's example, intuitively it seems like an explicit Stop()
| function gives the caller explicit control of when to stop and
| that anyone implementing a Worker (if Worker were an interface)
| would know that the Stop() function should do cleanup.
|
| However, if I only pass in a context when calling Run(), wouldn't
| it be easy for someone to ignore a deadline?
| badrequest wrote:
| Most of the time you accept a context because some downstream
| function requires it (i.e. I'm writing an HTTP client, and the
| `net/http` std lib functions require context for some part of
| it). You can have general confidence that the standard library
| will respect things like context deadlines, even if the
| wrappers that invoke that don't necessarily.
| rollulus wrote:
| > most seasoned Go devs would leap out of their seats to tell you
| it's an anti-pattern for libraries to start their own goroutines.
| Best practices dictate...
|
| Then I'm afraid that either the author is not familiar with the
| standard library, or that it is not built according to the "best
| practices".
|
| Rant: the phrase "best practice" increasingly irritates me.
| Basically it has become a synonym for "my own opinion just trust
| me". It's like the Jedi hand gesture to force your beliefs onto
| someone to end a discussion.
| diggan wrote:
| > Basically it has become a synonym for
|
| It's been a thing for a long time! As far as I know, Feynman
| first put it into words (regarding science, but applies equally
| to software engineering) in 1974:
|
| > In the South Seas there is a cargo cult of people. During the
| war they saw airplanes land with lots of good materials, and
| they want the same thing to happen now. So they've arranged to
| imitate things like runways, to put fires along the sides of
| the runways, to make a wooden hut for a man to sit in, with two
| wooden pieces on his head like headphones and bars of bamboo
| sticking out like antennas--he's the controller--and they wait
| for the airplanes to land. They're doing everything right. The
| form is perfect. It looks exactly the way it looked before. But
| it doesn't work. No airplanes land. So I call these things
| cargo cult science, because they follow all the apparent
| precepts and forms of scientific investigation, but they're
| missing something essential, because the planes don't land.
|
| We programmers usually call it "cargo culting", blindly
| following "best practices" and "design patterns" without any
| deeper understanding of why and when those should be applied.
| j1elo wrote:
| To be honest for me when someone talks about "best practices" I
| expect it to be _actual best practices_ that usually get
| documented somewhere that is a documentation resource well
| regarded by the "community" of that tool or language.
|
| So if it happened (no idea) that the author here was taking
| this "best practice" out of their own ass... well, then yeah
| I'd agree that's just an opinion and not an actual, community-
| agreed-upon, very common and very well documented "best"
| practice in the sense that I usually regard as useful and
| reliable.
|
| Nevertheless, I agree with the author that the concept of
| "don't return immediately from your API function, instead block
| until the function's work has been completely done, and only
| then return", seems to me as a valid and quite good idea.
| Regardless of how many internal goroutines might have been used
| in order to comply with this behavior, the external surface
| should look like a blocking call. If any caller doesn't want to
| block their thread on it, _they_ in turn can always run it in a
| goroutine.
| assbuttbuttass wrote:
| > I expect it to be actual best practices that usually get
| documented somewhere that is a documentation resource well
| regarded by the "community" of that tool or language.
|
| It's documented in the Google Go style guide:
| https://google.github.io/styleguide/go/decisions#synchronous...
|
| Note that the advice is slightly different, not "don't use
| goroutines," but rather "any internal goroutines need to be
| cleaned up before the function returns"
| clktmr wrote:
| IIRC for the http package at least, the original author stated
| it was a mistake.
| italicmew123 wrote:
| Rule 3: Don't store contexts - What about this:
| https://github.com/golang/go/blob/master/src/net/http/client...
|     req = &Request{
|         Method:   redirectMethod,
|         Response: resp,
|         URL:      u,
|         Header:   make(Header),
|         Host:     host,
|         Cancel:   ireq.Cancel,
|         ctx:      ireq.ctx,
|     }
|
| Isn't this considered storing the context?
| maxwellg wrote:
| Yep - (my understanding is) the Go HTTP stdlib module predates
| the concept of context in Golang, so the implementation was
| bolted on to ensure backwards compatibility.
| NewRequestWithContext was only added in Go 1.13 [1].
| Previously, requests were cancelled manually with CancelRequest
| [2]. This is an unfortunate wart of the language - it means
| it's very easy to accidentally spin up a new Request which
| doesn't inherit the parent context by calling NewRequest
| instead. And adding the context via the builder pattern means
| it's possible to introduce the storage bugs described in the
| article. My preferred way to consume a context would be to take
| it in when the work is actually about to be performed - e.g.
| client.Do(ctx, request)
|
| [1] https://pkg.go.dev/net/http#NewRequestWithContext
| [2] https://pkg.go.dev/net/http#Transport.CancelRequest
| __turbobrew__ wrote:
| Yea I don't really agree with the rule that goroutines
| shouldn't be started in libraries either. For example, say you
| are building a library to send metrics to a metrics collector.
| For me it makes sense that your metrics library contains a
| buffer of metrics, batches the data together, and sends it to
| the metrics collector asynchronously. This would be implemented
| as having a library goroutine which batches and sends metrics
| data. I guess in theory your library could have a 'Flush'
| method and then if the application wants async flushing the
| application can start a goroutine which periodically calls
| flush. But then the application needs to know the ideal
| frequency to flush, how to handle failures to flush, how to
| backoff, etc. These things are probably better done by the
| library writer.
| Philip-J-Fry wrote:
| https://go.dev/blog/context-and-structs
|
| >Exception to the rule: preserving backwards compatibility
|
| See that section of the blog post. It talks about the different
| approaches they could have taken and why they chose the one they
| did.
| pianoben wrote:
| I don't really agree that it's an antipattern for a library to
| create a goroutine. If you consider that starting a worker is, in
| a sense, an _entry point_ , you can even claim to otherwise
| conform to The Rules.
|
| Why would I want to make my caller think about scheduling library
| internals? As long as I'm managing resources appropriately, and
| exposing knobs as necessary, what's the problem?
| matthewaveryusa wrote:
| What about storing the context for lazy-loading? Like a client
| that's initialized in main() with a ctx, but starts async workers
| on first-use deeper in and across go routines?
| fl0ki wrote:
| I find that combining context and errgroup, with due care, lets
| us approximate Structured Concurrency to great benefit. I just
| think more care is due than most people give.
|
| When a function returns, we should be able to trust that any
| goroutines it created have already terminated and will not have
| other side effects. This is important because Go doesn't enforce
| read/write thread-safety any other way, so we need it to be clear
| from the code when those reads/writes may happen.
|
| It's also important because side effects from those lingering
| routines could have other consequences, e.g. the retry for an IO
| operation could overlap with a past attempt, violating invariants
| in ways that would be really hard to reproduce and debug.
|
| This sounds really simple, why wouldn't you do it that way?
| Idiomatic use of errgroup encourages you to do it that way, but
| not everyone does it that way, and sadly not every project even
| uses errgroup in the first place. It's very common to see a
| routine observe cancellation and return immediately, even if it
| created its own goroutines which it can't guarantee have aborted
| yet.
|
| Aside, it's also sadly extremely common to see people reinvent
| errgroup badly with "error channels" that at best don't join the
| other routines and at worst deadlock them because they block on
| sending errors that nobody is receiving any more.
|
| That's why if you do this, you basically ban the `go` keyword and
| strictly use errgroup, even for routines which can't return
| errors. (WaitGroup can do this too, but it's harder to use right,
| because the Add/Done counts have to add up exactly and there's no
| enforcement that they do).
|
| If you do this right, then it shouldn't matter whether a function
| creates goroutines to help with its work, such as timeout
| channels or parallel processing or what have you. What matters is
| that the function still acts like a synchronous one from the
| outside.
|
| The worst I've seen is when people know they have to use
| errgroup, but they create one large errgroup in main and pass it
| around as a mutable argument to everything to add more tasks to
| it. They don't understand that when it's used correctly, it also
| nests and encapsulates entirely, so it's never the argument to or
| return from a function.
|
| Of course it gets more complicated if an object has long-running
| goroutines that outlive any particular function. Then you need to
| call more functions just to create those wait points. For
| example, it's not enough to cancel a database cursor as a
| context, you should still block on closing it, otherwise its own
| routines can still be running when you go back and start another
| operation. Again, sadly all too common.
|
| Caveat 1: errgroup only returns the first error, which for many
| routines is just "cancelled". That's not useful, and it takes the
| place of what could have been a real error. I suppress errors
| like that, so that actual cleanup errors, if any, are the ones
| surfaced.
|
| Caveat 2: errgroup doesn't trap panics, if you want the panic to
| be surfaced as a neatly packaged error, you have to install your
| own handler. Every project I have has its own simple version of
| this, and I've seen many other projects come to the same
| conclusion.
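|
| A minimal sketch of the "acts synchronous from the outside"
| property. It uses `sync.WaitGroup` from the standard library
| rather than errgroup (which lives in golang.org/x/sync), so it
| shows only the join-before-return part, not error propagation;
| `squareAll` is a hypothetical function:

```go
package main

import (
	"fmt"
	"sync"
)

// squareAll uses goroutines internally but joins all of them before
// returning, so from the outside it behaves like a synchronous call.
func squareAll(nums []int) []int {
	out := make([]int, len(nums))
	var wg sync.WaitGroup
	for i, n := range nums {
		wg.Add(1)
		go func(i, n int) {
			defer wg.Done()
			out[i] = n * n // each goroutine writes only its own slot
		}(i, n)
	}
	wg.Wait() // no goroutine outlives this function
	return out
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3}))
}
```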
| andrewstuart2 wrote:
| I'd argue here that it's not a problem of context storage, it's a
| problem of not ignoring cancellation in certain situations. Since
| context, for better or worse, has two purposes, you may still
| want a lot of the request-scoped data for later operations after
| the initial timeout/deadline is done. And since context has a
| standardized bag of data, it's more future-proof to just keep
| using the context and its data rather than, say, extracting the
| data you know about today (like trace/span ids) and storing that
| for later use.
|
| Fortunately, following appropriate patterns here got a lot easier
| with 1.21 and the addition of `context.WithoutCancel`. If you're
| going to store a context for later use, since there's potentially
| e.g. tracing data you still want to keep, make sure you
| appropriately `context.WithoutCancel` to keep the data without
| keeping the original deadline.
| SPBS wrote:
| It's not a hard rule that contexts should not be struct fields.
| See https://github.com/golang/go/issues/22602 (context: relax
| recommendation against putting Contexts in structs)
|
| "Right now the context package documentation says
|
| > Do not store Contexts inside a struct type; instead, pass a
| Context explicitly to each function that needs it. The Context
| should be the first parameter, typically named ctx: [...]
|
| This advice seems overly restrictive. @bradfitz wrote in that
| issue:
|
| > While we've told people not to add contexts to structs, I think
| that guidance is over-aggressive. The real advice is not to store
| contexts. They should be passed along like parameters. But if the
| struct is essentially just a parameter, it's okay. I think this
| concern can be addressed with package-level documentation and
| examples."
| neonsunset wrote:
| Maybe C# got me spoiled with CancellationToken which seems like a
| nicer API. And, perhaps, synchronization context as well if you
| are writing a GUI application and need a render thread, to make
| sure you yield to the right one. Though if that's not the
| preferred pattern, publishing a message to a channel on one end
| and then reading them from another is always an option.
| valcron1000 wrote:
| Is it correct to assume that the "cancellation" part of `Context`
| is similar to C#'s `CancellationToken`? Also, it looks like it
| allows passing some "implicit parameters" to another function. If
| that's the case, why does Go have a single entity performing two
| roles: cancellation + implicit parameter passing? I would expect
| to have these things separated.
| valcron1000 wrote:
| I also find it curious that a language with a preemptive
| scheduler requires manual "yield" points by constantly checking
| on the context if the current function should stop executing.
___________________________________________________________________
(page generated 2024-02-09 23:00 UTC)