[HN Gopher] Data Race Patterns in Go
___________________________________________________________________
Data Race Patterns in Go
Author : statenjason
Score : 67 points
Date : 2022-06-10 19:35 UTC (3 hours ago)
(HTM) web link (eng.uber.com)
(TXT) w3m dump (eng.uber.com)
| thallavajhula wrote:
| >We developed a system to detect data races at Uber using a
| dynamic data race detection technique.
|
| By system do you mean a process or a tool that detects these?
| JamesSwift wrote:
| Really good article, and gave me several ideas to track down some
| gremlins that have been bugging me in a go codebase recently.
| bumper_crop wrote:
| Sorry to say, but these hit close to home for me. A lot of the
| synchronization paradigms in Go are easy to misuse, but lead the
| author into thinking it's okay. the WaitGroup one is particularly
| poignant for me, since the race detector doesn't catch it.
|
| I'll add one other data race goof: atomic.Value. Look at the
| implementation. Unlike pretty much every other language I've
| seen, atomic.Value isn't really atomic, since the concrete type
| can't ever change after being set. This stems from fact that
| interfaces are two words rather than one, and they can't be
| (hardware) atomically set. To fix it, Go just documents "hey,
| don't do that", and then panics if you do.
| Groxx wrote:
| The lack of generics has forced all Go concurrency to be
| intrusive (i.e. implemented by the person using _literally any
| concurrency_ ), and yeah. It's horrifyingly error-prone in my
| experience. It means everyone needs to be an expert, and lol,
| everyone is not an expert.
|
| Generics might save us. Expect to see `Locker<T>` and
| `Atomic<T>` types cropping up. Unbounded buffered thread-safe
| queues backing channels. Etc. I'm very, very much looking
| forward to it.
| morelisp wrote:
| > atomic.Value isn't really atomic, since the concrete type
| can't ever change after being set.
|
| How does this mean it's _non-atomic_? As far as I know you can
| still never Load() a partial Store(). (Also, even if it was
| possible, this would never be a good idea...)
| bumper_crop wrote:
| That's why I opened with "Look at the implementation". Go is
| unable to store the type and the pointer at the same time, so
| it warps what "atomic" means. Pretty much every other
| language has atomic mean "one of these will win, one will
| lose". Go says "one will win, one will panic and destroy the
| goroutine.
|
| In fact, it's even worse than that. If the Store() caller
| goes to sleep between setting the type and storing the
| pointer, it causes every Goroutine that calls Load() to
| block. They can't make forward progress if the store caller
| hangs.
| jchw wrote:
| This is pretty cool. 50 million lines of code is quite a large
| corpus to work off of.
|
| I'm surprised by some of them. For example, go vet nominally
| catches misuses of mutexes, so it's surprising that even a few of
| those slipped through. I wonder if those situations are a bit
| more complicated than the example.
|
| Obviously, the ideal outcome is that static analysis can help
| eliminate as many issues as possible, by restricting the
| language, discouraging bad patterns, or giving the programmer
| more tools to detect bugs. gVisor, for example, has a really
| interesting tool called checklocks:
|
| https://github.com/google/gvisor/tree/master/tools/checklock...
|
| While it definitely has some caveats, ideas like these should
| help Go programs achieve a greater degree of robustness.
| Obviously, this class of error would be effectively prevented by
| borrow checking, but I suppose if you want programming language
| tradeoffs more tilted towards robustness, Rust already has a lot
| of that covered.
| likeabbas wrote:
| > I suppose if you want programming language tradeoffs more
| tilted towards robustness, Rust already has a lot of that
| covered
|
| Does anyone _not_ want robustness of their language to cover
| their mistakes?
| metadat wrote:
| > 2. Slices are confusing types that create subtle and hard-to-
| diagnose data races
|
| The "Slices" example is just nasty! Like, this is just damning
| for Go's promise of _" _relatively_ easy and carefree
| concurrency"_.
|
| Think about it for a second or two,
|
| >> The reference to the slice was resized in the middle of an
| append operation from another async routine.
|
| What exactly happens in these cases? How can I trust myself, as a
| fallible human being, to reason about such cases when I'm trying
| to efficiently roll up a list of results. :-/
|
| Compared to every other remotely mainstream language, perhaps
| even C++, these are extremely subtle and sharp.. nigh, _razor
| sharp_ edges. Yuck.
|
| One big takeaway is this harsh realization: Golang guarantees are
| scant more than what is offered by pure, relatively naive and
| unadulterated BASH shell programming. I still will use it, but
| with newfound fear.
|
| As a multi-hundred-kloc-authoring-gopher: I love Go, and this
| article is _killing me_ inside. Go appears extremely sloppy at
| the edges of the envelope and language boundaries, moreso than
| even I had ever realized prior to now.
|
| Full-disclosure: I am disgusted by the company that is Uber, but
| I'm grateful to the talented folks who've cast a light on this
| cesspool region of Golang. Thank you!
|
| p.s. inane aside: I never would've guessed that in 2022, Java
| would start looking more and more appealing in new ways. Until
| now I've been more or less "all-in" on Go for years.
| masklinn wrote:
| > The "Slices" example is just nasty!
|
| I've got to say I'm not _entirely_ clear on what they talk
| about specifically.
|
| Is it simply that the `results` inside the goroutine will be
| desync'd from `myResults` (and so the call to myAppend will
| interact oddly with additional manipulations of results), or is
| it that the copy can be made mid-update, and `result` itself
| could be incoherent?
| symfoniq wrote:
| I have the same question.
|
| They talk about the "meta fields" of a slice. Is the problem
| that these "meta fields" (e.g. slice length and capacity) are
| passed by value, and that by copying them, they can get out
| of sync between coroutines?
| aaronbee wrote:
| I believe they made a mistake with that example. It doesn't
| look unsafe to me because the myResults sliced passed to the
| goroutine is not used. Or perhaps the racy part was left out
| of their snippet.
|
| Below is what might be what they have meant. This code
| snippet is racy because an unsafe read of myResults is done
| to pass it to the goroutine and then that version of
| myResults is passed to safeAppend: func
| ProcessAll(uuids []string) { var myResults []string
| var mutex sync.Mutex safeAppend := func(results
| []string, res string) { mutex.Lock()
| myResults = append(myResults, res) mutex.Unlock()
| } for _, uuid := range uuids { go
| func(id string, results []string) { res :=
| Foo(id) safeAppend(myResults, id)
| }(uuid, myResults) # <<< unsafe read of myResults }
| }
|
| EDIT: Formatting and clarity
| [deleted]
| Philip-J-Fry wrote:
| So a slice consists of a pointer to a backing array, a length
| and a capacity. If you don't use a pointer and pass this
| slice around you will _copy_ it.
|
| This is problematic because even though you copy it, you're
| still pointing at the same backing array.
|
| Therefore, a backing array with data like [1,2,3,4,5] could
| be pointed at by 2 slice headers (slice metadata) looking
| like
|
| A: {len: 2, cap: 10} [1,2] B: {len:5, cap: 10} [1,2,3,4,5]
|
| So any append operations on slice A will mess up the data in
| that backing array.
|
| Now, sometimes your append will resize the slice, in which
| case the data is copied and a slice with a new larger backing
| array is returned. If this was happening concurrently then
| you'd lose the data in racing appends.
|
| If the append doesn't need to resize the slice, then you'll
| overwrite the data in the backing array. And so you'll
| corrupt the data in the slice.
|
| Here's an example I threw together:
| https://go.dev/play/p/qRUKUwIf3vx
|
| Although the code in the post doesn't actually look like it
| has an issue. Their tooling just flagged it up as it
| _potentially_ has an issue if the copy was actually used in
| the function. But the `safeAppend` function targets the
| correct slice each time.
| morelisp wrote:
| Without disagreeing that it's an enormous footgun, one good way
| to avoid such slice issues is to use the uncommon `a[x:y:z]`
| form to ensure the slice can't grow. As we're starting to write
| a lot of generic slice functions with 1.18, we're using this
| form in almost all of them which may add elements.
| masklinn wrote:
| > one good way to avoid such slice issues is to use the
| uncommon `a[x:y:z]` form to ensure the slice can't grow.
|
| Do you mean you always use `a[x:y:y]` in order to ensure
| there is no extra capacity and any append will have to copy
| the slice?
|
| Is append _guaranteed_ to create a new slice (and copy over
| the data) if the parameter is at capacity? Because if it
| _could_ realloc internally then I don 't think this trick is
| safe.
| morelisp wrote:
| > Because if it could realloc internally then I don't think
| this trick is safe.
|
| Slices are 3 word values of (ptr, len, cap). They cannot be
| "realloced internally", changing any of those three things
| requires creating a new slice.
| masklinn wrote:
| Of course but the new slice could be (ptr, len+1, cap+x)
| because realloc() was able to expand the buffer in-place.
| Which yields essentially the same behaviour as an append
| call with leftover capacity.
|
| But I guess realloc is a libc function, and Go probably
| goes for mmap directly and would implement its own
| allocator, and so might not do that. Unless / until they
| decide to add support for it.
| throwaway894345 wrote:
| > What exactly happens in these cases? How can I trust myself,
| as a fallible human being, to reason about such cases when I'm
| trying to efficiently roll up a list of results. :-/
|
| For me: minimize shared mutable data. If I really can't get rid
| of some shared mutable data, I mutex it or use atomics or
| similar. This works _very well_ --I almost never run into data
| races this way, but it is a discipline rather than a technical
| control, so you might have to deal with coworkers who lack this
| particular discipline.
| metadat wrote:
| Absolutely, the disappointing part is that as code authors,
| we need to constantly remember about various (otherwise
| appealing and even encouraged by the syntax,) footguns and
| "never approach suxh areas" of (totally valid) syntax.
|
| Reminds me of programming in Javascript (it's extreme
| example, but the similarity is there).
| throwaway894345 wrote:
| Yeah, it's a bit disappointing. It doesn't bother me too
| much, but it could be improved by a linter which could help
| you find shared mutable state. Without a concept of "const"
| (for complex types, anyway), I'm not sure how feasible such
| a linter would be.
| jchw wrote:
| Slices may just be one of the best and worst parts of Go.
| They're cumbersome, their behavior sometimes feels
| 'inexplicable,' and even as an experienced developer you are
| likely to eventually fallen into one of the traps where your
| 'obvious' code isn't so obvious.
|
| That said... when programming in programming languages
| _without_ a slice type, I always want to have one. And though
| it 's confusing at times, the design does actually make sense;
| without a doubt, it's hard to think of how you would improve on
| the actual underlying design.
|
| I really wish that Go's container types were persistent
| immutable or some-such. It wouldn't solve _everything_ , but it
| feels to me like if they could've managed to do that, it
| would've been a lot easier to reason about.
| knorker wrote:
| Indeed. Probably the most common data structure ever used,
| list of stuff, Go managed to make subtle and full of
| surprises. A knife without a handle.
|
| This makes the stated reason for the delay of generics hard
| to understand. They didn't wait to get
| list/vector/array/slice right.
| masklinn wrote:
| > And though it's confusing at times, the design does
| actually make sense; without a doubt, it's hard to think of
| how you would improve on the actual underlying design.
|
| Go slices are absolutely the worst type in Go, because out of
| laziness they serve as both slices and vectors rather than
| have a separate, independent, and opaque vector types.
|
| This schizophrenia is the source of most if not all their
| traps and issues.
|
| > I really wish that Go's container types were persistent
| immutable or some-such.
|
| That would go against everything Go holds dear, since it's
| allergic to immutability and provides no support whatsoever
| for it (aside from simple information hiding).
| travisd wrote:
| Worth noting that some of these can be detected statically -- and
| some are detected by go vet (e.g., passing a sync.Mutex by
| value). I don't think it detects the wg.Add bug, but that seems
| relatively straightforward(+) to add a check for.
|
| (+famous last words, I know)
| [deleted]
| dubswithus wrote:
| > We developed a system to detect data races at Uber using a
| dynamic data race detection technique. This system, over a period
| of six months, detected about 2,000 data races in our Go code
| base, of which our developers already fixed ~1,100 data races.
|
| This isn't open source, correct?
___________________________________________________________________
(page generated 2022-06-10 23:00 UTC)