Post AcjfiDRcAfUUZKLSSG by louis@emacs.ch
 (DIR) More posts by louis@emacs.ch
 (DIR) Post #AcjfiCRZtJrpSvJul6 by djrmarques@emacs.ch
       2023-12-12T13:47:13Z
       
       0 likes, 0 repeats
       
       Going crazy trying to solve a race condition on my go program. I am using a Mutex to try and guarantee that no two go routines are writing and reading from the same struct, but somehow the race detector keeps saying I still have a race condition. I probably am not using the mutex right
       
 (DIR) Post #AcjfiDRcAfUUZKLSSG by louis@emacs.ch
       2023-12-12T14:11:55Z
       
       0 likes, 0 repeats
       
       @djrmarques Show your code 🙂
       
 (DIR) Post #AcjfiEU8In6DnQWz1E by djrmarques@emacs.ch
       2023-12-12T13:50:58Z
       
       0 likes, 0 repeats
       
       Probably need to read more on how the race detector works
       
 (DIR) Post #AcjsEoy1OfhreyoiDw by djrmarques@emacs.ch
       2023-12-12T16:32:20Z
       
       0 likes, 0 repeats
       
       @louis https://pastebin.com/yiiWewKg. This is my coding for this https://pdos.csail.mit.edu/6.824/labs/lab-mr.html.Basically, this coordinator will have GetTask RPC calls from one for various workers. ON top of that, there is another part of the code calling the Done method every couple of seconds.
       
 (DIR) Post #Acjuaoz4DN5BzWvMaO by louis@emacs.ch
       2023-12-12T16:58:41Z
       
       0 likes, 0 repeats
       
       @djrmarques You fell into the popular Go trap, in that you essentially assign only the last of the newTask assignments into your map, so all tasks are effectively the same one task.That issue is so popular that the Go designers are planning changes to the language:https://github.com/golang/go/discussions/56010Just do a nt := newTask() inside the loop and omit the declaration in line 83.Does that solve your issue?
       
 (DIR) Post #AcjuzsYY4PEKJIT808 by djrmarques@emacs.ch
       2023-12-12T17:03:15Z
       
       0 likes, 0 repeats
       
       @louis Dammnnnnn. That could probably be it. I need to try it out later (not on my laptop now), but thanks A LOT! No way I would figure this one out any time soon :D
       
 (DIR) Post #Acjv5GHPIHyXKr19ea by djrmarques@emacs.ch
       2023-12-12T17:04:14Z
       
       0 likes, 0 repeats
       
       @louis But if this is truly the problem, then I have no idea why my code is working (the code works when I only have one Worker, so no concurrent access)
       
 (DIR) Post #AcjvAJG9dtx81bq4em by louis@emacs.ch
       2023-12-12T17:05:08Z
       
       0 likes, 0 repeats
       
       @djrmarques Little more explanation about the issue:You declare a variable in line 83, and in your for loop overwrite the content of this variable with a newTask(). Then you assign the address of nt into your map. However, that address is always the same address because the variable was declared outside of the loop.The result is, that all your tasks point to the same newTask() that was created in the last iteration of the loop. 🙂
       
 (DIR) Post #AcjvPGV6BNiJoOxb9M by louis@emacs.ch
       2023-12-12T17:07:47Z
       
       0 likes, 0 repeats
       
       @djrmarques I'm curious if that solved your problem. That issue is so hard to spot that basically every Go developer will run into it.The latest release of Go LSP even issues a warning for this as far as I know.Hope it helped!
       
 (DIR) Post #AcjwplQBwT4HC7remu by offset___cyan@emacs.ch
       2023-12-12T17:23:49Z
       
       0 likes, 0 repeats
       
       @louis @djrmarques probably the second "billion-dollar mistake" in programming language design
       
 (DIR) Post #AcjxJ9Cib6tJLxZckC by djrmarques@emacs.ch
       2023-12-12T17:29:06Z
       
       0 likes, 0 repeats
       
       @louis I will need to test this out later (or probably only tomorrow). But  one question, is my usage of the Mutex correct? Basically, every operation that I write that might change the Coordinator struct is wrapped on Lock and Unlock.  Not used to write concurrent programming.
       
 (DIR) Post #Acjz867XvAoWVs0sRU by louis@emacs.ch
       2023-12-12T17:49:31Z
       
       0 likes, 0 repeats
       
       @djrmarques From what I see there could be a few issues. In lines 102, 115, 129, 131, 144, 146, 193 you access fields of your Task directly from the coordinator. These accesses will not be protected by the Mutex. You should only access/update fields of the Task by using methods of Task which lock it.So instead of t.finished you should have a t.finished() method that locks the task like you do it with reset() start() etc.Does that make sense?
       
 (DIR) Post #AcjzPF7HjDDlKt4B7I by djrmarques@emacs.ch
       2023-12-12T17:52:40Z
       
       0 likes, 0 repeats
       
       @louis Total sense. Thanks!
       
 (DIR) Post #AckHZ60kEei7ANr5VY by djrmarques@emacs.ch
       2023-12-12T21:16:08Z
       
       0 likes, 0 repeats
       
       @louis So the tip you gave me ended up now solving the problem entirely (it was nonetheless super helpful). I ended up changing the location of the mutex, and declaring it at the top of the file as var mu sync.Mutex instead of a field in the struct. The race condition seems to have been fixed for now. Man the MIT guy was not joking, debugging concurrent code it is not trivial.