https://old.reddit.com/r/cpp/comments/1b55686/maybe_possible_bug_in_stdshared_mutex_on_windows/
Maybe possible bug in std::shared_mutex on Windows (self.cpp)
138 points (96% upvoted) · submitted 21 hours ago by forrestthewoods

A team at my company ran into a peculiar and unexpected behavior with std::shared_mutex. This behavior only occurs on Windows w/ MSVC. It does not occur with MinGW or on other platforms.

At this point the behavior is pretty well understood. The question isn't "how to work around this". The questions are:

1. Is this a bug in std::shared_mutex?
2. Is this a bug in the Windows SlimReaderWriter implementation?

I'm going to boldly claim "definitely yes" and "yes, or the SRW behavior needs to be documented". Your reaction is surely "it's never a bug, it's always user error". I appreciate that sentiment. Please hold that thought for just a minute and read on.

Here's the scenario:

1. Main thread acquires an exclusive lock
2. Main thread creates N child threads
3. Each child thread:
   1. Acquires a shared lock
   2. Yields until all children have acquired a shared lock
   3. Releases the shared lock
4. Main thread releases the exclusive lock

This works most of the time. However, 1 out of ~1000 times it "deadlocks". When it deadlocks, exactly 1 child successfully acquires a shared lock and all other children block forever in lock_shared(). This behavior can be observed with std::shared_mutex, std::shared_lock/std::unique_lock, or simply calling SRW functions directly.

If the single child that succeeds calls unlock_shared() then the other children will wake up. However, if we're waiting for all readers to acquire their shared lock then we will wait forever. Yes, we could achieve this behavior in other ways; that's not the question.

I made a StackOverflow post that has had some good discussion. The behavior has been confirmed. However, at this point we need a language lawyer, u/STL, or quite honestly Raymond Chen to declare whether this is "by design" or a bug.

Here is code that can be trivially compiled to repro the error.

    #include <atomic>
    #include <cstdint>
    #include <iostream>
    #include <memory>
    #include <shared_mutex>
    #include <thread>
    #include <vector>

    struct ThreadTestData {
        int32_t numThreads = 0;
        std::shared_mutex sharedMutex = {};
        std::atomic<int32_t> readCounter = 0;
    };

    int DoStuff(ThreadTestData* data) {
        // Acquire reader lock
        data->sharedMutex.lock_shared();

        // wait until all read threads have acquired their shared lock
        data->readCounter.fetch_add(1);
        while (data->readCounter.load() != data->numThreads) {
            std::this_thread::yield();
        }

        // Release reader lock
        data->sharedMutex.unlock_shared();

        return 0;
    }

    int main() {
        int count = 0;
        while (true) {
            ThreadTestData data = {};
            data.numThreads = 5;

            // Acquire write lock
            data.sharedMutex.lock();

            // Create N threads
            std::vector<std::unique_ptr<std::thread>> readerThreads;
            readerThreads.reserve(data.numThreads);
            for (int i = 0; i < data.numThreads; ++i) {
                readerThreads.emplace_back(std::make_unique<std::thread>(DoStuff, &data));
            }

            // Release write lock
            data.sharedMutex.unlock();

            // Wait for all readers to succeed
            for (auto& thread : readerThreads) {
                thread->join();
            }

            // Cleanup
            readerThreads.clear();

            // Spew so we can tell when it's deadlocked
            count += 1;
            std::cout << count << std::endl;
        }
        return 0;
    }

Personally, I don't think the function lock_shared() should ever be allowed to block forever when there is not an exclusive lock. That, to me, is a bug. One that only appears for std::shared_mutex in the SRW-based Windows MSVC implementation. Maybe it's allowed by the language spec? I'm not a language lawyer.

I'm also inclined to call the SRW behavior either a bug or something that should be documented. There's a 2017 Raymond Chen post that discusses EXACTLY this behavior. He implies it is user error. Therefore I'm inclined to boldly, and perhaps wrongly, call this an SRW bug.

What do y'all think?

Edit: Updated to explicitly set readCounter to 0. That is not the problem.
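The OP notes that the same hang reproduces when calling the SRW functions directly. A minimal sketch of what that raw-Win32 variant could look like (the scaffolding here is assumed, not the OP's exact reduction; the SRW calls themselves are the documented AcquireSRWLockExclusive/Shared family):

    #include <windows.h>

    #include <atomic>
    #include <cstdint>
    #include <thread>
    #include <vector>

    SRWLOCK gLock = SRWLOCK_INIT;
    std::atomic<int32_t> gReadCount{0};
    constexpr int32_t kNumThreads = 5;

    void Reader() {
        AcquireSRWLockShared(&gLock);  // on the bad interleaving, N-1 readers park here
        gReadCount.fetch_add(1);
        while (gReadCount.load() != kNumThreads)  // rendezvous *inside* the lock,
            std::this_thread::yield();            // mirroring the OP's repro
        ReleaseSRWLockShared(&gLock);
    }

    int main() {
        for (;;) {
            gReadCount.store(0);
            AcquireSRWLockExclusive(&gLock);   // writer owns the lock while spawning readers
            std::vector<std::thread> readers;
            for (int i = 0; i < kNumThreads; ++i) readers.emplace_back(Reader);
            ReleaseSRWLockExclusive(&gLock);   // the release around which the hang occurs
            for (auto& t : readers) t.join(); // hangs here roughly 1 in ~1000 iterations
        }
    }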
all 67 comments sorted by: best

[-] STL (MSVC STL Dev) · 118 points · 20 hours ago

As a reminder, triple backticks don't work for Old Reddit readers like me. You have to indent by four spaces.

According to my reading of your code, the Standard, and ~~MSDN~~ Microsoft Learn, this is a Windows API bug. I don't see any squirreliness in your code (e.g. re-entrant acquisition, attempts to upgrade shared to exclusive ownership), I see no assumptions about fairness/FIFO, N4971 [thread.sharedmutex.requirements.general]/2 says "The maximum number of execution agents which can share a shared lock on a single shared mutex type is unspecified, but is at least 10000.", and I see nothing in the SRWLOCK documentation that says that it can behave like this.

I don't think that this is an STL bug - we're justified in wanting to use SRWLOCK for this purpose (and we can't change that due to ABI anyways), and you've already reduced this to a direct Win32 repro anyways.

[-] forrestthewoods (OP) · 47 points · 19 hours ago

> this is a Windows API bug

Wow! Not every day that's the answer.

What's the best next step to report this issue? Is that something I should report somewhere? Is it something you want to report internally, since the STL depends on SRW?

[-] STL (MSVC STL Dev) · 111 points · 17 hours ago

It is extremely difficult for programmer-users to report bugs against the Windows API (we're supposed to direct you to Feedback Hub, but you may as well transmit your message into deep space). I've filed OS-49268777 "SRWLOCK can deadlock after an exclusive owner has released ownership and several reader threads are attempting to acquire shared ownership together" with a slightly reduced repro. Thanks for doing your homework and creating a self-contained repro, plus pre-emptively exonerating the STL.

I've filed this OS bug as a special favor - bug reports are usually off-topic for r/cpp. The microsoft/STL GitHub repo is the proper channel for reporting STL misbehavior; it would have been acceptable here, even though the root cause is in the Windows API, because this situation is so rare. If you see STL misbehavior but it's clearly due to a compiler bug, reporting compiler bugs directly to VS Developer Community is the proper thing to do.

[-] forrestthewoods (OP) · 27 points · 16 hours ago

Thank you!

Is there a way for me to follow OS-49268777 publicly? I tried searching for a bug tracker with IDs like that and couldn't find one. But I must admit I don't know my way around Microsoft's public tooling these days.

[-] STL (MSVC STL Dev) · 29 points · 16 hours ago

No, it's internal (unlike VS DevCom).
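For contrast with STL's checklist above, the "squirrelly" usages he mentions ruling out might be sketched like this (neither appears in the OP's repro; the recursive case is the one Microsoft's documentation warns about later in this thread):

    #include <windows.h>

    SRWLOCK lock = SRWLOCK_INIT;

    // (1) Re-entrant (recursive) shared acquisition on one thread - documented
    //     as unsupported; can deadlock when combined with exclusive acquisition:
    void RecursiveShared() {
        AcquireSRWLockShared(&lock);
        AcquireSRWLockShared(&lock);     // misuse
    }

    // (2) Attempting to upgrade shared -> exclusive ownership on the same thread -
    //     the exclusive acquire waits for all shared owners, including ourselves:
    void SharedToExclusiveUpgrade() {
        AcquireSRWLockShared(&lock);
        AcquireSRWLockExclusive(&lock);  // self-deadlock
    }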
[-] rbmm · 14 points · 13 hours ago

I fully researched this case - "SRWLOCK can deadlock after an exclusive owner has released ownership and several reader threads are attempting to acquire shared ownership together" - rbmm/SRW_ALT (github.com)

It is really not a deadlock: one thread can acquire the lock *exclusively* in this case, instead of shared. ReleaseSRWLockExclusive first removes the Lock bit and then, if waiters are present, walks the wait blocks to wake the waiters (RtlpWakeSRWLock). But these 2 operations are not atomic. In between, just after the Lock bit is removed, another thread can acquire the lock - and in this case the acquisition will in fact always be exclusive, even if the thread asked for shared access only. But because exclusive access includes shared access as well, I would not consider this exactly a bug.

The problem in the concrete code arises only because the thread *waits* inside the lock, which is wrong by the sense of locks. If the thread did not wait, but did its own task in the lock (synchronizing access to data), there would be no deadlock - just after this thread releases the lock, all the other threads can enter the lock as usual.

Example fix of the concrete code, which shows that there is no deadlock in fact, and the other readers do not hang forever:

    int DoStuff(ThreadTestData* data) {
        // Acquire reader lock
        data->sharedMutex.lock_shared();

        ULONG64 time = GetTickCount64() + 1000;
        // wait until all read threads have acquired their shared lock
        // but no more than 1000 ms !!
        data->readCounter.fetch_add(1);
        while (data->readCounter.load() != data->numThreads &&
               GetTickCount64() < time) {
            std::this_thread::yield();
        }

        // Release reader lock
        data->sharedMutex.unlock_shared();

        return 0;
    }

[-] hexane360 · 6 points · 4 hours ago

Getting an exclusive lock when you request a shared lock is indeed a bug. The C++17 standard states:

> In addition to the exclusive lock ownership mode specified in 33.4.3.2, shared mutex types provide a shared lock ownership mode. Multiple execution agents can simultaneously hold a shared lock ownership of a shared mutex type.

And shared_mutex::lock_shared has the following postcondition: "The calling thread has a shared lock on the mutex."

In other words, an exclusive lock is not a subtype of a shared lock, because a shared lock is required by the standard to allow shared ownership. A lock which doesn't allow shared ownership is not allowed by the standard.

[-] rbmm · 2 hours ago

I only researched the Windows implementation of SRW locks (the same as PushLock in kernel mode), which in some rare cases can give the caller exclusive access despite the caller requesting shared only. Of course shared != exclusive, but from my point of view this must not produce any problems if you follow the next (my, not the standard's) rule: a thread inside an SRW lock (C++ mutex) must not wait for other threads that are trying to enter the same lock, and must not try to acquire the lock recursively. Even in the case of shared locks.

[-] hexane360 · 2 hours ago

It's fine to have that as a personal rule, but the fact stands: Microsoft is not conforming to the C++ standard in this instance.

As an aside, your personal rule basically boils down to "don't assume shared locks are actually shared". This shouldn't be necessary IMO. Also, even with this rule, you'd still get a potential performance degradation when 1 thread is working on a problem that should be multithreaded.
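For reference, the interleaving rbmm describes above, sketched as a timeline (a simplified model of his description, not the actual Windows implementation; RtlpWakeSRWLock is the internal routine he names):

    // Writer thread                           Reader thread R1 (one of N arriving)
    //
    // ReleaseSRWLockExclusive(&lock):
    //   step 1: clear the Lock bit
    //                                          AcquireSRWLockShared(&lock) observes
    //                                          the briefly "free" lock word and takes
    //                                          the lock - per rbmm, this grant is
    //                                          exclusive in fact, even though shared
    //                                          access was requested.
    //   step 2: walk the wait blocks to wake
    //           waiters (RtlpWakeSRWLock) -
    //           but R1 now owns the lock, so
    //           readers R2..RN stay parked
    //           until R1 releases.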
[-] rbmm · 2 hours ago

> you'd still get a potential performance degradation when 1 thread is working on a problem that should be multithreaded.

No. In the absolute majority of cases, shared access will really be shared, so multiple threads can be inside the shared lock at once. But very rarely (when some thread releases from exclusive and another at that moment requests shared), shared can suddenly become exclusive.

A simpler implementation of an SRW lock is possible. It would probably be more strict on this point, but it would have less performance and be less fair. For fun, I created my own implementation of an SRW lock, for example. It looks like it works correctly in this case - I tested with exactly this code and all was OK here. But my implementation in fact uses LIFO order for the waiters, which the standard implementation tries to avoid.

[-] KingAggressive1498 · 3 hours ago*

It is for sure unusual to explicitly wait on other readers to enter the shared lock like in the OP, but that seems to be a "minimal reproduction case". It's not unusual in general to wait on other threads to reach a compatible point in logic (for example, a parallel fill of an array may wait for all threads to finish their part before continuing), and it's not unusual to nest synchronization for complex data structures (for example, updating a single element in a bucket array only requires shared access to the outer array and the individual bucket, but exclusive access to the individual element). My guess is OP discovered this issue in a more complex case using some combination of similar logic.

I do agree OP is probably playing with a hand grenade with this combination - any exclusive-preferring lock may deadlock this logic similarly if an unrelated thread asks for exclusive access before all the shared locking threads make it - but that doesn't make "exclusive shared locking" as we're seeing here valid.

[-] rbmm · 3 hours ago

> it's not unusual in general to wait on other threads to reach a compatible point in logic

But there is no need to wait *inside* the lock. We can exit from the lock and then wait. For example, WinAPI has SleepConditionVariableSRW, and surely C++ has something like this.

> "minimal reproduction case"... issue in a more complex case

Let OP describe the real application logic in this case; it would be interesting to check/research it or propose another solution. I would say the following: a thread inside an SRW lock (C++ mutex) must not wait for other threads that are trying to enter the same lock, and must not try to acquire the lock recursively. Even in the case of shared locks.

[-] KingAggressive1498 · 2 hours ago*

> But there is no need to wait inside the lock. We can exit from the lock and then wait.

I agree that's the logically simplest workaround and a better way anyway, but normally that kind of change is an optimization and not actually necessary. More importantly, in complex codebases the inner synchronization may be considerably removed logically from the outer synchronization. It may require a non-trivial overhaul to make it happen.

[-] rbmm · 2 hours ago

> in complex codebases...

Of course. I myself, in my own code, never wait inside locks. Anyway, here I only debugged, researched, and created a repro for this case, to prove that exactly exclusive access happens despite shared being requested.
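rbmm's "exit the lock, then wait" suggestion, applied to the OP's DoStuff as a sketch (this reuses ThreadTestData from the repro; whether the rendezvous belongs in the test at all is exactly what is being debated):

    int DoStuff(ThreadTestData* data) {
        {
            // Hold the shared lock only while touching the protected data.
            std::shared_lock lock(data->sharedMutex);
            // ... read the data guarded by sharedMutex here ...
        }
        // Rendezvous with the other readers *outside* the lock.
        data->readCounter.fetch_add(1);
        while (data->readCounter.load() != data->numThreads) {
            std::this_thread::yield();
        }
        return 0;
    }

With this shape, a reader that was silently granted exclusive ownership still releases it before anyone waits on it, so the hang disappears - at the cost of no longer demonstrating the bug.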
[-] tialaramex · 3 points · 4 hours ago

This at least makes me less anxious about exactly what's wrong with the SRWLock. I assume Microsoft's concurrency people will develop, test and release a patch for this across all platforms in the next few weeks.

[-] Tringi · 2 hours ago

> This at least makes me less anxious [...]

Same. I've been reviewing our most important code the whole Sunday and haven't yet found anything where it would affect correctness for us.

> I assume Microsoft's concurrency people will develop, test and release a patch for this across all platforms in the next few weeks.

That I wouldn't exactly count on.

[-] rbmm · 4 points · 8 hours ago

rbmm/SRW-2: shared to exclusive (github.com)

I also created a clear case for this situation: accurate reproduction on the first try, not after hundreds of cycles.

[-] ack_error · 0 points · 4 hours ago

Your linked readme mentions:

> and if the code was written correctly, such behavior would not cause any problems. we got MORE than we asked for. but so what? did we ask for shared access? we got it. then we must work with the data that this SRW protects and exit. and everything will be ok. no one will even notice anything.

I'm not sure this is fine for correctly written code. As SRW locks are defined, it should be OK to write code where two threads concurrently take a shared lock on the same SRW lock object and then one thread waits on the other. The behavior being seen makes this prone to deadlock. It reduces the guarantee to "multiple threads might be able to read concurrently", which is weaker than expected for general reader/writer locks.

[-] rbmm · 2 hours ago

I think that a thread which has shared access to a lock must not care whether some other thread(s) were inside the lock at the same time. All we need here is read-only access to the data: a guarantee that the data is not changed while we hold shared access. I would add the next rule: a thread inside an SRW lock must not wait for other threads that are trying to enter the same lock, and must not try to acquire the lock recursively. Even in the case of shared locks.

[-] ack_error · 2 hours ago

This isn't a requirement imposed by either the general definition of a reader/writer lock or the SRWLock implemented by Windows, however. It's fine to define best practices around the rules of locks, but neither the definition of a Win32 SRWLOCK nor std::shared_mutex allows for these additional rules that you've proposed.

[-] rbmm · 2 hours ago

Yes, but what can I personally do more than rbmm/SRW-2: shared to exclusive (github.com)? Also, this note exists in the docs:

> Shared mode SRW locks should not be acquired recursively as this can lead to deadlocks when combined with exclusive acquisition.

But I write neither the docs nor the implementation.
[-] ack_error · 1 hour ago

To be clear, I don't think you need to do anything; you've already contributed quite a bit by narrowing down a stable repro and the specific bug in the SRWLock implementation. But I do think it is a little questionable to suggest that code that does this pattern isn't correct -- it may not be the best pattern, but it's supposed to work.

As for recursive locks, that doesn't apply here. Recursive locking refers to a situation where the same thread takes a second shared lock when it already has a shared lock on the same object. That definitely isn't supported by SRWLocks and is a usage error.

[-] Tringi · 2 hours ago

A question: If the lock is acquired exclusive, instead of shared, but then released by ReleaseSRWLockShared, isn't there a danger of the SRWLOCK bits getting into an unexpected state?

[-] rbmm · 2 hours ago

The lock is really acquired with an AcquireSRWLockShared call and released with a ReleaseSRWLockShared call, so all is correct here; no errors. I mean that an AcquireSRWLockShared call can sometimes (very rarely) in fact give the caller exclusive access. But in my opinion, if we do not wait in the lock for other threads that are trying to enter this lock (and better, do not wait in the lock at all) and do not try to recursively acquire the same lock, this must not lead to any problems.

[-] rbmm · 2 hours ago

ReleaseSRWLockShared correctly handles the release of the lock even if AcquireSRWLockShared granted exclusive access.

[-] Tringi · 37 minutes ago

Thanks! Yes, that's what I was asking, because I don't know the implementation details of SRWLOCK.

[-] rbmm · 25 minutes ago

https://github.com/mic101/windows/blob/master/WRK-v1.2/base/ntos/ex/pushlock.c

This is the code for PushLock in the kernel, but SRW locks in user mode have almost exactly the same implementation.

[-] Tringi · 23 points · 15 hours ago

I have deleted my other comment because of constant editing, but after a couple of attempts I can reproduce it on my trusted old LTSC 2016. I can reproduce it as far back as Server 2008 (Vista)!

As a software vendor that does a lot of pretty parallel stuff with a lot of synchronization, I have to say: This is pretty concerning!

[-] CodeMonkeyMark · 2 points · 4 hours ago

> a lot of pretty parallel stuff

Pfft, only 120 processes across 272 logical CPUs? Those are rookie numbers! I run that many instances of MS Excel every day!

[-] Tringi · 3 hours ago

Hah :-D

That's a single process btw ...not that trivial to schedule it for max performance on that machine with Windows and everything. But a lot of fun nonetheless.

[-] ack_error · 1 point · 4 hours ago

Yeah, I was thinking of switching some old critsec-based code over to SRW locks, but this looks like potentially another PulseEvent() class situation.
If it does get confirmed as an OS bug, it'd probably only be fixable back to Windows 10 at best, and both the MS STL and programs directly using SRW locks would have to work around it.

[-] Tringi · 3 hours ago

Seems basically that a shared/read acquire can, in some cases, acquire exclusively/write. If you are locking some small independent routine, as the vast majority of code does, then it's no problem, perhaps just some unexpected loss of performance. But if you are doing a little more complex synchronization, well, it's what we see here.

Regarding good old critical sections: Those are always exclusive, so there would be no problem. But I wouldn't hurry with replacing them. At some build of Win10 they were rewritten in terms of WaitOnAddress and are now way faster than they used to be. Not as fast as SRW locks (about 4.5x slower), but still very good considering they offer reentrancy while SRW locks don't.

As for backporting changes, I'm very curious whether 2015 LTSB and 2016 LTSB get any potential fixes.

[-] Top_Satisfaction6517 (Bulat) · 3 points · 12 hours ago*

> we're justified in wanting to use SRWLOCK for this purpose

According to OP, this bug doesn't manifest with MinGW, so they probably found another way to implement mutexes (maybe less efficient).

[-] STL (MSVC STL Dev) · 14 points · 11 hours ago

Yeah, slower for users and more work for us are two bad tastes that taste bad together.

[-] KingAggressive1498 · 1 hour ago*

Much less efficient. libstdc++ on Windows has a dependency on a library that emulates pthreads on Windows. IIRC, locking a pthread_rwlock_t as a read lock requires locking as exclusive, then locking a "shared mutex entry guard" mutex, updating some internal data, then unlocking both. And those mutexes are both the emulated pthread mutexes, which use a single-checked atomic fast path backed by an event handle for necessary waits; a reasonable enough pre-Win8 implementation, honestly.

[-] kndb · -5 points · 11 hours ago

So what are you (STL) guys planning to do about it? This is clearly properly written C++ code that causes a deadlock/bug on Windows.

[-] rbmm · -1 points · 7 hours ago

I would not say that this is good code (C++ or not) for a real product. The usage of the lock is wrong by design. We must not wait in a lock. A lock needs to be used only to synchronize access to data: enter the lock, access the data, and release the lock. If the worker threads did exactly this, the deadlock would be gone. The thread which spins in the lock really (can) hold it exclusively, and it waits on the other workers until they enter the lock. Again - this is wrong. What is this for? A worker must not care about the state of the other worker threads, only about the state of the data. So 1 thread waits for the N-1 workers to enter the lock, but these N-1 workers wait for this 1 thread to release the lock. As a result: deadlock.
If the code were as follows, there would be no deadlock and no problem. The "wait until all read threads have acquired their shared lock" is really the source of the problem:

    int DoStuff(ThreadTestData* data) {
        data->sharedMutex.lock_shared();
        DoSomeStuffLocked();  // access the protected data, then leave
        data->sharedMutex.unlock_shared();
        return 0;
    }

[+] NilacTheGrim · -29 points · 16 hours ago (comment score below threshold)

The bug is in the OP's use of uninitialized atomics. Before C++20, a default-constructed std::atomic is completely uninitialized. Likely each run he gets a 0 randomly or whatever... until he doesn't.

[-] jaerie · 19 points · 13 hours ago

This is a very well-researched post. If you have a hunch, at least be explicit that you didn't do anything to verify it; don't just state it as fact.

[-] saddung · 19 points · 19 hours ago*

I ran your code; it locked up within 30-150 loops for me. Also tried replacing shared_mutex with my own wrapper around SRWLock: same issue.

It does appear to be a bug with SRWLock to me. I don't think I've encountered it because it requires all the readers to coordinate and deliberately hold the lock at the same time.

[-] umop_aplsdn · 14 points · 18 hours ago

Properly formatted code:

    #include <atomic>
    #include <cstdint>
    #include <iostream>
    #include <memory>
    #include <shared_mutex>
    #include <thread>
    #include <vector>

    struct ThreadTestData {
        int32_t numThreads = 0;
        std::shared_mutex sharedMutex = {};
        std::atomic<int32_t> readCounter;
    };

    int DoStuff(ThreadTestData* data) {
        // Acquire reader lock
        data->sharedMutex.lock_shared();

        // wait until all read threads have acquired their shared lock
        data->readCounter.fetch_add(1);
        while (data->readCounter.load() != data->numThreads) {
            std::this_thread::yield();
        }

        // Release reader lock
        data->sharedMutex.unlock_shared();

        return 0;
    }

    int main() {
        int count = 0;
        while (true) {
            ThreadTestData data = {};
            data.numThreads = 5;

            // Acquire write lock
            data.sharedMutex.lock();

            // Create N threads
            std::vector<std::unique_ptr<std::thread>> readerThreads;
            readerThreads.reserve(data.numThreads);
            for (int i = 0; i < data.numThreads; ++i) {
                readerThreads.emplace_back(std::make_unique<std::thread>(DoStuff, &data));
            }

            // Release write lock
            data.sharedMutex.unlock();

            // Wait for all readers to succeed
            for (auto& thread : readerThreads) {
                thread->join();
            }

            // Cleanup
            readerThreads.clear();

            // Spew so we can tell when it's deadlocked
            count += 1;
            std::cout << count << std::endl;
        }
        return 0;
    }

[-] CandyCrisis · 13 points · 20 hours ago

Summoning u/STL

[-] farmdve · 3 hours ago

This might get buried in the comments, but many moons ago, in 2014 or 2016 perhaps, I was an avid player of Battlefield 4. On certain occasions, with specific MSVC redist versions, the game had a serious deadlock whereby the process could not be killed in any way; the RAM usage was there but the process was deadlocked. No tool could kill it, even those that used kernel-mode drivers. Only a reboot fixed it, until I started the game again. Starting the game again worked, I think, but exiting also led to the same issue, so then you had two copies of battlefield4.exe consuming RAM but not exiting.

Downgrading the MSVC redist absolutely solved the problem at the time. Could be the same bug?
[-] KingAggressive1498 · 3 hours ago

This was reproduced with direct use of SRWLock. The problem is that under the right conditions, a request for a shared lock may be silently upgraded to an exclusive lock, which is why every time this deadlocks it has a single shared locker make it through.

The SRWLock implementation is in kernel32.dll IIRC, which is not part of the MSVC redistributable.

[-] STL (MSVC STL Dev) · 12 minutes ago

Not possible given the timeline. We implemented C++17 shared_mutex in VS 2015 Update 2, which was released 2016-03-30.

[-] EverydayTomasz · 5 points · 19 hours ago

Depending on the OS thread scheduler, your data.sharedMutex.unlock() could be called before your threads execute lock_shared(). So some child threads will block on lock_shared() and some won't. Is this what you are trying to do?

But I think the issue might be with the atomic readCounter not being initialized to 0. Technically, if you don't init readCounter, you will have an undefined value, and some of your threads will get stuck looping on yield(). Section 29.6.5 [atomics.types.operations.req] p4 of N4140 (the final draft for C++14) says:

> A::A() noexcept = default;
> Effects: leaves the atomic object in an uninitialized state. [ Note: These semantics ensure compatibility with C. -- end note ]

[-] STL (MSVC STL Dev) · 14 points · 17 hours ago*

The initialization rules are a headache to think about and I'm certainly not going to think that hard on a weekend, ~~but I believe because it's an aggregate and their top-level initialization is ThreadTestData data = {};, they actually do get the atomic zeroed out here.~~

You've certainly got a good point in general about avoiding garbage-init. I ruled it out as the source of the problem here - this still repros even when the loop begins with an explicit .store(0).

[+] NilacTheGrim · -8 points · 16 hours ago* (comment score below threshold)

Aggregates just call the default c'tor for members (if the members are classes that have a default c'tor)... they never go ahead and avoid the default c'tor! That would be evil. And in this case the std::atomic default c'tor will be called... and on anything before C++20, that leaves the atomic holding uninitialized data.

The bug is definitely due to that... or I may say, I suspect with 98% certainty that's what it is. OP likely tested the same code (with the uninitialized atomic) with the lower-level SRW implementation, concluding that's the problem... when it's really his own code and his misuse of atomics.

[-] STL (MSVC STL Dev) · 14 points · 16 hours ago

You're half-right, thanks for the correction. I forgot that atomic itself is a class (what am I, some kind of STL maintainer?). However, the bug is not related to this. It still repros if you initialize the atomic to 0, or explicitly .store(0) before doing any work.

[-] NilacTheGrim · 0 points · 15 hours ago

Ah ok... well, thanks for the follow-up -- definitely quite an onion then.
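The initialization rule this subthread is circling, in two lines (P0883, adopted for C++20, changed std::atomic's default constructor to value-initialize; brace-initialization is the portable spelling in older standards):

    #include <atomic>
    #include <cstdint>

    std::atomic<int32_t> a;     // before C++20: holds an indeterminate value
    std::atomic<int32_t> b{0};  // zero in every standard - what the OP's edit does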
[-] KingAggressive1498 · 0 points · 18 hours ago

The default constructor performs value initialization starting in C++20.

[-] EverydayTomasz · 0 points · 18 hours ago

There is no difference between an atomic int32_t and a regular int32_t:

    std::atomic<int32_t> value1;     // no value
    std::atomic<int32_t> value2{0};  // set to 0
    int32_t value3;                  // no value
    int32_t value4 = 0;              // set to 0

value1 and value3 will both be uninitialized. Here is the discussion on this.

[+] NilacTheGrim · -11 points · 16 hours ago (comment score below threshold)

Yes, but I bet you $5 OP compiled with not-C++20. His code is bad. The STL on MSVC is fine.

[-] forrestthewoods (OP) · 7 points · 15 hours ago

That's not the issue. Setting readCounter to 0, either explicitly or in the struct declaration, does not change the behavior.

> STL on MSVC is fine.

I recommend you read the other comments on this post. The currently top comment in particular.

Cash App is my preferred method to receive your five dollars. DM me to coordinate. :)

[+] NilacTheGrim · -6 points · 15 hours ago (comment score below threshold)

Still, tho, fix your example to initialize atomics properly. Lead by example. Kiddies may be watching that are poor and don't have C++20.

[-] WasserHase · 1 point · 18 hours ago

I know that STL has confirmed your bug, but is there not a problem with your algorithm in 3.2:

> Yields until all children have acquired a shared lock

Does this not assume that the yielding threads won't starve out the other threads, which haven't acquired the lock yet? I don't think there is such a guarantee in the standard. this_thread::yield() is only a hint, which the implementation is allowed to ignore, and a few threads can get stuck in this loop:

    while (data->readCounter.load() != data->numThreads) {
        std::this_thread::yield();
    }

And not allow any other threads to progress to actually increment the counter. Or am I wrong?

[-] forrestthewoods (OP) · 12 points · 18 hours ago

Adding a sleep doesn't change the behavior. This isn't an issue of thread starvation or priority inversion.

[-] vlovich · 1 point · 5 hours ago

Yield will typically call the OS yield primitive, which asks the scheduler to yield (which it may or may not honor). Regardless, the OS will at some point schedule all the threads (all non-embedded OSes these days are a preemptible design AFAIK), which wouldn't explain the observation that only 1 thread got a shared lock.

[-] rbmm · 1 point · 6 hours ago

data->sharedMutex.lock_shared(); in the concrete case is sometimes executed as data->sharedMutex.lock(); - without shared. As a result, the other N-1 DoStuff threads wait for the .unlock() call (unlock_shared() does the same thing). That describes exactly what happens. If the thread did NOT "wait until all read threads have acquired their shared lock", nothing would deadlock; if it broke the spin and released the lock, everything would continue executing as expected. I only cannot understand: is this "wait until all read threads have acquired their shared lock" for test only, or are you trying to implement some product logic here?
[+] Chemical-Ice-9835 · -6 points · 12 hours ago (comment score below threshold)

Only Raymond Chen can analyze this correctly.

[-] CheapShotsBot · -2 points · 14 hours ago

Interesting. It's almost like the scheduler isn't giving a time slice to those threads so they can wake up from the lock.

[-] D2OQZG8l5BI1S06 · -2 points · 7 hours ago

I'm not sure you're allowed to hold the lock while creating threads.

[+] NilacTheGrim · -25 points · 16 hours ago* (comment score below threshold)

Your bug is in your use of std::atomic. Before C++20, atomics are not initialized to anything with their default constructor... unless you explicitly initialize them. Compile with C++20 and a standards-compliant compiler should initialize to 0. Or... you know... initialize to 0 yourself!

Even if you think you will always be in C++20-land -- don't ever leave atomics uninitialized. If you must, please static_assert C++20 using the relevant constexpr expressions. See the C++ docs.

Relevant docs: https://en.cppreference.com/w/cpp/atomic/atomic/atomic

[-] F54280 · 25 points · 14 hours ago

So you obviously made that change and the program worked flawlessly, right?

Did you do that before or after reading the top answer by STL, posted 4 hours before yours, which confirms this as a Windows bug and that setting the value to zero doesn't change anything?

[+] Top_Satisfaction6517 (Bulat) · -14 points · 12 hours ago (comment score below threshold)

When you see an obvious novice's flaw in a noname's code, do you still read through 100+ comments in the hope that the problem isn't in the flaw, but in a Windows bug undiscovered for 15 years?