Subj : lwp_mutex_lock hang?
To   : comp.programming.threads
From : Johan Piculell
Date : Tue Mar 22 2005 08:58 am

Hi.

I have a strange lockup problem in our application that I just cannot understand. We have only seen it at one customer and have not been able to reproduce it so far. It seems to occur several times per week, however, and each time several threads hang on a shared memory mutex:

----------------- lwp# 6 / thread# 6 --------------------
ff165acc lwp_mutex_lock (f9800000)
ff160cb8 mutex_lock_kernel (f9800000, 0, ff17a04c, ff178000, 0, f9800000) + cc
ff161b48 mutex_lock_internal (ff070a00, 0, 0, ff178000, 1e2a8, fe797654) + 268
001d84fc __1cMSolarisMutexElock6M_i_ (13cdd14, 1402d74, 3a03fc, 1d84ec, 1402a78, 2256a10) + 10
0010e198 __1cPSharedTableImplElock6M_i_ (3680518, 1402d74, a0, ffffffff, 30, 13cdc85) + 34
----------------- lwp# 7 / thread# 7 --------------------
ff165acc lwp_mutex_lock (f9800000)
ff160cb8 mutex_lock_kernel (f9800000, 0, ff17a04c, ff178000, 0, f9800000) + cc
ff161b48 mutex_lock_internal (ff070c00, 0, 0, ff178000, 1e2a8, fe797654) + 268
001d84fc __1cMSolarisMutexElock6M_i_ (13cdd14, 1403274, 3a03fc, 1d84ec, 1402a78, 2256a10) + 10
0010e198 __1cPSharedTableImplElock6M_i_ (3680518, 1403274, a0, ffffffff, 30, 47c7105) + 34
----------------- lwp# 8 / thread# 8 --------------------
ff165acc lwp_mutex_lock (f9800000)
ff160cb8 mutex_lock_kernel (f9800000, 0, ff17a04c, ff178000, 0, f9800000) + cc
ff161b48 mutex_lock_internal (ff070e00, 0, 0, ff178000, 1e2a8, fe797654) + 268
001d84fc __1cMSolarisMutexElock6M_i_ (13cdd14, 1403738, 3a03fc, 1d84ec, 1402a78, 2256a10) + 10
0010e198 __1cPSharedTableImplElock6M_i_ (3680518, 1403738, a0, ffffffff, 30, 5ea0535) + 34

The reason for having a shared mutex is that we can parallelize using threads and/or processes, but I cannot see any reference to f9800000 in any other process, and none of them is even in lwp_mutex_lock.

The hardware they are running on is a Sun Fire 6800 with Solaris 8.
We are linking with the alternate thread library (/usr/lib/lwp). I am not sure about the patch status, but that can easily be checked.

Even if we have messed something up in our code (which usually is the case :-( ), is it really possible to have 3 threads in lwp_mutex_lock() on the same mutex?

/Johan
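
For reference, here is a minimal sketch of the kind of setup described above: a mutex placed in shared memory so that both threads and processes can lock it, plus a timed lock that returns instead of blocking forever. It is an illustration only, not our actual SolarisMutex code. It assumes the POSIX pthread calls rather than the native Solaris mutex API, it assumes anonymous shared memory (visible only to forked children), and the helper names shared_mutex_create and shared_mutex_lock_timed are made up for the example.

```cpp
#include <cerrno>
#include <ctime>
#include <pthread.h>
#include <sys/mman.h>

// Hypothetical helper (not from the real application): place a
// process-shared mutex in anonymous shared memory.
// Returns nullptr on failure.
pthread_mutex_t *shared_mutex_create()
{
    void *mem = mmap(nullptr, sizeof(pthread_mutex_t),
                     PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return nullptr;
    pthread_mutex_t *m = static_cast<pthread_mutex_t *>(mem);

    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc == 0) {
        // Without PROCESS_SHARED the mutex is only valid inside one process.
        rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        if (rc == 0)
            rc = pthread_mutex_init(m, &attr);
        pthread_mutexattr_destroy(&attr);
    }
    if (rc != 0) {
        munmap(mem, sizeof(pthread_mutex_t));
        return nullptr;
    }
    return m;
}

// Hypothetical helper: wait at most `seconds` for the mutex.
// Returns 0 on success, ETIMEDOUT if it is still held at the deadline,
// or another error number.
int shared_mutex_lock_timed(pthread_mutex_t *m, int seconds)
{
    timespec deadline;
    if (clock_gettime(CLOCK_REALTIME, &deadline) != 0)
        return errno;
    deadline.tv_sec += seconds;
    return pthread_mutex_timedlock(m, &deadline);
}
```

With something like this, a waiter that gets ETIMEDOUT could log the mutex address and its own thread id instead of sitting in the lock call indefinitely. It would not say who holds the mutex, only that the wait exceeded the limit.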