Subj : Re: How to diagnose/resolve libthread panic?
To : comp.programming.threads
From : roger.faulkner
Date : Thu Jun 02 2005 07:04 pm

Paul F. Pearson wrote:
> Hi,
>
> I've searched Google, and this group specifically, and can't find any
> good pointers on how to identify what's causing a libthread panic.
> Could someone point me in the right direction?
>
> Specifically, the panic is occurring in a mostly-Ada program (with some
> C linked in) on Solaris 8, patches up-to-date as of a couple of months
> ago. Our compiler is gnat 3.15p1, and the C code is compiled using the
> gcc included in that gnat. The process usually has about 277 lwps
> running. We're using the Solaris native threads. The problem isn't
> very repeatable, but happens just often enough to be troubling.
>
> I don't have handy the specific output of the libthread panic, but it's
> a SIGSEGV (signal 11).
>
> So, how does one go about diagnosing a problem of this sort? Is there
> a way to trap the panic? What are the common causes? Right now, our
> hope is to put in "hooks" and run with them, hoping the panic happens
> and we can see what's going on. We just don't know how to go about
> putting in the "hooks".
>
> As I said, if there's a good "start here" site and/or book, I'll be
> glad to RTFM. If a more Solaris-centric newsgroup is more appropriate,
> then I'll gladly go there.

In Solaris 8, when using the standard libthread, there's no easy way.

Usually a libthread panic (I presume you get a message saying
"libthread panic ...") indicates a bug in libthread, not in your
application. Of course your application might be causing some sort
of data corruption via a wild pointer, but I doubt this. It's probably
timing related -- a race condition bug in libthread.

On Solaris 8, you should switch to using the alternate libthread.
It is more robust. There are bugs in the old libthread, especially
with respect to signals, that cannot be fixed due to its architecture.
The alternate libthread has become the only libthread in Solaris 9.

The instructions for using the alternate libthread are in the
threads(3THR) man page for Solaris 8. You can find it at:

    http://docs.sun.com/app/docs/doc/806-0630/6j9vkb8kk?a=view

Here is the relevant section from that man page:

ALTERNATE IMPLEMENTATION
     The standard threads implementation is a two-level model in
     which user-level threads are multiplexed over possibly fewer
     lightweight processes, or LWPs. An LWP is the fundamental unit
     of execution that is dispatched to a processor by the operating
     system.

     The system provides an alternate threads implementation, a
     one-level model, in which user-level threads are associated
     one-to-one with LWPs. This implementation is simpler than the
     standard implementation and may be beneficial to some
     multithreaded applications. It provides exactly the same
     interfaces, both for POSIX threads and Solaris threads, as the
     standard implementation.

     To link with the alternate implementation, use the following
     runpath (-R) options when linking the program:

     POSIX
          cc -mt ... -lpthread ... -R /usr/lib/lwp      (32-bit)
          cc -mt ... -lpthread ... -R /usr/lib/lwp/64   (64-bit)

     Solaris
          cc -mt ... -R /usr/lib/lwp      (32-bit)
          cc -mt ... -R /usr/lib/lwp/64   (64-bit)

     For multithreaded programs that have been previously linked
     with the standard threads library, the environment variables
     LD_LIBRARY_PATH and LD_LIBRARY_PATH_64 can be set as follows to
     bind the program at runtime to the alternate threads library:

          LD_LIBRARY_PATH=/usr/lib/lwp
          LD_LIBRARY_PATH_64=/usr/lib/lwp/64

     Note that if an LD_LIBRARY_PATH environment variable is in
     effect for a secure process, then only the trusted directories
     specified by this variable will be used to augment the runtime
     linker's search rules.

If you switch to the alternate libthread and your problem persists,
then send me e-mail describing the problem, or post it here.
There are lots of tools (adb, pstack, pmap, pflags, ...)
that can be used to examine the core file, and because the alternate
libthread is a simpler threads implementation, the problem will be
easier to find.

Roger Faulkner
Sun Microsystems
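
For a program that is already linked against the standard libthread, the runtime-binding approach from the man page excerpt above might be wrapped roughly as follows. This is a minimal sketch, not part of the original post: the function name run_with_alt_libthread is hypothetical, and prepending to any existing LD_LIBRARY_PATH value (rather than replacing it, as the man page shows) is an assumption made so that other library directories the program needs are kept.

```shell
#!/bin/sh
# Sketch only. The directories come from the threads(3THR) excerpt;
# the function name is hypothetical.

LWP_DIR=/usr/lib/lwp

# Run a command with the alternate libthread directories placed first
# in the runtime linker's search path. Any existing LD_LIBRARY_PATH or
# LD_LIBRARY_PATH_64 value is kept after them (an assumption; the man
# page simply sets the variables to the lwp directories).
run_with_alt_libthread() {
    env LD_LIBRARY_PATH="$LWP_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" \
        LD_LIBRARY_PATH_64="$LWP_DIR/64${LD_LIBRARY_PATH_64:+:$LD_LIBRARY_PATH_64}" \
        "$@"
}
```

With a hypothetical binary ./myprog, "run_with_alt_libthread ldd ./myprog" should show which libthread.so.1 the runtime linker would bind, and "run_with_alt_libthread ./myprog" runs the program that way. If it still dumps core, "pstack core" is one of the tools mentioned above for a first look at the thread stacks.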