From danny@cs.huji.ac.il  Sun Oct 28 15:06:45 2007
Return-Path: <danny@cs.huji.ac.il>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C1D8116A41A
	for <FreeBSD-gnats-submit@freebsd.org>; Sun, 28 Oct 2007 15:06:45 +0000 (UTC)
	(envelope-from danny@cs.huji.ac.il)
Received: from cs1.cs.huji.ac.il (cs1.cs.huji.ac.il [132.65.16.10])
	by mx1.freebsd.org (Postfix) with ESMTP id 7A09F13C4B2
	for <FreeBSD-gnats-submit@freebsd.org>; Sun, 28 Oct 2007 15:06:45 +0000 (UTC)
	(envelope-from danny@cs.huji.ac.il)
Received: from sunfire.cs.huji.ac.il ([132.65.16.80])
	by cs1.cs.huji.ac.il with esmtp
	id 1Im9it-000NbM-S9
	for FreeBSD-gnats-submit@freebsd.org; Sun, 28 Oct 2007 17:06:43 +0200
Received: from danny by sunfire.cs.huji.ac.il with local (Exim 4.68 (FreeBSD))
	(envelope-from <danny@cs.huji.ac.il>)
	id 1Im9it-0000YS-ON
	for FreeBSD-gnats-submit@freebsd.org; Sun, 28 Oct 2007 17:06:43 +0200
Message-Id: <E1Im9it-0000YS-ON@sunfire.cs.huji.ac.il>
Date: Sun, 28 Oct 2007 17:06:43 +0200
From: Danny Braniss <danny@cs.huji.ac.il>
Reply-To: Danny Braniss <danny@cs.huji.ac.il>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: dump(8) hangs on SMP - 4way and higher.
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         117603
>Category:       bin
>Synopsis:       [patch] dump(8) hangs on SMP - 4way and higher.
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    linimon
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Oct 28 15:10:01 UTC 2007
>Closed-Date:    Sat Aug 28 10:48:43 UTC 2010
>Last-Modified:  Sat Aug 28 10:48:43 UTC 2010
>Originator:     Danny Braniss
>Release:        FreeBSD 7.0-BETA1 amd64
>Organization:
>Environment:
System: FreeBSD sunfire 7.0-BETA1 FreeBSD 7.0-BETA1 #1: Sat Oct 20 16:30:43 IST 2007 danny@sunfire:/r+d/obj/sunfire/r+d/7.0/src/sys/HUJI amd64


	
>Description:
	dump creates 4 processes, 3 of which read from disk and,
	via some synchronization, write sequentially to tape/file.
	The method used to synchronize these 'slaves' worked fine on
	older, slower, non-SMP hosts. On a dual-CPU, dual-core host it
	hangs very frequently.
>How-To-Repeat:
	dump 0aLf /some/file /
>Fix:
	patch follows.

--- tape.c.orig 2005-03-02 04:30:08.000000000 +0200
+++ tape.c      2007-10-28 16:17:46.728015000 +0200
@@ -109,11 +109,8 @@
 
 int master;            /* pid of master, for sending error signals */
 int tenths;            /* length of tape used per block written */
+
 static volatile sig_atomic_t caught; /* have we caught the signal to proceed? */
-static volatile sig_atomic_t ready; /* reached the lock point without having */
-                       /* received the SIGUSR2 signal from the prev slave? */
-static jmp_buf jmpbuf; /* where to jump to if we are ready when the */
-                       /* SIGUSR2 arrives from the previous slave */
 
 int
 alloctape(void)
@@ -685,15 +682,13 @@
 void
 proceed(int signo __unused)
 {
-
-       if (ready)
-               longjmp(jmpbuf, 1);
        caught++;
 }
 
 void
 enslave(void)
 {
+       sigset_t        s_mask;
        int cmd[2];
        int i, j;
 
@@ -704,6 +699,10 @@
        signal(SIGUSR1, tperror);    /* Slave sends SIGUSR1 on tape errors */
        signal(SIGUSR2, proceed);    /* Slave sends SIGUSR2 to next slave */
 
+       sigemptyset(&s_mask);
+       sigaddset(&s_mask, SIGUSR2);
+       sigprocmask(SIG_BLOCK, &s_mask, NULL);
+
        for (i = 0; i < SLAVES; i++) {
                if (i == slp - &slaves[0]) {
                        caught = 1;
@@ -793,12 +792,8 @@
                                       quit("master/slave protocol botched.\n");
                        }
                }
-               if (setjmp(jmpbuf) == 0) {
-                       ready = 1;
-                       if (!caught)
-                               (void) pause();
-               }
-               ready = 0;
+               if(!caught)
+                    sigsuspend(0);
                caught = 0;
 
                /* Try to write the data... */
>Release-Note:
>Audit-Trail:
State-Changed-From-To: open->feedback 
State-Changed-By: iedowse 
State-Changed-When: Sun Nov 4 11:14:19 UTC 2007 
State-Changed-Why:  

Hi, thanks for submitting this. I wonder could you try the patch 
at 

http://people.freebsd.org/~iedowse/dump_117603.diff 

instead? There appears to be one problem with the original patch, 
as it passes an invalid NULL argument to sigsuspend(), which would 
have caused it to return immediately with an EFAULT errno. As far 
as I can tell that would have broken the synchronisation and caused 
blocks to be written out of order at times. 


Responsible-Changed-From-To: freebsd-bugs->iedowse 
Responsible-Changed-By: iedowse 
Responsible-Changed-When: Sun Nov 4 11:14:19 UTC 2007 
Responsible-Changed-Why:  
I'll take this 

http://www.freebsd.org/cgi/query-pr.cgi?pr=117603 

From: Ian Dowse <iedowse@iedowse.com>
To: Danny Braniss <danny@cs.huji.ac.il>
Cc: iedowse@FreeBSD.org, freebsd-bugs@FreeBSD.org,
    freebsd-gnats-submit@FreeBSD.org
Subject: Re: bin/117603: [patch] dump(8) hangs on SMP - 4way and higher. 
Date: Sun, 04 Nov 2007 12:57:23 +0000

 Hi,
 
 It's probably better to keep the caught variable, since isn't there
 a risk of a different signal completing the sigsuspend(), e.g.
 SIGINFO caused by Ctrl-T? This is why the patch I posted has a while
 loop around the sigsuspend call - it presumably was an issue with
 the original pause() call too.
 
 Would you be able to test the patch I posted? I haven't yet found
 a setup here that triggers the original problem, so it would be
 good to know if using sigsuspend alone is enough. It wasn't clear
 to me what exactly was involved in the original race - the use of
 volatiles and the setjmp()/longjmp(), while fairly horrible, did
 appear to cover the more obvious race conditions.
 
 Ian
 
 In message <E1IodwH-000GBl-Qc@cs1.cs.huji.ac.il>, Danny Braniss writes:
 >This is a multipart MIME message.
 >
 >--==_Exmh_1194176785_547800
 >Content-Type: text/plain; charset=us-ascii
 >
 >wups, sent an old diffs.
 >
 >
 >--==_Exmh_1194176785_547800
 >Content-Type: text/plain ; name="dump.c.diffs"; charset=us-ascii
 >Content-Description: dump.c.diffs
 >Content-Disposition: attachment; filename="dump.c.diffs"
 >
 >--- tape.c.orig	Wed Mar  2 04:30:08 2005
 >+++ tape.c	Sun Nov  4 13:42:55 2007
 >@@ -109,11 +109,6 @@
 > 
 > int master;		/* pid of master, for sending error signals */
 > int tenths;		/* length of tape used per block written */
 >-static volatile sig_atomic_t caught; /* have we caught the signal to proceed? */
 >-static volatile sig_atomic_t ready; /* reached the lock point without having */
 >-			/* received the SIGUSR2 signal from the prev slave? */
 >-static jmp_buf jmpbuf;	/* where to jump to if we are ready when the */
 >-			/* SIGUSR2 arrives from the previous slave */
 > 
 > int
 > alloctape(void)
 >@@ -685,15 +680,13 @@
 > void
 > proceed(int signo __unused)
 > {
 >-
 >-	if (ready)
 >-		longjmp(jmpbuf, 1);
 >-	caught++;
 >+     // do nothing ...
 > }
 > 
 > void
 > enslave(void)
 > {
 >+	sigset_t	s_mask;
 > 	int cmd[2];
 > 	int i, j;
 > 
 >@@ -704,13 +697,11 @@
 > 	signal(SIGUSR1, tperror);    /* Slave sends SIGUSR1 on tape errors */
 > 	signal(SIGUSR2, proceed);    /* Slave sends SIGUSR2 to next slave */
 > 
 >-	for (i = 0; i < SLAVES; i++) {
 >-		if (i == slp - &slaves[0]) {
 >-			caught = 1;
 >-		} else {
 >-			caught = 0;
 >-		}
 >+	sigemptyset(&s_mask);
 >+	sigaddset(&s_mask, SIGUSR2);
 >+	sigprocmask(SIG_BLOCK, &s_mask, NULL);
 > 
 >+	for (i = 0; i < SLAVES; i++) {
 > 		if (socketpair(AF_UNIX, SOCK_STREAM, 0, cmd) < 0 ||
 > 		    (slaves[i].pid = fork()) < 0)
 > 			quit("too many slaves, %d (recompile smaller): %s\n",
 >@@ -733,6 +724,7 @@
 > 		              sizeof slaves[0].pid);
 > 
 > 	master = 0;
 >+	kill(slp->pid, SIGUSR2); // start the ball rolling
 > }
 > 
 > void
 >@@ -757,6 +749,7 @@
 > static void
 > doslave(int cmd, int slave_number)
 > {
 >+	sigset_t s_mask;
 > 	int nread;
 > 	int nextslave, size, wrote, eot_count;
 > 
 >@@ -774,7 +767,7 @@
 > 	    != sizeof nextslave) {
 > 		quit("master/slave protocol botched - didn't get pid of next slave.\n");
 > 	}
 >-
 >+	sigemptyset(&s_mask);
 > 	/*
 > 	 * Get list of blocks to dump, read the blocks into tape buffer
 > 	 */
 >@@ -793,14 +786,7 @@
 > 				       quit("master/slave protocol botched.\n");
 > 			}
 > 		}
 >-		if (setjmp(jmpbuf) == 0) {
 >-			ready = 1;
 >-			if (!caught)
 >-				(void) pause();
 >-		}
 >-		ready = 0;
 >-		caught = 0;
 >-
 >+		sigsuspend(&s_mask);
 > 		/* Try to write the data... */
 > 		eot_count = 0;
 > 		size = 0;
 >
 >--==_Exmh_1194176785_547800--
 >
 >

From: Ian Dowse <iedowse@iedowse.com>
To: Danny Braniss <danny@cs.huji.ac.il>
Cc: freebsd-bugs@FreeBSD.org, freebsd-gnats-submit@FreeBSD.org
Subject: Re: bin/117603: [patch] dump(8) hangs on SMP - 4way and higher. 
Date: Tue, 06 Nov 2007 10:03:27 +0000

 In message <E1IoyTE-0004u8-HK@cs1.cs.huji.ac.il>, Danny Braniss writes:
 >I didn't get your 2nd. message, but i'm now looking at the pr :-)
 >what if after
 >	sigemptyset(&mask);
 >we add
 >	sigaddset(&mask, SIGUSR2);
 >the sigsuspend() should return only if a SIGUSR2 was received.
 >wouldn't that solve the ^T et al. issue?
 
 I think you'd need to use sigfillset() + sigdelset() instead, but
 this would obviously block all other signals, which quite possibly
 has unwanted side-effects. E.g. would the slave processes get left
 behind if you interrupted the dump with Ctrl-C?
 
 >at the moment only one host has this problem, and it's very unsettling, since
 >I can't reproduce it on another similar host. On the other hand someone else
 >reported the same issue, and my fix worked for him too.
 >Anyways, I see no harm in a little cleanup/upgrade :-)
 >also, my feeling is that the problem might be in the kernel, but I got lost
 >following the code.
 
 It's important to track down the actual cause of this, especially
 if it is a kernel bug. Have you tried the version of the patch I
 gave you yet? Its use of the "while (!caught)" loop should in theory
 help to narrow down whether this is a race condition or some other
 kind of signal loss problem. Also, further details would be helpful,
 such as whether the issue generally happens as dump is starting up
 or if it can happen after many megabytes of data have been written.
 
 Ian


From: Laurent Frigault <lfrigault@agneau.org>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: bin/117603: [patch] dump(8) hangs on SMP - 4way and higher.
Date: Tue, 4 Mar 2008 23:00:55 +0100

 I have just tried your patch (dump_117603.diff) on 7.0-RC2 i386, but
 unfortunately it does not fix the problem for me.
 
 The hardware is a Dell PowerEdge 860 with a quad-core processor.
 
 # head -30 /var/run/dmesg.boot
 Copyright (c) 1992-2008 The FreeBSD Project.
 Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
         The Regents of the University of California. All rights reserved.
 FreeBSD is a registered trademark of The FreeBSD Foundation.
 FreeBSD 7.0-RC2 #0: Tue Feb 19 23:43:28 CET 2008
     root@XXXX:/usr/src/sys/i386/compile/ODO
 Timecounter "i8254" frequency 1193182 Hz quality 0
 CPU: Intel(R) Xeon(R) CPU           X3220  @ 2.40GHz (2400.10-MHz 686-class CPU)
   Origin = "GenuineIntel"  Id = 0x6fb  Stepping = 11
   Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
   Features2=0xe3bd<SSE3,RSVD2,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM>
   AMD Features=0x20100000<NX,LM>
   AMD Features2=0x1<LAHF>
   Cores per package: 4
 real memory  = 3757834240 (3583 MB)
 avail memory = 3673903104 (3503 MB)
 ACPI APIC Table: <DELL   PE_SC3  >
 FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
  cpu0 (BSP): APIC ID:  0
  cpu1 (AP): APIC ID:  1
  cpu2 (AP): APIC ID:  2
  cpu3 (AP): APIC ID:  3
 ioapic0: Changing APIC ID to 4
 ioapic1: Changing APIC ID to 5
 ioapic0 <Version 2.0> irqs 0-23 on motherboard
 ioapic1 <Version 2.0> irqs 32-55 on motherboard
 kbd1 at kbdmux0
 ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
 acpi0: <DELL PE_SC3> on motherboard
 acpi0: [ITHREAD]
 ...
 
 
 When dump is stalled (launched by amanda-client), running truss -p on
 some of its processes for a few seconds restarts it for several minutes.
 
 Any other ideas or patches?
 
 Regards,
 -- 
 Laurent Frigault | <url:http://www.agneau.org/>
 Hold it right there, buddy ... that scruffy beard...those suspenders...
 that smug expression...YOU'RE ONE OF THOSE CONDESCENDING UNIX USERS!
 Here's a nickel, kid. Get yourself a better computer.

From: "D. Hampton Finger" <locnar@mail.utexas.edu>
To: bug-followup@Freebsd.org, danny@cs.huji.ac.il
Cc:  
Subject: Re: bin/117603: [patch] dump(8) hangs on SMP - 4way and higher.
Date: Thu, 6 Mar 2008 21:21:21 -0600 (CST)

 I have an 8-way system running FreeBSD 7.0-RELEASE with the amd64 kernel.
 The system ran dump perfectly on 6.2, but since I upgraded to 7.0 it shows
 the hang this patch was supposed to fix.  The patch does not fix the issue.
 
 I have a ktrace output file here: 
 http://www.cwrl.utexas.edu/~locnar/tracefile.tar.gz
 
 it is 1.7M, but expands to 22M.
 
 the following was run to generate the tracefile:
 
 ktrace -d -f tracefile.txt /sbin/dump -0u -L -b 64 -a -f \
 /raid/CWRL/Backup/bump/usr.dump /usr
 
 The dump hangs at random places, but it seems to happen around the time
 it reports its status to stdout (DUMP done in....).  The dump process is
 very much alive, as I can hit Ctrl-C and it will ask me if I want to
 abort the dump, but it seems to be just waiting for the file system.
 Dump will simply sit there doing nothing until the process is killed
 days later.
 
 The reason I say dump seems to stall at the report step is that it
 prints the report the instant I run truss -p on any of the 4 child
 processes of dump.
 
 Hampton
 
 ----------------------------------------------------------------
 D. Hampton Finger                     locnar@mail.utexas.edu
 CWRL Systems Analyst		      PH:  512-636-1701
 The University of Texas at Austin     FAX: 512-471-6745
                                        http://www.locnar.net
 ----------------------------------------------------------------

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: bin/117603: commit references a PR
Date: Thu, 13 Mar 2008 00:46:18 +0000 (UTC)

 jeff        2008-03-13 00:46:12 UTC
 
   FreeBSD src repository
 
   Modified files:
     sys/kern             subr_sleepqueue.c 
   Log:
   PR 117603
    - Close a sleepqueue signal race by interlocking with the per-process
      spinlock.  This was mistakenly omitted from the thread_lock patch and
      has been a race since.
   
   MFC After:      1 week
   PR:             bin/117603
   Reported by:    Danny Braniss <danny@cs.huji.ac.il>
   
   Revision  Changes    Path
   1.48      +5 -2      src/sys/kern/subr_sleepqueue.c
 
Responsible-Changed-From-To: iedowse->linimon 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Thu May 28 22:07:06 UTC 2009 
Responsible-Changed-Why:  
To submitter: the patch discussed was committed and MFCed.  Did this 
solve your problem? 

State-Changed-From-To: feedback->closed 
State-Changed-By: linimon 
State-Changed-When: Sat Aug 28 10:48:04 UTC 2010 
State-Changed-Why:  
Feedback timeout (> 1 year). 

>Unformatted:
	Ian, this patch fixed the problem for me on RELENG_7.
	I could not dump on a 2P dual-core Opteron 290 (amd64 kernel).

