From andrew@mux.org.uk  Sun Feb 15 13:17:48 2004
Return-Path: <andrew@mux.org.uk>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 2F7BC16A4CE
	for <FreeBSD-gnats-submit@freebsd.org>; Sun, 15 Feb 2004 13:17:48 -0800 (PST)
Received: from 82-41-27-158.cable.ubr04.edin.blueyonder.co.uk (82-41-27-158.cable.ubr04.edin.blueyonder.co.uk [82.41.27.158])
	by mx1.FreeBSD.org (Postfix) with ESMTP id C590843D2F
	for <FreeBSD-gnats-submit@freebsd.org>; Sun, 15 Feb 2004 13:17:47 -0800 (PST)
	(envelope-from andrew@mux.org.uk)
Received: from spatula.flat (spatula.flat [192.168.0.2])
	by myriad.flat (Postfix) with ESMTP id 4D1B6BC
	for <FreeBSD-gnats-submit@freebsd.org>; Sun, 15 Feb 2004 20:07:27 +0000 (GMT)
Received: by spatula.flat (Postfix, from userid 1001)
	id 6356B3F28; Sun, 15 Feb 2004 21:18:43 +0000 (GMT)
Message-Id: <20040215211843.6356B3F28@spatula.flat>
Date: Sun, 15 Feb 2004 21:18:43 +0000 (GMT)
From: Andrew Boothman <andrew@mux.org.uk>
Reply-To: Andrew Boothman <andrew@mux.org.uk>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: Panic in vr(4): interrupt related	
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         62889
>Category:       kern
>Synopsis:       panic: in vr(4): interrupt related
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    bms
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Feb 15 13:20:13 PST 2004
>Closed-Date:    Sun Oct 31 03:10:21 GMT 2004
>Last-Modified:  Sun Oct 31 03:10:21 GMT 2004
>Originator:     Andrew Boothman
>Release:        FreeBSD 5.2-CURRENT i386
>Organization:
>Environment:
System: FreeBSD spatula.flat 5.2-CURRENT FreeBSD 5.2-CURRENT #0: Sun Feb 8 03:34:19 GMT 2004 andrew@spatula.flat:/usr/obj/usr/src/sys/SPATULA i386


	
>Description:
The vr(4) driver appears to sometimes mishandle interrupts. This has shown itself
once for me as a panic that occurred during a shutdown via ACPI.

A loose transcription of the panic is:

Fatal trap 12
Fault virtual address     = 0x8
Fault code         = supervisor read, page not present
.
.
Current process = 21 (irq10: fwohci vr0+)
kernel: type 12 trap, code = 0
Stopped at vr_rxeof+0x140: pushl 0x8(%edx)

A trace then gave me :

vr_rxeof(...) ...
vr_intr(...) ...
ithread_loop(...) ...
fork_exit(...) ...
fork_trampoline(...) ...

Following a post to -current, Nate Lawson <nate@root.org> noted that:
"Your backtrace shows this is not an ACPI problem, it's a problem with
vr(4) (the Via Rhine ethernet driver).  It should check flags to see if
the driver is going away in the device_detach case.  I'm not sure how the
intr handler got called since interrupts are disabled before powering off
the system."

While Doug White <dwhite@gumbysoft.com> mentioned:
"Pretty sure this is a bug in the vr driver; I've had the same thing happen
on my KT400. It gets an interrupt at exactly the wrong moment and boom.
I haven't had it happen in months, however."
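
A minimal userspace sketch of the kind of check Nate Lawson describes, not the
driver's actual code: the handler looks at a "going away" flag before it touches
the receive ring. All names here (toy_softc, going_away, toy_teardown, toy_intr)
are hypothetical. The fault address 0x8 at "pushl 0x8(%edx)" is consistent with
%edx being NULL, which is what the freed ring pointer models below.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver softc. */
struct toy_softc {
	int	 going_away;	/* set by the teardown path */
	int	*rx_ring;	/* NULL once teardown has released the ring */
	int	 rx_handled;	/* running total of what the handler processed */
};

/* Teardown: mark the device as going away, then release the ring. */
static void
toy_teardown(struct toy_softc *sc)
{
	sc->going_away = 1;
	sc->rx_ring = NULL;
}

/* Interrupt handler: returns 0 if it ran, -1 if it bailed out early. */
static int
toy_intr(struct toy_softc *sc)
{
	if (sc->going_away)
		return (-1);		/* never touch rx_ring after teardown */
	assert(sc->rx_ring != NULL);
	sc->rx_handled += sc->rx_ring[0];
	return (0);
}
```

In a real driver the flag would have to be read and written under the driver
lock, and every teardown path (including shutdown) would have to set it.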

>How-To-Repeat:
So far I've not been able to repeat this problem, although Doug White indicates
that it has happened before.
>Fix:
Unknown at this time
>Release-Note:
>Audit-Trail:

From: Andrew Boothman <andrew@mux.org.uk>
To: freebsd-gnats-submit@FreeBSD.org, andrew@mux.org.uk
Cc: silby@FreeBSD.org
Subject: Re: kern/62889: Panic in vr(4): interrupt related
Date: Thu, 11 Mar 2004 03:40:25 +0000

 [CC'ed to silby as the developer with his hands most frequently in vr 
 recently - hope you don't mind]
 
 > A trace then gave me :
 > 
 > vr_rxeof(...) ...
 > vr_intr(...) ...
 > ithread_loop(...) ...
 > fork_exit(...) ...
 > fork_trampoline(...) ...
 > 
 
 Just to add that I had another one of these panics today, in exactly the 
 same place as before, happening just after the "Shutting off power using 
 ACPI" message is printed.
 
 No further information other than that - I'm still happy to supply more 
 information, test patches or whatever. Is there anything that can be 
 done in the debugger to obtain more information once the panic has 
 happened? It's been a month between panics so we might have a wait 
 before I can do anything about it however...
 
 Thanks.
 
 Andrew
State-Changed-From-To: open->analyzed 
State-Changed-By: bms 
State-Changed-When: Sat Jul 3 06:40:29 GMT 2004 
State-Changed-Why:  
I may have a simple answer 


Responsible-Changed-From-To: freebsd-bugs->bms 
Responsible-Changed-By: bms 
Responsible-Changed-When: Sat Jul 3 06:40:29 GMT 2004 
Responsible-Changed-Why:  
I'll take this 

http://www.freebsd.org/cgi/query-pr.cgi?pr=62889 

From: Bruce M Simpson <bms@spc.org>
To: Andrew Boothman <andrew@mux.org.uk>,
	Doug White <dwhite@FreeBSD.org>
Cc: freebsd-net@FreeBSD.org, freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/62889: Panic in vr(4): interrupt related
Date: Sat, 3 Jul 2004 07:40:22 +0100

 --ZYOWEO2dMm2Af3e3
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline
 
 Please try the attached patches (broadly inspired by fxp(4)) and let me
 know if they solve the problem.
 
 I am inclined to commit as-is as they are quite trivial, but I'm not sure
 how the VIA Rhine chips act if we try to detach them as devices without
 necessarily reading the ISR register to consume an unwanted interrupt.
 
 Regards,
 BMS
 
 --ZYOWEO2dMm2Af3e3
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: attachment; filename="if_vrreg.h.diff"
 
 Index: if_vrreg.h
 ===================================================================
 RCS file: /home/ncvs/src/sys/pci/if_vrreg.h,v
 retrieving revision 1.19
 diff -u -p -r1.19 if_vrreg.h
 --- if_vrreg.h	5 Apr 2004 17:39:57 -0000	1.19
 +++ if_vrreg.h	3 Jul 2004 06:39:46 -0000
 @@ -468,6 +468,7 @@ struct vr_softc {
  	struct vr_chain_data	vr_cdata;
  	struct callout_handle	vr_stat_ch;
  	struct mtx		vr_mtx;
 +	int			suspended;	/* if 1, sleeping/detaching */
  #ifdef DEVICE_POLLING
  	int			rxcycles;
  #endif
 
 --ZYOWEO2dMm2Af3e3
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: attachment; filename="if_vr.c.diff"
 
 Index: if_vr.c
 ===================================================================
 RCS file: /home/ncvs/src/sys/pci/if_vr.c,v
 retrieving revision 1.89
 diff -u -p -r1.89 if_vr.c
 --- if_vr.c	3 Jul 2004 02:59:02 -0000	1.89
 +++ if_vr.c	3 Jul 2004 06:36:48 -0000
 @@ -755,6 +755,8 @@ vr_attach(dev)
  	/* Call MI attach routine. */
  	ether_ifattach(ifp, eaddr);
  
 +	sc->suspended = 0;
 +
  	/* Hook interrupt last to avoid having to lock softc */
  	error = bus_setup_intr(dev, sc->vr_irq, INTR_TYPE_NET | INTR_MPSAFE,
  	    vr_intr, sc, &sc->vr_intrhand);
 @@ -789,6 +791,8 @@ vr_detach(device_t dev)
  
  	VR_LOCK(sc);
  
 +	sc->suspended = 1;
 +
  	/* These should only be active if attach succeeded */
  	if (device_is_attached(dev)) {
  		vr_stop(sc);
 @@ -1219,6 +1223,10 @@ vr_intr(void *arg)
  	uint16_t		status;
  
  	VR_LOCK(sc);
 +	if (sc->suspended) {
 +		VR_UNLOCK(sc);
 +		return;
 +	}
  #ifdef DEVICE_POLLING
  	if (ifp->if_flags & IFF_POLLING) {
  		VR_UNLOCK(sc);
 
 --ZYOWEO2dMm2Af3e3--

From: Russell Francis <rf358197@ohio.edu>
To: freebsd-gnats-submit@FreeBSD.org
Cc: andrew@mux.org.uk
Subject: Re: kern/62889: panic: in vr(4): interrupt related
Date: Sun, 26 Sep 2004 12:14:10 +0000

 I thought I would let everyone know that I have this same problem too
 and it is very reproducible for me.  Every time I shut down the system, I
 get the same fatal trap and backtrace reported earlier.
 
 vr_rxeof(...) ...
 vr_intr(...) ...
 ithread_loop(...) ...
 fork_exit(...) ...
 fork_trampoline(...) ...
 
 
 I tried the patch supplied by Bruce Simpson and can report that it 
 didn't fix the issue for me.  I still get the same results on shutdown.
 Since it is reproducible for me, I am happy to try any patches out there 
 and should be able to give a fairly quick response as to whether it 
 addresses the issue or not.
 
 Thanks,
 Russ

From: Russell Francis <rf358197@ohio.edu>
To: freebsd-gnats-submit@FreeBSD.org, andrew@mux.org.uk
Cc: bms@FreeBSD.org
Subject: Re: kern/62889: panic: in vr(4): interrupt related
Date: Sun, 26 Sep 2004 14:48:22 +0000

 The attached patch resolved this issue for me.  I am not sure if this 
 will work for others or have adverse effects with other cards.
 
 I am currently using,
 
 vr0: <VIA VT6102 Rhine II 10/100BaseTX> port 0xe400-0xe4ff mem 
 0xea001000-0xea0010ff irq 11 at device 18.0 on pci0
 
 Please give this patch a try and let me know if it works for others.
 
 Thanks,
 Russ
 
 --- /usr/src/sys/pci/if_vr.c.orig	Sun Sep 26 13:03:27 2004
 +++ /usr/src/sys/pci/if_vr.c	Sun Sep 26 13:05:02 2004
 @@ -1796,11 +1796,7 @@
   vr_shutdown(dev)
   	device_t		dev;
   {
 -	struct vr_softc		*sc;
 -
 -	sc = device_get_softc(dev);
 -
 -	vr_stop(sc);
 +	vr_detach( dev );
 
   	return;
   }
State-Changed-From-To: analyzed->patched 
State-Changed-By: bms 
State-Changed-When: Tue Oct 19 15:31:10 GMT 2004 
State-Changed-Why:  
Committed to HEAD. Pending MFC request for RELENG_5 and RELENG_5_3. 

From: Sebastian Schulze Struchtrup <sebastian@struchtrup.de>
To: freebsd-gnats-submit@FreeBSD.org, andrew@mux.org.uk
Cc:  
Subject: Re: kern/62889: panic: in vr(4): interrupt related
Date: Wed, 20 Oct 2004 15:26:04 +0200

 Now I have got the latest if_vr.c from current (1.97),
 but I am getting another panic:
 
 panic: _mtx_lock_sleep: recursed on non-recursive mutex vr0 @ if_vr.c:1593
 
 vr_detach calls ether_ifdetach with the vr mutex locked (l. 801), which calls
 vr_ioctl (SIOCDELMULTI, through various other functions).
 vr_ioctl locks the mutex again (l. 1593) -> panic.
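
The recursion described above can be sketched in userspace as follows. This is
a single-threaded toy model under stated assumptions, not kernel code: toy_mtx
only records where a non-recursive mutex would panic, and toy_ioctl and
toy_ifdetach are hypothetical stand-ins for vr_ioctl and ether_ifdetach.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a non-recursive mutex; single-threaded, illustration only. */
struct toy_mtx {
	bool	held;
	bool	recursed;	/* set where the kernel would panic */
};

static void
toy_lock(struct toy_mtx *m)
{
	if (m->held) {
		m->recursed = true;	/* "recursed on non-recursive mutex" */
		return;
	}
	m->held = true;
}

static void
toy_unlock(struct toy_mtx *m)
{
	assert(m->held);
	m->held = false;
}

/* Stand-in for vr_ioctl(SIOCDELMULTI): takes the driver lock itself. */
static void
toy_ioctl(struct toy_mtx *m)
{
	toy_lock(m);
	if (!m->recursed)
		toy_unlock(m);
}

/* Stand-in for ether_ifdetach(): calls back into the driver. */
static void
toy_ifdetach(struct toy_mtx *m)
{
	toy_ioctl(m);
}

/* Callback made with the lock held; returns true if recursion was hit. */
static bool
detach_holding_lock(void)
{
	struct toy_mtx m = { false, false };

	toy_lock(&m);
	toy_ifdetach(&m);
	toy_unlock(&m);
	return (m.recursed);
}

/* Lock dropped around the callback and retaken afterwards. */
static bool
detach_dropping_lock(void)
{
	struct toy_mtx m = { false, false };

	toy_lock(&m);
	toy_unlock(&m);
	toy_ifdetach(&m);
	toy_lock(&m);
	toy_unlock(&m);
	return (m.recursed);
}
```

Dropping the lock avoids the recursion, but it also opens a window in which
other code can run against the softc while the detach is only half done.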
 
 
 if_vr.c, rev. 1.97:
 ===================
 786 static int
 787 vr_detach(device_t dev)
 788 {
 ...
 794         VR_LOCK(sc);
 ...
 798         /* These should only be active if attach succeeded */
 799         if (device_is_attached(dev)) {
 800                 vr_stop(sc);
 801                 ether_ifdetach(ifp);
 802         }
 
 1571 static int
 1572 vr_ioctl(struct ifnet *ifp, u_long command, caddr_t data)
  ...
 1592         case SIOCDELMULTI:
 1593                 VR_LOCK(sc);
 1594                 vr_setmulti(sc);
 1595                 VR_UNLOCK(sc);
 1596                 error = 0;
 1597                 break;
 
 1690 static void
 1691 vr_shutdown(device_t dev)
 1692 {
 1693 
 1694         vr_detach(dev);
 1695 } 
 
 
 SMP/APIC enabled on a UP machine.
 Scheduler is ULE without PREEMPTION, but that should not cause the problem after the shutdown...
 NIC: vr0: <VIA VT6102 Rhine II 10/100BaseTX> port 0xc000-0xc0ff mem 0xdd001000-0xdd0010ff irq 23 at device 18.0 on pci0
 
 
 kernel backtrace:
 =================
 GNU gdb 6.1.1 [FreeBSD]
 Copyright 2004 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and you are
 welcome to change it and/or distribute copies of it under certain conditions.
 Type "show copying" to see the conditions.
 There is absolutely no warranty for GDB.  Type "show warranty" for details.
 This GDB was configured as "i386-marcel-freebsd".
 doadump () at pcpu.h:159
 (kgdb) backtrace
 #0  doadump () at pcpu.h:159
 #1  0xc0516084 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:397
 #2  0xc0516399 in panic (fmt=0xc06d6301 "_mtx_lock_sleep: recursed on non-recursive mutex %s @ %s:%d\n") at /usr/src/sys/kern/kern_shutdown.c:553
 #3  0xc050e76b in _mtx_lock_sleep (m=0xc1acbb70, td=0xc197f1a0, opts=0, file=0xc06ec07a "/usr/src/sys/pci/if_vr.c", line=1593)
     at /usr/src/sys/kern/kern_mutex.c:456
 #4  0xc050e507 in _mtx_lock_flags (m=0x0, opts=0, file=0xc06ec07a "/usr/src/sys/pci/if_vr.c", line=1593) at /usr/src/sys/kern/kern_mutex.c:273
 #5  0xc0617560 in vr_ioctl (ifp=0xc1acb000, command=3249322864, data=0x0) at /usr/src/sys/pci/if_vr.c:1593
 #6  0xc05752e2 in if_delmulti (ifp=0xc1acb000, sa=0xc1ae6dc0) at /usr/src/sys/net/if.c:1760
 #7  0xc05cf74f in in6_delmulti (in6m=0xc1aec480) at /usr/src/sys/netinet6/mld6.c:564
 #8  0xc05b96d7 in in6_purgeaddr (ifa=0x0) at /usr/src/sys/netinet6/in6.c:1143
 #9  0xc05735d3 in if_detach (ifp=0xc1acb000) at /usr/src/sys/net/if.c:561
 #10 0xc0576f98 in ether_ifdetach (ifp=0xc1acb000) at /usr/src/sys/net/if_ethersubr.c:904
 #11 0xc0615e5f in vr_detach (dev=0xc1a5e280) at /usr/src/sys/pci/if_vr.c:801
 #12 0xc061780b in vr_shutdown (dev=0xc1a5e280) at /usr/src/sys/pci/if_vr.c:1694
 #13 0xc0529a8a in device_shutdown (dev=0x0) at device_if.h:237
 #14 0xc052a0e6 in bus_generic_shutdown (dev=0x0) at /usr/src/sys/kern/subr_bus.c:2682
 #15 0xc0529a8a in device_shutdown (dev=0x0) at device_if.h:237
 #16 0xc052a0e6 in bus_generic_shutdown (dev=0x0) at /usr/src/sys/kern/subr_bus.c:2682
 #17 0xc0529a8a in device_shutdown (dev=0x0) at device_if.h:237
 #18 0xc052a0e6 in bus_generic_shutdown (dev=0x0) at /usr/src/sys/kern/subr_bus.c:2682
 #19 0xc08433e7 in ?? ()
 #20 0xc19eb700 in ?? ()
 #21 0xd41d2c34 in ?? ()
 #22 0xc0529a8a in device_shutdown (dev=0x0) at device_if.h:237
 Previous frame identical to this frame (corrupt stack?)
 
 

From: Sebastian Schulze Struchtrup <sebastian@struchtrup.de>
To: freebsd-gnats-submit@FreeBSD.org, andrew@mux.org.uk
Cc:  
Subject: Re: kern/62889: panic: in vr(4): interrupt related
Date: Tue, 26 Oct 2004 11:36:10 -0700

 --6v9BRtpmy+umdQlo
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline
 
 Please try this patch (it's quite lame, but it should help).
 
 --6v9BRtpmy+umdQlo
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: attachment; filename="if_vr_revised.diff"
 
 Index: if_vr.c
 ===================================================================
 RCS file: /home/ncvs/src/sys/pci/if_vr.c,v
 retrieving revision 1.94
 diff -u -p -r1.94 if_vr.c
 --- if_vr.c	11 Aug 2004 04:30:49 -0000	1.94
 +++ if_vr.c	26 Oct 2004 18:34:33 -0000
 @@ -798,7 +798,9 @@ vr_detach(device_t dev)
  	/* These should only be active if attach succeeded */
  	if (device_is_attached(dev)) {
  		vr_stop(sc);
 +		VR_UNLOCK(sc);		/* XXX: avoid recursive locking */
  		ether_ifdetach(ifp);
 +		VR_LOCK(sc);
  	}
  	if (sc->vr_miibus)
  		device_delete_child(dev, sc->vr_miibus);
 @@ -1690,9 +1692,6 @@ vr_stop(struct vr_softc *sc)
  static void
  vr_shutdown(device_t dev)
  {
 -	struct vr_softc		*sc = device_get_softc(dev);
  
 -	VR_LOCK(sc);
 -	vr_stop(sc);
 -	VR_UNLOCK(sc);
 +	vr_detach(dev);
  }
 
 --6v9BRtpmy+umdQlo--

From: Sebastian Schulze Struchtrup <seb@struchtrup.com>
To: freebsd-gnats-submit@FreeBSD.org, andrew@mux.org.uk
Cc:  
Subject: kern/62889: panic: in vr(4): interrupt related
Date: Wed, 27 Oct 2004 00:10:12 +0200

 I have seen that you have committed an additional fix to if_vr.c.
 The UNLOCK/LOCK around ether_ifdetach was what I had tested locally.
 I regret that I did not have the time to report anything back to you.
 
 But with this, I get another panic when an interrupt occurs during
 ether_ifdetach (tested with a flood ping).
 With no traffic on the network, everything is fine.
 
 Moving the bus_teardown_intr / bus_release_resource in front of
 vr_stop/ether_ifdetach seems to work for me.
 All I get are stray irqs. Not really nice, but it seems to work.
 
 Maybe there's some time to fix this for 5.3, as the release has been 
 shifted due to some showstoppers.
 
 
 %cvs diff -u -p -r 1.94 if_vr.c
 Index: if_vr.c
 ===================================================================
 RCS file: /home/ncvs/src/sys/pci/if_vr.c,v
 retrieving revision 1.94
 diff -u -p -r1.94 if_vr.c
 --- if_vr.c     11 Aug 2004 04:30:49 -0000      1.94
 +++ if_vr.c     26 Oct 2004 22:03:01 -0000
 @@ -31,7 +31,7 @@
   */
 
  #include <sys/cdefs.h>
 -__FBSDID("$FreeBSD: src/sys/pci/if_vr.c,v 1.94 2004/08/11 04:30:49 scottl Exp $");
 +__FBSDID("$FreeBSD: src/sys/pci/if_vr.c,v 1.97 2004/10/19 20:02:07 bms Exp $");
  /*
   * VIA Rhine fast ethernet PCI NIC driver
 @@ -795,19 +795,25 @@ vr_detach(device_t dev)
 
         sc->suspended = 1;
 
 +
 +       if (sc->vr_intrhand)
 +               bus_teardown_intr(dev, sc->vr_irq, sc->vr_intrhand);
 +       if (sc->vr_irq)
 +               bus_release_resource(dev, SYS_RES_IRQ, 0, sc->vr_irq);
 +
 +
         /* These should only be active if attach succeeded */
         if (device_is_attached(dev)) {
                 vr_stop(sc);
 +               VR_UNLOCK(sc);
                 ether_ifdetach(ifp);
 +               VR_LOCK(sc);
         }
 +
         if (sc->vr_miibus)
                 device_delete_child(dev, sc->vr_miibus);
         bus_generic_detach(dev);
 
 -       if (sc->vr_intrhand)
 -               bus_teardown_intr(dev, sc->vr_irq, sc->vr_intrhand);
 -       if (sc->vr_irq)
 -               bus_release_resource(dev, SYS_RES_IRQ, 0, sc->vr_irq);
         if (sc->vr_res)
                 bus_release_resource(dev, VR_RES, VR_RID, sc->vr_res);
 
 @@ -1690,9 +1696,6 @@ vr_stop(struct vr_softc *sc)
  static void
  vr_shutdown(device_t dev)
  {
 -       struct vr_softc         *sc = device_get_softc(dev);
 
 -       VR_LOCK(sc);
 -       vr_stop(sc);
 -       VR_UNLOCK(sc);
 +       vr_detach(dev);
  }
 

From: Sebastian Schulze Struchtrup <seb@struchtrup.com>
To: Sebastian Schulze Struchtrup <seb@struchtrup.com>
Cc: freebsd-gnats-submit@FreeBSD.org, andrew@mux.org.uk
Subject: kern/62889: panic: in vr(4): interrupt related
Date: Wed, 27 Oct 2004 00:53:18 +0200

 There are two more issues:
 
 - maybe it is wise to call vr_stop first;
   this could solve the issues with stray irqs.
   I am going to test this tomorrow.
 
 - according to the architecture handbook, bus_teardown_intr must not be
   called with any mutexes locked.
   At least some other drivers do this and it seems to work. The em
   driver releases its mutex before teardown.
   But I am not deep enough in the code to decide if this is ok or not.
   (The same goes for the first point.)
 
 
State-Changed-From-To: patched->closed 
State-Changed-By: bms 
State-Changed-When: Sun Oct 31 03:09:50 GMT 2004 
State-Changed-Why:  
5.3 (5-STABLE) is about to be released, and a fix has been committed. 
>Unformatted:
