From nobody@FreeBSD.org  Mon Apr 29 10:21:55 2002
Return-Path: <nobody@FreeBSD.org>
Received: from freefall.freebsd.org (freefall.FreeBSD.org [216.136.204.21])
	by hub.freebsd.org (Postfix) with ESMTP id 9170C37B41B
	for <freebsd-gnats-submit@FreeBSD.org>; Mon, 29 Apr 2002 10:21:54 -0700 (PDT)
Received: (from nobody@localhost)
	by freefall.freebsd.org (8.11.6/8.11.6) id g3THLsN36037;
	Mon, 29 Apr 2002 10:21:54 -0700 (PDT)
	(envelope-from nobody)
Message-Id: <200204291721.g3THLsN36037@freefall.freebsd.org>
Date: Mon, 29 Apr 2002 10:21:54 -0700 (PDT)
From: Mike Hibler <mike@cs.utah.edu>
To: freebsd-gnats-submit@FreeBSD.org
Subject: kernel crashes when changing dummynet pipe characteristics
X-Send-Pr-Version: www-1.0

>Number:         37573
>Category:       kern
>Synopsis:       kernel crashes when changing dummynet pipe characteristics
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    maxim
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Apr 29 10:30:01 PDT 2002
>Closed-Date:    Tue May 13 02:32:20 PDT 2003
>Last-Modified:  Tue May 13 02:32:20 PDT 2003
>Originator:     Mike Hibler
>Release:        4.3, 4.5, current
>Organization:
University of Utah
>Environment:
>Description:
There is an obvious race in netinet/ip_dummynet.c:config_pipe().
Interrupts are not blocked when changing the params of an existing
pipe.  The specific crash observed:

... -> config_pipe -> set_fs_parms -> config_red

  malloc a new w_q_lookup table but take an interrupt before
  initializing it; the interrupt handler does:

... -> dummynet_io -> red_drops

  red_drops dereferences the uninitialized (zeroed) w_q_lookup table

>How-To-Repeat:
Change the characteristics of an active pipe frequently.
     
>Fix:
In ip_dummynet.c:config_pipe(), in the not-a-new-pipe case, protect pipe/queue manipulations (primarily the call to set_fs_parms) with splimp().
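
For illustration only, here is a minimal user-space sketch of the ordering
problem described above. It is not the dummynet code: struct flow_set,
build_lookup(), publish_lookup() and LOOKUP_DEPTH are hypothetical names, and
it shows a different technique from the splimp() fix (fill the table
completely, then store the pointer). In the kernel, the pointer store and the
release of the old table would still need to be protected against the
interrupt path.

```c
#include <assert.h>
#include <stdlib.h>

#define LOOKUP_DEPTH 8          /* hypothetical table size */

struct flow_set {               /* hypothetical stand-in, not struct dn_flow_set */
    unsigned *w_q_lookup;       /* table read by the packet path */
};

/* Allocate a table and fill every entry before anyone else can see it. */
static unsigned *build_lookup(unsigned first, unsigned step)
{
    unsigned *t = malloc(LOOKUP_DEPTH * sizeof(*t));
    if (t == NULL)
        return NULL;
    for (int i = 0; i < LOOKUP_DEPTH; i++)
        t[i] = first + (unsigned)i * step;
    return t;
}

/* Swap the finished table in; the old one is returned for the caller to free. */
static unsigned *publish_lookup(struct flow_set *fs, unsigned *fresh)
{
    assert(fs != NULL && fresh != NULL);
    unsigned *old = fs->w_q_lookup;
    fs->w_q_lookup = fresh;
    return old;
}
```

With this ordering a reader sees either the old table or a fully filled new
one, never a zeroed one.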

>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->luigi 
Responsible-Changed-By: luigi 
Responsible-Changed-When: Sat Jul 13 14:40:00 PDT 2002 
Responsible-Changed-Why:  
dummynet stuff 

http://www.freebsd.org/cgi/query-pr.cgi?pr=37573 

From: David Spindler <spindler@mail.utexas.edu>
To: freebsd-gnats-submit@FreeBSD.org, <darius@dons.net.au>,
	<mike@cs.utah.edu>
Cc:  
Subject: Re: kern/37573: kernel crashes when changing dummynet pipe
 characteristics
Date: Wed, 30 Oct 2002 16:52:46 -0600 (CST)

 I'm having a very similar issue. We have several FreeBSD machines we use on
 public networks to rate-limit users. They seem to freeze every couple of days.
 When frozen, the DDB stack trace shows the processor stuck in dummynet.
 
 IE:
 softclock +0xd1
 dummynet+0x97
 ready_event +0x11e
 
 I've followed the suggestions in this bug report and splimp() protected
 set_fs_parms, but that doesn't seem to help. I can't obtain a core dump, because
 even with a panic/continue from DDB it won't leave dummynet. Any suggestions
 would be helpful, from fixing the problem to forcing a crash so I can debug the
 code with gdb.
 
 --David
 
 

From: Mike Hibler <mike@flux.utah.edu>
To: darius@dons.net.au, freebsd-gnats-submit@FreeBSD.org,
	mike@cs.utah.edu, spindler@mail.utexas.edu
Cc:  
Subject: Re: kern/37573: kernel crashes when changing dummynet pipe characteristics
Date: Wed, 30 Oct 2002 16:33:13 -0700 (MST)

 > Date: Wed, 30 Oct 2002 16:52:46 -0600 (CST)
 > From: David Spindler <spindler@mail.utexas.edu>
 > To: freebsd-gnats-submit@FreeBSD.org, <darius@dons.net.au>, <mike@cs.utah.edu>
 > Subject: Re: kern/37573: kernel crashes when changing dummynet pipe
 >  characteristics
 >
 > I'm having a very similar issue. We have several FreeBSD machines we use on
 > public networks to rate-limit users. They seem to freeze every couple of days.
 > When frozen, the DDB stack trace shows the processor stuck in dummynet.
 >
 > IE:
 > softclock +0xd1
 > dummynet+0x97
 > ready_event +0x11e
 >
 > I've followed the suggestions in this bug report and splimp() protected
 > set_fs_parms, but that doesn't seem to help. I can't obtain a core dump, because
 > even with a panic/continue from DDB it won't leave dummynet. Any suggestions
 > would be helpful, from fixing the problem to forcing a crash so I can debug the
 > code with gdb.
 >
 > --David
 >
 >
 
 You are likely seeing another problem I have seen (but haven't PR'ed
 because I haven't had time to track down exactly what is going on).
 I have included a message I sent to Luigi a while back but unfortunately,
 I have not tracked this any further and without further info, there isn't
 much he can do.
 
 To see if this is affecting you, check in the debugger if q->numbytes
 is negative where q is the dn_flow_queue passed into ready_event().
 If so, this causes an infinite loop at soft interrupt time as described
 in the message.  The conservative workaround we use is to clear numbytes
 whenever a config_pipe() call is made (note that the spl fixes and a couple
 of formatting changes are in this as well):
 ====================
 
 *** ip_dummynet.c	2002/10/16 19:30:35	1.1
 --- ip_dummynet.c	2002/10/17 20:41:57	1.2
 ***************
 *** 1510,1516 ****
   static int 
   config_pipe(struct dn_pipe *p)
   {
 !     int s ;
       struct dn_flow_set *pfs = &(p->fs);
   
       /*
 --- 1510,1516 ----
   static int 
   config_pipe(struct dn_pipe *p)
   {
 !     int s = 0;
       struct dn_flow_set *pfs = &(p->fs);
   
       /*
 ***************
 *** 1543,1556 ****
   	     */
   	    x->idle_heap.size = x->idle_heap.elements = 0 ;
   	    x->idle_heap.offset=OFFSET_OF(struct dn_flow_queue, heap_pos);
 ! 	} else
   	    x = b;
   
 ! 	    x->bandwidth = p->bandwidth ;
   	x->numbytes = 0; /* just in case... */
   	bcopy(p->if_name, x->if_name, sizeof(p->if_name) );
   	x->ifp = NULL ; /* reset interface ptr */
 ! 	    x->delay = p->delay ;
   	set_fs_parms(&(x->fs), pfs);
   
   
 --- 1543,1567 ----
   	     */
   	    x->idle_heap.size = x->idle_heap.elements = 0 ;
   	    x->idle_heap.offset=OFFSET_OF(struct dn_flow_queue, heap_pos);
 ! 	} else {
 ! 	    struct dn_flow_queue *q;
 ! 	    int i;
 ! 
   	    x = b;
 + 	    s = splimp(); /* protect mods to active pipe/flow set */
 + 
 + 	    /* flush accumulated credit for all queues */
 + 	    for (i = 0 ; i <= x->fs.rq_size ; i++ )
 + 		for (q = x->fs.rq[i] ; q ; q = q->next ) {
 + 		    q->numbytes = 0;
 + 		}
 + 	}
   
 ! 	x->bandwidth = p->bandwidth ;
   	x->numbytes = 0; /* just in case... */
   	bcopy(p->if_name, x->if_name, sizeof(p->if_name) );
   	x->ifp = NULL ; /* reset interface ptr */
 ! 	x->delay = p->delay ;
   	set_fs_parms(&(x->fs), pfs);
   
   
 ***************
 *** 1566,1573 ****
   		all_pipes = x ;
   	    else
   		a->next = x ;
 - 	    splx(s);
   	}
       } else { /* config queue */
   	struct dn_flow_set *x, *a, *b ;
   
 --- 1577,1584 ----
   		all_pipes = x ;
   	    else
   		a->next = x ;
   	}
 + 	splx(s);
       } else { /* config queue */
   	struct dn_flow_set *x, *a, *b ;
   
 ***************
 *** 1595,1600 ****
 --- 1606,1612 ----
   	    if (pfs->parent_nr != 0 && b->parent_nr != pfs->parent_nr)
   		return EINVAL ;
   	    x = b;
 + 	    s = splimp(); /* protect mods to active pipe/flow set */
   	}
   	set_fs_parms(x, pfs);
   
 ***************
 *** 1610,1617 ****
   		all_flow_sets = x;
   	    else
   		a->next = x;
 - 	    splx(s);
   	}
       }
       return 0 ;
   }
 --- 1622,1629 ----
   		all_flow_sets = x;
   	    else
   		a->next = x;
   	}
 + 	splx(s);
       }
       return 0 ;
   }
 
 ====================
 A more aggressive fix, which I am not yet convinced is correct, is to
 clear numbytes whenever there are no more packets queued in ready_event(),
 i.e., in the else clause that includes the comment:
 	/* RED needs to know when the queue becomes empty */
 add a:
 	q->numbytes = 0;
 
 Hope this helps!
 
 Mike
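
A rough model of the credit problem, for readers who want to see the
arithmetic. It assumes that credit grows as idle ticks times bandwidth and
that the real counter is a signed 32-bit int; both are assumptions made for
this sketch, and struct queue_model, accrue(), fits_int32() and reconfigure()
are hypothetical names, not dummynet functions.

```c
#include <assert.h>
#include <stdint.h>

struct queue_model {            /* hypothetical, not struct dn_flow_queue */
    int64_t credit;             /* kept wide so the sketch itself cannot overflow */
};

/* Add credit for idle ticks at the given bandwidth (bits per second). */
static void accrue(struct queue_model *q, int64_t ticks, int64_t bandwidth)
{
    assert(ticks >= 0 && bandwidth >= 0);
    q->credit += ticks * bandwidth;
}

/* Would the stored credit still fit in a signed 32-bit counter? */
static int fits_int32(const struct queue_model *q)
{
    return q->credit <= INT32_MAX;
}

/* The conservative workaround: drop stored credit on reconfiguration. */
static void reconfigure(struct queue_model *q)
{
    q->credit = 0;
}
```

Under these assumptions, at 100000000 bit/s the model stays within 32 bits for
21 ticks of unspent credit and exceeds it on the 22nd; clearing the credit at
reconfiguration time avoids carrying it over to the new bandwidth.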
 
 ====================
 > Date: Mon, 29 Apr 2002 13:30:33 -0600 (MDT)
 > From: Mike Hibler <mike@fast.cs.utah.edu>
 > To: luigi@iet.unipi.it
 > Subject: bug in dummynet code + possible fixes
 > Cc: mike@fast.cs.utah.edu
 > 
 > I submitted a FreeBSD bug report earlier for a relatively simple-to-fix
 > race condition in dummynet, but we also have a more complex problem.
 > The problem is that a FreeBSD 4.3 kernel appears to hang when changing the
 > characteristics of a dummynet pipe.  This also happens in 4.5 and I've
 > checked the "current" sources and there have been no obvious changes that
 > would affect this.
 > 
 > The setup:
 > 
 > [ See www.emulab.net for the overall context of what we are doing. ]
 > 
 > Two machines connected by a third, application-transparent "delay" node.
 > The delay node runs dummynet and bridging code and is used to create a
 > full-duplex link with specific bandwidth/delay/plr characteristics between
 > the other two machines.  So the delay node starts with two pipes set up:
 > 
 >   ipfw add pipe 100 ip from any to any in recv fxp1  # in
 >   ipfw add pipe 110 ip from any to any in recv fxp4  # out
 > 
 > We then open them up for 100Mb/s traffic:
 > 
 >   ipfw pipe 100 config delay 0ms bw 100000Kbit/s plr 0.000 queue 50
 >   ipfw pipe 110 config delay 0ms bw 100000Kbit/s plr 0.000 queue 50
 > 
 > and start a flood ping from one machine to the other.  Anywhere from
 > 10 seconds to a minute later, with the ping running, we drop the
 > bandwidth to 10Mb/s:
 > 
 >   ipfw pipe 100 config delay 0ms bw 10000Kbit/s plr 0.000 queue 50
 >   ipfw pipe 110 config delay 0ms bw 10000Kbit/s plr 0.000 queue 50
 > 
 > Quite reliably, the delay (dummynet) node will hang.
 > 
 > The problem:
 > 
 > The kernel ends up looping in the dummynet() softclock handler trying
 > to process an event at the head of the ready_heap.  Hard interrupts are
 > not disabled, so the console echoes keystrokes, but network protocol processing
 > is not happening, so the machine will not ping and appears dead.
 > 
 > The infinite loop is triggered by the dn_flow_queue "numbytes" field going
 > negative.  The scenario is that dummynet() sees an event on the ready_heap
 > that should be triggered and calls ready_event().  ready_event() increments
 > q->numbytes causing it to "wrap" and go negative.  Now the code sees
 > that the queue has insufficient credit to send the packet, and doesn't.
 > Instead it reschedules it in the heap, but because of the negative value
 > of numbytes, it effectively reschedules the event to happen in the past
 > (technically, we wind up with a very large unsigned value, the dn_key, but
 > the actual value comparison, DN_KEY_LEQ, is signed and we wind up negative
 > again).  So we return to dummynet(), it sees an event still on the queue
 > whose time has passed, and it calls ready_event() again to process it.  But,
 > because of the negative value, we still have no credit, so it gets
 > rescheduled in the past again, and on and on.
 > 
 > The question is why does numbytes go negative?  It doesn't appear to be
 > a case of it getting clobbered by something, it just seems to grow rapidly
 > (or never get depleted) until it wraps.  And the behavior is triggered by
 > adjusting the pipe bandwidth downward.  I'm guessing it has to do with the
 > pipe bandwidth being a factor in the value of numbytes, thus when bandwidth
 > gets scaled down by a factor of 10, we now effectively have 10 times as
 > much credit at the new bandwidth.
 > 
 > At the heart of the problem is the fact that numbytes never gets reset
 > (the dn_pipe numbytes field does, but not the dn_flow_queue field used
 > by ready_event).  My first thought was to reset the value at config_pipe
 > time, and that seems to fix the problem.  But others here think that we
 > really want to reset the value whenever the pipe is idle (i.e., when
 > the queue is removed from the heap) so that you do not accumulate credit
 > for idle time.  But none of us understand the inner workings of dummynet
 > enough to say for sure, and would defer to you on how best to address
 > this problem.
 > 
 > Let me know if you need more info.
 > 
 > Mike
 > 
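
The "rescheduled in the past" effect described in the message above can be
sketched in user space as follows. This is only a model: model_key, key_leq()
and reschedule() are hypothetical names, and key_leq() is written from the
description of DN_KEY_LEQ as a signed comparison on an unsigned key, not
copied from ip_dummynet.h.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t model_key;     /* hypothetical stand-in for dn_key */

/* Signed "a <= b" on unsigned keys, modeled on the description of DN_KEY_LEQ. */
static int key_leq(model_key a, model_key b)
{
    return (int64_t)(a - b) <= 0;
}

/* Key for a rescheduled event: now plus a wait that may be negative. */
static model_key reschedule(model_key now, int64_t wait_ticks)
{
    assert(now < (UINT64_C(1) << 62));  /* keep the model far from wraparound */
    return now + (model_key)wait_ticks;
}
```

A negative wait yields a very large unsigned key when now is small, yet the
signed comparison still treats it as already due, so a loop that keeps
processing due events never gets past it.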
State-Changed-From-To: open->patched 
State-Changed-By: maxim 
State-Changed-When: Fri Mar 28 03:39:26 PST 2003 
State-Changed-Why:  
Fixed, thank you very much, your problem report is great. 

src/sys/netinet/ip_dummynet.c rev. 1.62 
src/sys/netinet/ip_dummynet.h rev. 1.26 

Will MFC to -STABLE in six weeks. 


Responsible-Changed-From-To: luigi->maxim 
Responsible-Changed-By: maxim 
Responsible-Changed-When: Fri Mar 28 03:39:26 PST 2003 
Responsible-Changed-Why:  
MFC reminder. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=37573 
State-Changed-From-To: patched->closed 
State-Changed-By: maxim 
State-Changed-When: Tue May 13 02:32:00 PDT 2003 
State-Changed-Why:  
Fixed in -STABLE as well. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=37573 
>Unformatted:
