From dpelleg@cs.cmu.edu  Sat Dec  8 04:45:54 2001
Return-Path: <dpelleg@cs.cmu.edu>
Received: from palraz.rem.cmu.edu (PALRAZ.REM.CMU.EDU [128.237.161.212])
	by hub.freebsd.org (Postfix) with ESMTP id 79D7A37B405
	for <FreeBSD-gnats-submit@freebsd.org>; Sat,  8 Dec 2001 04:45:52 -0800 (PST)
Received: from palraz.wburn (palraz [192.168.1.1])
	by palraz.rem.cmu.edu (8.11.6/8.11.4) with ESMTP id fB8Cjfl10632
	(using TLSv1/SSLv3 with cipher EDH-RSA-DES-CBC3-SHA (168 bits) verified NO)
	for <FreeBSD-gnats-submit@freebsd.org>; Sat, 8 Dec 2001 07:45:42 -0500 (EST)
	(envelope-from dpelleg@palraz.rem.cmu.edu)
Received: (from dpelleg@localhost)
	by palraz.wburn (8.11.6/8.11.6) id fB8Cjel13266;
	Sat, 8 Dec 2001 07:45:40 -0500 (EST)
	(envelope-from dpelleg)
Message-Id: <200112081245.fB8Cjel13266@palraz.wburn>
Date: Sat, 8 Dec 2001 07:45:40 -0500 (EST)
From: Dan Pelleg <peldan@yahoo.com>
Reply-To: Dan Pelleg <peldan@yahoo.com>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: [PATCH] incorrect handling of parent rules in ipfw
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         32600
>Category:       kern
>Synopsis:       [PATCH] incorrect handling of parent rules in ipfw
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    luigi
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Dec 08 04:50:00 PST 2001
>Closed-Date:    Mon Sep 02 18:53:27 PDT 2002
>Last-Modified:  Mon Sep 02 18:53:27 PDT 2002
>Originator:     Dan Pelleg
>Release:        FreeBSD 4.4-STABLE i386
>Organization:
>Environment:
System: FreeBSD p 4.4-STABLE FreeBSD 4.4-STABLE #0: Sat Nov 10 15:10:25 EST 2001 d@k:/usr/obj/usr/src/sys/P i386


>Description:

When I started to use limit rules, I noticed that ipfw was logging many
messages of the form:
OUCH! cannot remove rule, count 2

Inspection of ip_fw.c shows that this message is printed when a parent rule
with a nonzero count is found to have expired.

Here is one scenario in which this can legitimately happen:

lookup_dyn_parent() extends the parent's expiry by dyn_short_lifetime,
whereas add_dyn_rule() creates a rule with an expiry of time_second +
dyn_syn_lifetime.

So, when a second limit rule is created, the parent's expire field is not
extended by enough time to match this second child.

Luigi Rizzo confirmed this analysis, and on his advice I patched the kernel
to ignore the expire field altogether for the parent rules of limit rules.




Note that the (userland) ipfw still takes the expire field into account, as
exemplified by the following scenario:

Suppose you have a "limit" rule, a dynamic rule is created for it, and a
minute later 4 more dynamic rules are created for it, so the count is 5. Now
terminate the first connection. Suppose you have net.inet.ip.fw.*lifetime
set to huge values, as I do. After a while, the LIMIT rule for that
connection will expire, and "ipfw -d show" will no longer show it. However,
the count value of its parent is decreased only when the rule is actually
freed. So, in the meantime, "ipfw -d show" displays "PARENT 5" but only 4
rules that match it.

Another thing that can happen is for the PARENT rule to expire, and then
you see LIMIT rules listed but not their parent.

At that point I added a patch to ipfw to list PARENT rules with count > 0
regardless of their expire value.

(This still doesn't solve the problem of the parent's count being larger
than the number of children ipfw lists. I started solving this one by having
ipfw update the reference counter in the parent, and then realized that
ip_fw.c doesn't pass parent pointers, at least not usable ones. The patch
includes a comment about this.)





Now, running this patched code gave me kernel panics in add_dyn_rule(),
called from install_state(). I came up with this explanation for them:

Suppose you have plenty of rules (i.e., conn_limit of them) for a
parent, but they are all expired. This happens to me when I use Mozilla,
which opens dozens of connections, and then leave that window alone for a
while.

In the DYN_LIMIT case of install_state(), lookup_dyn_parent() finds the
parent, whose count turns out to be too large. EXPIRE_DYN_CHAIN is then
called; the count goes down to zero in the first pass, and the parent is
removed in the second. But install_state() is still holding a pointer to the
freed structure, and later passes it on to add_dyn_rule().

I fixed this by looking the parent up again after the expiry code has run,
which has solved the problem.
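
The stale-pointer problem and the fix can be modelled in userland as
follows. This is a minimal sketch under stated assumptions: the one-slot
toy_table stands in for the ipfw_dyn_v[] hash table, all toy_* names are
hypothetical, and the expiry function only mimics the two-pass behaviour
described above.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_parent {
	int count;			/* live + not-yet-freed children */
};

/* One-slot stand-in for the dynamic rule hash table (assumed layout). */
struct toy_table {
	struct toy_parent *parent;
	int expired_children;		/* children waiting to be freed */
};

/* Find the parent, creating it if needed; NULL only on allocation failure. */
static struct toy_parent *
toy_lookup_parent(struct toy_table *t)
{
	assert(t != NULL);
	if (t->parent == NULL)
		t->parent = calloc(1, sizeof(*t->parent));
	return t->parent;
}

/* Pass 0 frees expired children; pass 1 frees a parent left with none. */
static void
toy_expire_chain(struct toy_table *t)
{
	if (t->parent == NULL)
		return;
	t->parent->count -= t->expired_children;
	t->expired_children = 0;
	if (t->parent->count == 0) {
		free(t->parent);
		t->parent = NULL;
	}
}

/* Returns 0 if the new session is accepted, 1 if it is dropped. */
static int
toy_install_state(struct toy_table *t, int conn_limit)
{
	struct toy_parent *parent = toy_lookup_parent(t);

	if (parent == NULL)
		return 1;
	if (parent->count >= conn_limit) {
		toy_expire_chain(t);
		/* 'parent' may now dangle: look it up again before use. */
		parent = toy_lookup_parent(t);
		if (parent == NULL)
			return 1;
		if (parent->count >= conn_limit)
			return 1;
	}
	parent->count++;
	return 0;
}
```

Without the second toy_lookup_parent() call, the final parent->count++ would
write to freed memory whenever the expiry pass removed the parent, which
corresponds to the panic seen in add_dyn_rule().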
>How-To-Repeat:
 See above.

>Fix:

Apply the provided patch.

*** sys/netinet/ip_fw.c.orig	Sun Nov 18 18:29:23 2001
--- sys/netinet/ip_fw.c	Mon Nov 26 07:03:08 2001
***************
*** 649,655 ****
  	/* remove a refcount to the parent */				\
  	if (q->dyn_type == DYN_LIMIT)					\
  		q->parent->count--;					\
! 	DEB(printf("-- unlink entry 0x%08x %d -> 0x%08x %d, %d left\n", \
  		(q->id.src_ip), (q->id.src_port),			\
  		(q->id.dst_ip), (q->id.dst_port), dyn_count-1 ); )	\
  	if (prev != NULL)						\
--- 649,656 ----
  	/* remove a refcount to the parent */				\
  	if (q->dyn_type == DYN_LIMIT)					\
  		q->parent->count--;					\
! 	DEB(printf("-- unlink entry %p 0x%08x %d -> 0x%08x %d, %d left\n", \
! 		q,                                                      \
  		(q->id.src_ip), (q->id.src_port),			\
  		(q->id.dst_ip), (q->id.dst_port), dyn_count-1 ); )	\
  	if (prev != NULL)						\
***************
*** 694,710 ****
  	     * and possibly more in the future.
  	     */
  	    int zap = ( rule == NULL || rule == q->rule);
! 	    if (zap)
! 		zap = force || TIME_LEQ( q->expire , time_second );
  	    /* do not zap parent in first pass, record we need a second pass */
  	    if (q->dyn_type == DYN_LIMIT_PARENT) {
  		max_pass = 1; /* we need a second pass */
  		if (zap == 1 && (pass == 0 || q->count != 0) ) {
  		    zap = 0 ;
! 		    if (pass == 1) /* should not happen */
! 			printf("OUCH! cannot remove rule, count %d\n",
! 				q->count);
  		}
  	    }
  	    if (zap) {
  		UNLINK_DYN_RULE(prev, ipfw_dyn_v[i], q);
--- 695,718 ----
  	     * and possibly more in the future.
  	     */
  	    int zap = ( rule == NULL || rule == q->rule);
! 
  	    /* do not zap parent in first pass, record we need a second pass */
  	    if (q->dyn_type == DYN_LIMIT_PARENT) {
  		max_pass = 1; /* we need a second pass */
  		if (zap == 1 && (pass == 0 || q->count != 0) ) {
  			zap = 0 ;
! 			if (force && pass == 1) { /* should not happen */
! 				printf("OUCH! cannot remove rule %p 0x%08x %d -> 0x%08x %d, count %d, bucket %d\n",
! 				 q,
! 				 (q->id.src_ip), (q->id.src_port),
! 				 (q->id.dst_ip), (q->id.dst_port),
! 				 q->count,
! 				 i);
! 			}
  		}
+ 	    } else {
+ 		if (zap)
+ 			zap = force || TIME_LEQ( q->expire , time_second );
  	    }
  	    if (zap) {
  		    UNLINK_DYN_RULE(prev, ipfw_dyn_v[i], q);
***************
*** 882,891 ****
      r->next = ipfw_dyn_v[i] ;
      ipfw_dyn_v[i] = r ;
      dyn_count++ ;
!     DEB(printf("-- add entry 0x%08x %d -> 0x%08x %d, total %d\n",
         (r->id.src_ip), (r->id.src_port),
         (r->id.dst_ip), (r->id.dst_port),
!        dyn_count ); )
      return r;
  }
  
--- 890,901 ----
      r->next = ipfw_dyn_v[i] ;
      ipfw_dyn_v[i] = r ;
      dyn_count++ ;
!     DEB(printf("-- add entry %p 0x%08x %d -> 0x%08x %d, total %d, bucket %d\n",
! 	r,
  	(r->id.src_ip), (r->id.src_port),
  	(r->id.dst_ip), (r->id.dst_port),
! 	dyn_count,
! 	i); )
      return r;
  }
  
***************
*** 988,995 ****
--- 998,1017 ----
  	}
  	if (parent->count >= conn_limit) {
  	    EXPIRE_DYN_CHAIN(rule); /* try to expire some */
+         /*
+           The expiry might have removed the parent too.
+           We lookup again, which will re-create if necessary.
+         */
+ 	    parent = lookup_dyn_parent(&id, rule);
+ 	    if (parent == NULL) {
+           printf("add parent failed\n");
+           return 1;
+         }
  	    if (parent->count >= conn_limit) {
+           if (last_log != time_second) {
+             last_log = time_second ;
              printf("drop session, too many entries\n");
+           }
  		return 1;
  	    }
  	}
***************
*** 1929,1934 ****
--- 1951,1962 ----
  			    bcopy(p, dst, sizeof *p);
                              (int)dst->rule = p->rule->fw_number ;
  			    /*
+ 			     * we should really set the parent field
+ 			     * to the corresponding parent in the new
+ 			     * structure. For now, just set it to NULL.
+ 			     */
+ 			    dst->parent = NULL ;
+ 			    /*
  			     * store a non-null value in "next". The userland
  			     * code will interpret a NULL here as a marker
  			     * for the last dynamic rule.
*** sbin/ipfw/ipfw.c.orig	Sun Nov 18 18:42:51 2001
--- sbin/ipfw/ipfw.c	Sat Nov 24 12:18:27 2001
***************
*** 813,822 ****
  		    (struct ipfw_dyn_rule *)&rules[num];
  		struct in_addr a;
  		struct protoent *pe;
  
              printf("## Dynamic rules:\n");
              for (;; d++) {
! 		if (d->expire == 0 && !do_expired) {
  			if (d->next == NULL)
  				break;
  			continue;
--- 813,830 ----
  		    (struct ipfw_dyn_rule *)&rules[num];
  		struct in_addr a;
  		struct protoent *pe;
+ 		int skip;
  
  		printf("## Dynamic rules:\n");
  		for (;; d++) {
! 			/* determine whether to skip this rule */
! 			skip = !do_expired;
! 			if( d->dyn_type == DYN_LIMIT_PARENT) {
! 				skip = skip && d->count == 0;
! 			} else {
! 				skip = skip && d->expire == 0;                
! 			}
! 			if(skip) {
  				if (d->next == NULL)
  					break;
  				continue;
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->luigi 
Responsible-Changed-By: sheldonh 
Responsible-Changed-When: Mon Dec 10 03:22:34 PST 2001 
Responsible-Changed-Why:  
The attached patch was suggested by Mr Rizzo. 

http://www.FreeBSD.org/cgi/query-pr.cgi?pr=32600 

From: Dan Pelleg <peldan@yahoo.com>
To: freebsd-gnats-submit@FreeBSD.org
Cc:  
Subject: Re: kern/32600: [PATCH] incorrect handling of parent rules in ipfw
Date: Sun, 17 Feb 2002 18:40:20 -0500

 Following up to my own PR, I'm providing a patch that applies cleanly
 against 4.5-R. It doesn't differ from the previous one in any significant
 way (well, except for the fact that it applies cleanly).
 
 --- sys/netinet/ip_fw.c.orig	Mon Feb 11 06:39:33 2002
 +++ sys/netinet/ip_fw.c	Mon Feb 11 06:41:21 2002
 @@ -694,17 +694,23 @@
  	     * and possibly more in the future.
  	     */
  	    int zap = ( rule == NULL || rule == q->rule);
 -	    if (zap)
 -		zap = force || TIME_LEQ( q->expire , time_second );
 +
  	    /* do not zap parent in first pass, record we need a second pass */
  	    if (q->dyn_type == DYN_LIMIT_PARENT) {
  		max_pass = 1; /* we need a second pass */
  		if (zap == 1 && (pass == 0 || q->count != 0) ) {
  		    zap = 0 ;
 -		    if (pass == 1) /* should not happen */
 -			printf("OUCH! cannot remove rule, count %d\n",
 -				q->count);
 +			if (force && pass == 1) { /* should not happen */
 +				printf("OUCH! cannot remove rule 0x%08x %d -> 0x%08x %d, count %d, bucket %d\n",
 +				 (q->id.src_ip), (q->id.src_port),
 +				 (q->id.dst_ip), (q->id.dst_port),
 +				 q->count,
 +				 i);
 +			}
  		}
 +	    } else {
 +		if (zap)
 +			zap = force || TIME_LEQ( q->expire , time_second );
  	    }
  	    if (zap) {
  		UNLINK_DYN_RULE(prev, ipfw_dyn_v[i], q);
 @@ -885,7 +891,7 @@
      DEB(printf("-- add entry 0x%08x %d -> 0x%08x %d, total %d\n",
         (r->id.src_ip), (r->id.src_port),
         (r->id.dst_ip), (r->id.dst_port),
 -       dyn_count ); )
 +               dyn_count); )
      return r;
  }
  
 @@ -988,8 +994,20 @@
  	}
  	if (parent->count >= conn_limit) {
  	    EXPIRE_DYN_CHAIN(rule); /* try to expire some */
 +         /*
 +           The expiry might have removed the parent too.
 +           We lookup again, which will re-create if necessary.
 +         */
 + 	    parent = lookup_dyn_parent(&id, rule);
 + 	    if (parent == NULL) {
 +           printf("add parent failed\n");
 +           return 1;
 +        }
  	    if (parent->count >= conn_limit) {
 -		printf("drop session, too many entries\n");
 +          if (last_log != time_second) {
 +            last_log = time_second ;
 +            printf("drop session, too many entries\n");
 +          }
  		return 1;
  	    }
  	}
 @@ -1928,6 +1946,12 @@
  			for ( p = ipfw_dyn_v[i] ; p != NULL ; p = p->next, dst++ ) {
  			    bcopy(p, dst, sizeof *p);
                              (int)dst->rule = p->rule->fw_number ;
 +			    /*
 +			     * we should really set the parent field
 +			     * to the corresponding parent in the new
 +			     * structure. For now, just set it to NULL.
 +			     */
 +			    dst->parent = NULL ;
  			    /*
  			     * store a non-null value in "next". The userland
  			     * code will interpret a NULL here as a marker
 --- sbin/ipfw/ipfw.c.orig	Mon Feb 11 06:39:56 2002
 +++ sbin/ipfw/ipfw.c	Mon Feb 11 06:42:58 2002
 @@ -813,10 +813,18 @@
  		    (struct ipfw_dyn_rule *)&rules[num];
  		struct in_addr a;
  		struct protoent *pe;
 +		int skip;
  
              printf("## Dynamic rules:\n");
              for (;; d++) {
 -		if (d->expire == 0 && !do_expired) {
 +			/* determine whether to skip this rule */
 +			skip = !do_expired;
 +			if( d->dyn_type == DYN_LIMIT_PARENT) {
 +				skip = skip && d->count == 0;
 +			} else {
 +				skip = skip && d->expire == 0;                
 +			}
 +			if(skip) {
  			if (d->next == NULL)
  				break;
  			continue;
State-Changed-From-To: open->closed 
State-Changed-By: luigi 
State-Changed-When: Mon Sep 2 18:52:55 PDT 2002 
State-Changed-Why:  
this is fixed in ipfw2. 


http://www.freebsd.org/cgi/query-pr.cgi?pr=32600 
>Unformatted:
