From dillon@flea.best.net  Sun Jul 26 16:40:17 1998
Received: from flea.best.net (root@flea.best.net [206.184.139.131])
          by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id QAA27540
          for <FreeBSD-gnats-submit@freebsd.org>; Sun, 26 Jul 1998 16:40:16 -0700 (PDT)
          (envelope-from dillon@flea.best.net)
Received: (from dillon@localhost)
	by flea.best.net (8.9.0/8.9.0/best.fl) id QAA26783;
	Sun, 26 Jul 1998 16:38:57 -0700 (PDT)
Message-Id: <199807262338.QAA26783@flea.best.net>
Date: Sun, 26 Jul 1998 16:38:57 -0700 (PDT)
From: Matt Dillon <dillon@best.net>
Reply-To: dillon@best.net
To: FreeBSD-gnats-submit@freebsd.org
Subject: splimp/splnet interrupt race in ip_drain
X-Send-Pr-Version: 3.2

>Number:         7403
>Category:       kern
>Synopsis:       splimp/splnet interrupt race in ip_drain
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Jul 26 16:40:01 PDT 1998
>Closed-Date:    Sun Jul 26 21:00:02 PDT 1998
>Last-Modified:  Sun Jul 26 21:00:41 PDT 1998
>Originator:     Matt Dillon
>Release:        FreeBSD 2.2.6-STABLE i386
>Organization:
Best Internet Communications, Inc.
>Environment:

	FreeBSD-stable / from CVS.  Crash during ip frag attack due
	to corrupted ipq in ip_slowtimo().

>Description:

    #17 0xf01c5393 in trap (frame={tf_es = 0x10, tf_ds = 0x10, tf_edi = 0x3e7, 
	  tf_esi = 0x25, tf_ebp = 0xefbfff84, tf_isp = 0xefbfff60, 
	  tf_ebx = 0xf16f6114, tf_edx = 0xf3aa7790, tf_ecx = 0x1, tf_eax = 0x0, 
	  tf_trapno = 0xc, tf_err = 0x0, tf_eip = 0xf0147534, tf_cs = 0x8, 
	  tf_eflags = 0x10202, tf_esp = 0xf01e82f0, tf_ss = 0xf01e84c4})
	at ../../i386/i386/trap.c:324
    #18 0xf0147534 in ip_slowtimo () at ../../netinet/ip_input.c:871
    #19 0xf0125fdf in pfslowtimo (arg=0x0) at ../../kern/uipc_domain.c:220
    #20 0xf010a23c in softclock () at ../../kern/kern_clock.c:715
    #21 0xf01bd997 in doreti_swi ()

>How-To-Repeat:

	This appears to be a relatively narrow hole.  The ipq chains can get
	corrupted; the particular ipq bucket is empty at the time of the crash:

	    #18 0xf0147534 in ip_slowtimo () at ../../netinet/ip_input.c:871
	    ../../netinet/ip_input.c:871: No such file or directory.
	    (kgdb) print ipq[i]
	    $20 = {
	      next = 0xf0202fac, 
	      prev = 0xf0202fac, 
	      ipq_ttl = 0x0, 
	      ipq_p = 0x0, 
	      ipq_id = 0x0, 
	      ipq_next = 0x0, 
	      ipq_prev = 0x0, 
	      ipq_src = {
		s_addr = 0x0
	      }, 
	      ipq_dst = {
		s_addr = 0x0
	      }
	    }

	BUT the ip_slowtimo() code believes it has a valid fp when, in fact,
	the fp has already been unlinked:

	    (kgdb) print fp
	    $21 = (struct ipq *) 0xf16f6114

	    (kgdb) print *fp
	    $22 = {
	      next = 0xf18f9b94, 
	      prev = 0x0, 

	    (kgdb) print fp->next
	    $23 = (struct ipq *) 0xf18f9b94
	    (kgdb) print fp->next->next
	    $24 = (struct ipq *) 0xf129eb14
	    (kgdb) print fp->next->next->next
	    $25 = (struct ipq *) 0xf1415994
	    (kgdb) print fp->next->next->next->next
	    $26 = (struct ipq *) 0xf1205c14
	    (kgdb) print fp->next->next->next->next->next
	    $27 = (struct ipq *) 0xf13a0814
	    (kgdb) print fp->next->next->next->next->next->next
	    $28 = (struct ipq *) 0xf1a26114
	    (kgdb) print fp->next->next->next->next->next->next->next
	    $29 = (struct ipq *) 0xf1645514
	    (kgdb) print fp->next->next->next->next->next->next->next->next
	    $30 = (struct ipq *) 0xf0202fac
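
	The pointer pattern above (prev cleared, next still leading back
	to the bucket head) is what a remque()-style unlink would leave
	behind.  A minimal user-space sketch, assuming the unlink clears
	only the back pointer; struct node and the queue_*() helpers are
	hypothetical stand-ins, not the kernel's struct ipq code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ipq: only the two queue links. */
struct node {
	struct node *next;
	struct node *prev;
};

/* Make an empty circular queue: the head points at itself. */
static void
queue_init(struct node *head)
{
	head->next = head->prev = head;
}

/* Link n in directly after pos. */
static void
queue_insert_after(struct node *pos, struct node *n)
{
	n->next = pos->next;
	n->prev = pos;
	pos->next->prev = n;
	pos->next = n;
}

/*
 * Unlink n.  The neighbours are patched around it and n->prev is
 * cleared, but n->next is left untouched (assumed behaviour, chosen
 * to match the kgdb output above).
 */
static void
queue_remove(struct node *n)
{
	assert(n->prev != NULL);	/* must still be on a queue */
	n->next->prev = n->prev;
	n->prev->next = n->next;
	n->prev = NULL;
}

/* Count hops along the next pointers from n until the head is reached. */
static int
hops_to_head(const struct node *n, const struct node *head)
{
	int hops = 0;

	while (n != head) {
		n = n->next;
		hops++;
	}
	return (hops);
}
```

	After every element has been removed the head is empty again, yet
	a walker still holding a pointer to a removed element can follow
	its stale next pointers all the way back to the head.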

>Fix:

    Looking through the code, the only case I can see where this can happen
    is if ip_drain() is called from m_retry().  ip_drain() can thus be called
    from a network interrupt (splimp()), which I believe splnet() does NOT
    mask.

    I changed all code that modifies the ip fragment queue or the route
    table at splnet() to run at splimp() instead.  I have not confirmed
    that this fixes the problem, but I believe it does.

    It is possible that this fix will also fix occasional route table
    corruption that I've seen on idiom.com, which uses gated heavily.  It
    seems phenomenally dangerous to mess around with the route table at only
    splnet().
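
    The suspected race can be modelled in user space.  This is only a toy
    sketch under the assumption stated above (a network hardware interrupt
    is held off at splimp() but not at splnet()); the SPL_* levels,
    spl_raise(), spl_restore(), hw_interrupt() and fake_ip_drain() are
    hypothetical names, not kernel interfaces:

```c
#include <assert.h>

/* Toy priority levels; the ordering is the only property that matters. */
enum { SPL_NONE = 0, SPL_NET = 1, SPL_IMP = 2 };

static int cur_level = SPL_NONE;	/* current interrupt priority level */
static int drain_pending;		/* interrupt held off by the mask */
static int drain_runs;			/* times the simulated drain ran */

/* Stand-in for the ip_drain() call reached through m_retry(). */
static void
fake_ip_drain(void)
{
	drain_runs++;
}

/* Raise the level and return the old one, like splnet()/splimp(). */
static int
spl_raise(int level)
{
	int old = cur_level;

	if (level > cur_level)
		cur_level = level;
	return (old);
}

/* Restore a saved level and run anything that was held off, like splx(). */
static void
spl_restore(int old)
{
	cur_level = old;
	if (drain_pending && cur_level < SPL_IMP) {
		drain_pending = 0;
		fake_ip_drain();
	}
}

/* A network hardware interrupt arrives; it is masked only at SPL_IMP. */
static void
hw_interrupt(void)
{
	if (cur_level >= SPL_IMP)
		drain_pending = 1;
	else
		fake_ip_drain();
}

/*
 * Simulated queue walk at the given level.  Returns how many times the
 * drain ran while the walk was still in progress.
 */
static int
walk_at(int level)
{
	int s, during;

	drain_runs = 0;
	s = spl_raise(level);
	hw_interrupt();		/* interrupt lands mid-walk */
	during = drain_runs;
	spl_restore(s);
	return (during);
}
```

    In this model walk_at(SPL_NET) sees the drain run in the middle of the
    walk, while walk_at(SPL_IMP) defers it until the level is restored.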

    NOTE!!! Please ignore the portions of the diff relating to LOOPBACK_ALLNET;
    they are not related to this bug report.


Only in .: LINK
diff -r -c LINK/in_rmx.c ./in_rmx.c
*** LINK/in_rmx.c	Thu Jun 20 08:41:23 1996
--- ./in_rmx.c	Sun Jul 26 16:24:13 1998
***************
*** 307,313 ****
  	arg.rnh = rnh;
  	arg.nextstop = time.tv_sec + rtq_timeout;
  	arg.draining = arg.updating = 0;
! 	s = splnet();
  	rnh->rnh_walktree(rnh, in_rtqkill, &arg);
  	splx(s);
  
--- 307,313 ----
  	arg.rnh = rnh;
  	arg.nextstop = time.tv_sec + rtq_timeout;
  	arg.draining = arg.updating = 0;
! 	s = splimp();	/* must be splimp() due to ip_drain() called from m_retry() -MATT */
  	rnh->rnh_walktree(rnh, in_rtqkill, &arg);
  	splx(s);
  
***************
*** 334,340 ****
  #endif
  		arg.found = arg.killed = 0;
  		arg.updating = 1;
! 		s = splnet();
  		rnh->rnh_walktree(rnh, in_rtqkill, &arg);
  		splx(s);
  	}
--- 334,340 ----
  #endif
  		arg.found = arg.killed = 0;
  		arg.updating = 1;
! 		s = splimp();	/* must be splimp() due to ip_drain() called from m_retry -MATT */
  		rnh->rnh_walktree(rnh, in_rtqkill, &arg);
  		splx(s);
  	}
***************
*** 355,361 ****
  	arg.nextstop = 0;
  	arg.draining = 1;
  	arg.updating = 0;
! 	s = splnet();
  	rnh->rnh_walktree(rnh, in_rtqkill, &arg);
  	splx(s);
  }
--- 355,361 ----
  	arg.nextstop = 0;
  	arg.draining = 1;
  	arg.updating = 0;
! 	s = splimp();	/* must be splimp() due to ip_drain() called from m_retry -MATT */
  	rnh->rnh_walktree(rnh, in_rtqkill, &arg);
  	splx(s);
  }
diff -r -c LINK/ip_input.c ./ip_input.c
*** LINK/ip_input.c	Tue Jun 30 23:58:28 1998
--- ./ip_input.c	Sun Jul 26 16:25:29 1998
***************
*** 55,60 ****
--- 55,63 ----
  #include <sys/sysctl.h>
  
  #include <net/if.h>
+ #ifdef LOOPBACK_ALLNET
+ #include <net/if_types.h>
+ #endif
  #include <net/if_dl.h>
  #include <net/route.h>
  #include <net/netisr.h>
***************
*** 383,388 ****
--- 386,395 ----
  
  		if (IA_SIN(ia)->sin_addr.s_addr == ip->ip_dst.s_addr)
  			goto ours;
+ #ifdef LOOPBACK_ALLNET
+ 		if (LOOPBACK_CIDR(ia, &ip->ip_dst))
+ 			goto ours;
+ #endif
  #ifdef BOOTP_COMPAT
  		if (IA_SIN(ia)->sin_addr.s_addr == INADDR_ANY)
  			goto ours;
***************
*** 851,857 ****
  ip_slowtimo()
  {
  	register struct ipq *fp;
! 	int s = splnet();
  	int i;
  
  	for (i = 0; i < IPREASS_NHASH; i++) {
--- 858,864 ----
  ip_slowtimo()
  {
  	register struct ipq *fp;
! 	int s = splimp();	/* must be splimp() due to ip_drain code -MATT*/
  	int i;
  
  	for (i = 0; i < IPREASS_NHASH; i++) {
***************
*** 871,877 ****
  }
  
  /*
!  * Drain off all datagram fragments.
   */
  void
  ip_drain()
--- 878,887 ----
  }
  
  /*
!  * Drain off all datagram fragments.  This can be called from
!  * m_retry() and therefore called from an interrupt, interrupting
!  * splnet routines.  This routine is called at splimp().  Thus
!  * anyone who messes with ipq or rtq must also run at splimp(). -MATT
   */
  void
  ip_drain()
>Release-Note:
>Audit-Trail:

From: David Greenman <dg@root.com>
To: dillon@best.net
Cc: FreeBSD-gnats-submit@FreeBSD.ORG
Subject: Re: kern/7403: splimp/splnet interrupt race in ip_drain 
Date: Sun, 26 Jul 1998 19:54:45 -0700

    I think the right solution to this problem is to not call m_reclaim() if
 M_DONTWAIT is set. Do you agree?
 
 -DG
 
 David Greenman
 Co-founder/Principal Architect, The FreeBSD Project

From: Matt Dillon <dillon@best.net>
To: David Greenman <dg@root.com>
Cc: FreeBSD-gnats-submit@FreeBSD.ORG
Subject: Re: kern/7403: splimp/splnet interrupt race in ip_drain 
Date: Sun, 26 Jul 1998 20:29:10 -0700 (PDT)

 :
 :   I think the right solution to this problem is to not call m_reclaim() if
 :M_DONTWAIT is set. Do you agree?
 :
 :-DG
 :
 :David Greenman
 :Co-founder/Principal Architect, The FreeBSD Project
 
     Yah, that makes more sense. 
 
     Refresh my memory please:  Is ip_input() called at splimp() or splnet() ?
     If it's called from an interrupt there are other race conditions beyond 
     m_retry() that we have to worry about.
 
 					-Matt
 
     Matthew Dillon  Engineering, HiWay Technologies, Inc. & BEST Internet
                     Communications.
     <dillon@best.net> (Please include portions of article in any response)

From: David Greenman <dg@root.com>
To: Matt Dillon <dillon@best.net>
Cc: FreeBSD-gnats-submit@FreeBSD.ORG
Subject: Re: kern/7403: splimp/splnet interrupt race in ip_drain 
Date: Sun, 26 Jul 1998 20:37:42 -0700

 >    Yah, that makes more sense. 
 >
 >    Refresh my memory please:  Is ip_input() called at splimp() or splnet() ?
 >    If it's called from an interrupt there are other race conditions beyond 
 >    m_retry() that we have to worry about.
 
    splnet(). Packets are only queued at splimp().
 
 -DG
 
 David Greenman
 Co-founder/Principal Architect, The FreeBSD Project
State-Changed-From-To: open->closed 
State-Changed-By: dg 
State-Changed-When: Sun Jul 26 21:00:02 PDT 1998 
State-Changed-Why:  
Believed to be fixed in rev 1.37 of uipc_mbuf.c. Thanks for the bug report. 
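
The alternative agreed on in the audit trail (do not call m_reclaim() when
M_DONTWAIT is set) can be sketched in user space as follows.  This is not
the rev 1.37 change itself, which is not reproduced here; the toy_*() names
and TOY_* flag values are hypothetical stand-ins for the mbuf allocator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags; the kernel's M_WAIT/M_DONTWAIT values are not used. */
enum { TOY_WAIT = 0, TOY_DONTWAIT = 1 };

static int pool_free;		/* buffers currently available */
static int reclaim_calls;	/* times the drain path was entered */
static char buffer;		/* single stand-in buffer */

/* Stand-in for m_reclaim(): asks protocols to drain, freeing one buffer. */
static void
toy_reclaim(void)
{
	reclaim_calls++;
	pool_free++;
}

/* Try to take a buffer from the pool; NULL when it is empty. */
static void *
toy_alloc(void)
{
	if (pool_free == 0)
		return (NULL);
	pool_free--;
	return (&buffer);
}

/*
 * Retry path taken after a failed allocation.  Callers that cannot
 * wait (typically interrupt context) skip the reclaim and just fail,
 * so the drain routines never run on top of code that is only
 * protected at splnet().
 */
static void *
toy_retry(int how)
{
	if (how == TOY_DONTWAIT)
		return (NULL);
	toy_reclaim();
	return (toy_alloc());
}
```

With this shape, a non-waiting caller gets a NULL back from an empty pool
and the drain path is entered only for callers that are allowed to wait.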
>Unformatted:
