From joost@jodocus.org  Fri Aug 18 12:17:04 2006
Return-Path: <joost@jodocus.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id A16E616A4DD
	for <FreeBSD-gnats-submit@freebsd.org>; Fri, 18 Aug 2006 12:17:04 +0000 (UTC)
	(envelope-from joost@jodocus.org)
Received: from bps.jodocus.org (a198193.upc-a.chello.nl [62.163.198.193])
	by mx1.FreeBSD.org (Postfix) with ESMTP id D483A43D49
	for <FreeBSD-gnats-submit@freebsd.org>; Fri, 18 Aug 2006 12:17:03 +0000 (GMT)
	(envelope-from joost@jodocus.org)
Received: from jodocus.org (localhost [127.0.0.1])
	by bps.jodocus.org (8.13.6/8.13.6) with ESMTP id k7ICH2pq025869
	for <FreeBSD-gnats-submit@freebsd.org>; Fri, 18 Aug 2006 14:17:02 +0200 (CEST)
	(envelope-from joost@jodocus.org)
Received: (from joost@localhost)
	by jodocus.org (8.13.6/8.13.6/Submit) id k7ICH1F4025868;
	Fri, 18 Aug 2006 14:17:01 +0200 (CEST)
	(envelope-from joost)
Message-Id: <200608181217.k7ICH1F4025868@jodocus.org>
Date: Fri, 18 Aug 2006 14:17:01 +0200 (CEST)
From: Joost Bekkers <joost@jodocus.org>
Reply-To: Joost Bekkers <joost@jodocus.org>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: dhclient stops working, 100% cpu and logs at ~4000 lines/sec
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         102226
>Category:       bin
>Synopsis:       dhclient stops working, 100% cpu and logs at ~4000 lines/sec
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    brooks
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Aug 18 12:20:15 GMT 2006
>Closed-Date:    Fri Sep 29 03:06:40 GMT 2006
>Last-Modified:  Fri Sep 29 03:10:19 GMT 2006
>Originator:     Joost Bekkers
>Release:        FreeBSD 6.1-RELEASE i386
>Organization:
>Environment:
System: FreeBSD bps.jodocus.org 6.1-RELEASE FreeBSD 6.1-RELEASE #0: Sun May 14 21:49:16 CEST 2006 joost@bps.jodocus.org:/usr/src/sys/i386/compile/bps i386

alloc.c:__FBSDID("$FreeBSD: src/sbin/dhclient/alloc.c,v 1.1.1.1.2.1 2005/09/10 17:01:16 brooks Exp $");
bpf.c:__FBSDID("$FreeBSD: src/sbin/dhclient/bpf.c,v 1.2.2.3 2005/12/20 21:11:16 brooks Exp $");
clparse.c:__FBSDID("$FreeBSD: src/sbin/dhclient/clparse.c,v 1.1.1.1.2.1 2005/09/10 17:01:16 brooks Exp $");
conflex.c:__FBSDID("$FreeBSD: src/sbin/dhclient/conflex.c,v 1.1.1.1.2.1 2005/09/10 17:01:16 brooks Exp $");
convert.c:__FBSDID("$FreeBSD: src/sbin/dhclient/convert.c,v 1.1.1.1.2.1 2005/09/10 17:01:16 brooks Exp $");
dhclient.c:__FBSDID("$FreeBSD: src/sbin/dhclient/dhclient.c,v 1.6.2.4 2006/01/24 05:59:27 brooks Exp $");
dispatch.c:__FBSDID("$FreeBSD: src/sbin/dhclient/dispatch.c,v 1.1.1.1.2.1 2005/09/10 17:01:16 brooks Exp $");
errwarn.c:__FBSDID("$FreeBSD: src/sbin/dhclient/errwarn.c,v 1.1.1.1.2.1 2005/09/10 17:01:16 brooks Exp $");
hash.c:__FBSDID("$FreeBSD: src/sbin/dhclient/hash.c,v 1.1.1.1.2.1 2005/09/10 17:01:16 brooks Exp $");
inet.c:__FBSDID("$FreeBSD: src/sbin/dhclient/inet.c,v 1.1.1.1.2.1 2005/09/10 17:01:16 brooks Exp $");
options.c:__FBSDID("$FreeBSD: src/sbin/dhclient/options.c,v 1.1.1.1.2.1 2005/09/10 17:01:16 brooks Exp $");
packet.c:__FBSDID("$FreeBSD: src/sbin/dhclient/packet.c,v 1.1.1.1.2.1 2005/09/10 17:01:16 brooks Exp $");
parse.c:__FBSDID("$FreeBSD: src/sbin/dhclient/parse.c,v 1.2.2.1 2005/09/10 17:01:16 brooks Exp $");
privsep.c:__FBSDID("$FreeBSD: src/sbin/dhclient/privsep.c,v 1.1.1.1.2.1 2005/09/10 17:01:16 brooks Exp $");
tables.c:__FBSDID("$FreeBSD: src/sbin/dhclient/tables.c,v 1.1.1.1.2.2 2005/09/10 17:01:16 brooks Exp $");
tree.c:__FBSDID("$FreeBSD: src/sbin/dhclient/tree.c,v 1.1.1.1.2.1 2005/09/10 17:01:16 brooks Exp $");

>Description:
	
	After some time (anything from a day up to a week) dhclient starts logging

	N bad IP checksums seen in N packets

	at a rate of 4k lines/sec. Dhclient effectively stops functioning.
	The combined CPU load of dhclient and syslogd is 100%.

	The lease is not renewed and expires.
>How-To-Repeat:

	/sbin/dhclient <interface>

	and wait.

>Fix:


>Release-Note:
>Audit-Trail:

From: Brooks Davis <brooks@one-eyed-alien.net>
To: Joost Bekkers <joost@jodocus.org>
Cc: FreeBSD-gnats-submit@freebsd.org
Subject: Re: bin/102226: dhclient stops working, 100% cpu and logs at ~4000 lines/sec
Date: Fri, 18 Aug 2006 08:56:44 -0500

 > >How-To-Repeat:
 > 
 > 	/sbin/dhclient <interface>
 > 
 > 	and wait.
 
 Since this happens to virtually no one else (yours sounds a lot more
 repeatable than any cases I've heard previously), we'll need more
 information to have any chance of reproducing this.  Realistically it's
 not going to be possible for me to do it since it's probably hardware
 dependent (quite possibly a NIC + switch/modem combination).  What I'd
 like you to do is capture all DHCP traffic on the interface until
 this happens so I can look at the packet stream and see what's coming in
 and how we're mishandling it.  I think the following should do it:
 
 tcpdump -i <interface> -s 0 -w dhcp.pcap port 67 
 
 Once you've got that, compress it and put it somewhere I can download
 it.  Feel free to send the trace in private e-mail if you don't want the
 information to be public.
 
 -- Brooks

From: Joost Bekkers <joost@jodocus.org>
To: Brooks Davis <brooks@one-eyed-alien.net>
Cc: FreeBSD-gnats-submit@freebsd.org
Subject: Re: bin/102226: dhclient stops working, 100% cpu and logs at ~4000 lines/sec
Date: Sat, 19 Aug 2006 13:17:41 +0200

 On Fri, Aug 18, 2006 at 08:56:44AM -0500, Brooks Davis wrote:
 > > >How-To-Repeat:
 > > 
 > > 	/sbin/dhclient <interface>
 > > 
 > > 	and wait.
 > 
 > Since this happens to virtually no one else (yours sounds a lot more
 > repeatable than any cases I've heard previously), we'll need more
 > information to have any chance of reproducing this.  Realistically it's
 > not going to be possible for me to do it since it's probably hardware
 > dependent (quite possibly a NIC + switch/modem combination).  What I'd
 > like you to do is capture all DHCP traffic on the interface until
 > this happens so I can look at the packet stream and see what's coming in
 > and how we're mishandling it.  I think the following should do it:
 > 
 > tcpdump -i <interface> -s 0 -w dhcp.pcap port 67 
 > 
 > Once you've got that, compress it and put it somewhere I can download
 > it.  Feel free to send the trace in private e-mail if you don't want the
 > information to be public.
 > 
 
 We're in luck: it has already happened again, at Aug 19 10:26:25 local time.
 
 Capture and log are available at the site mentioned in private mail.
 
 -- 
 greetz Joost
 joost@jodocus.org

From: Joost Bekkers <joost@jodocus.org>
To: Brooks Davis <brooks@one-eyed-alien.net>
Cc: bug-followup@freebsd.org
Subject: Re: bin/102226: dhclient stops working, 100% cpu and logs at ~4000 lines/sec
Date: Mon, 18 Sep 2006 19:05:27 +0200

 I've had dhclient running without problems for 2 weeks now using a single modification:
 
 --- bpf.c.dist  Mon Sep 18 18:55:38 2006
 +++ bpf.c       Mon Sep 18 18:56:46 2006
 @@ -282,7 +282,7 @@
          */
         do {
                 /* If the buffer is empty, fill it. */
 -               if (interface->rbuf_offset == interface->rbuf_len) {
 +               if (interface->rbuf_offset >= interface->rbuf_len) {
                         length = read(interface->rfdesc, interface->rbuf,
                             interface->rbuf_max);
                         if (length <= 0)
 
 
 
 Yesterday I changed back to the original, and the problem occurred again.
 (Sorry, no tcpdump was running at the time.)
 
 Sep 17 20:22:15 bps dhclient[13559]: 6927 bad IP checksums seen in 13853 packets
 Sep 17 20:22:15 bps dhclient[13559]: 5 bad IP checksums seen in 5 packets
 Sep 17 20:22:45 bps last message repeated 742794 times
 Sep 17 20:24:46 bps last message repeated 3160822 times
 Sep 17 20:34:47 bps last message repeated 15502818 times
 
 
 gdb(1) got me the following:
 
 (gdb) p *interface
 $1 = {next = 0x0, hw_address = {htype = 1 '\001', hlen = 6 '\006', haddr = "\000`\bZB\t\000\000\000\000\000\000\000\000\000"}, 
   primary_address = {s_addr = 0}, name = "xl0", '\0' <repeats 12 times>, rfdesc = 9, wfdesc = 9, 
   rbuf = 0x8079000 "S\222\rEw\200\006", rbuf_max = 4096, rbuf_offset = 522, rbuf_len = 758, ifp = 0x806c140, client = 0x8072000, 
   noifmedia = 0, errors = 0, dead = 0, index = 1}
 
 interface->rbuf contains the following
 
 00000000  53 92 0d 45 77 80 06 00  68 01 00 00 68 01 00 00  |S..Ew...h...h...|
 00000010  12 00 ff ff ff ff ff ff  00 05 9a d3 f8 21 08 00  |.............!..|
 00000020  45 00 01 5a 5a 62 00 00  ff 11 d6 01 0a 2e 80 01  |E..ZZb..........|
 00000030  ff ff ff ff 00 43 00 44  01 46 00 00 02 01 06 00  |.....C.D.F......|
 00000040  03 98 e8 90 00 00 80 00  00 00 00 00 0a 2e ab bc  |................|
 00000050  d4 8e 27 84 0a 2e 80 01  00 50 94 bc 58 a2 00 00  |..'......P..X...|
 00000060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *
 00000120  00 00 00 00 00 00 00 00  63 82 53 63 35 01 02 36  |........c.Sc5..6|
 00000130  04 d4 8e 27 84 33 04 00  00 0e 10 01 04 ff ff f0  |...'.3..........|
 00000140  00 42 0e 32 31 32 2e 31  34 32 2e 33 39 2e 31 33  |.B.212.142.39.13|
 00000150  32 03 04 0a 2e a0 01 02  04 00 00 0e 10 04 04 d4  |2...............|
 00000160  8e 27 84 07 04 d4 8e 27  84 00 00 00 00 00 00 00  |.'.....'........|
 00000170  00 00 00 00 00 00 00 00  00 ff 00 00 53 92 0d 45  |............S..E|
 00000180  f0 80 06 00 68 01 00 00  68 01 00 00 12 00 ff ff  |....h...h.......|
 00000190  ff ff ff ff 00 05 9a d3  f8 21 08 00 45 00 01 5a  |.........!..E..Z|
 000001a0  5a 63 00 00 ff 11 d6 00  0a 2e 80 01 ff ff ff ff  |Zc..............|
 000001b0  00 43 00 44 01 46 00 00  02 01 06 00 30 61 7f 2b  |.C.D.F......0a.+|
 000001c0  00 00 80 00 00 00 00 00  0a 2e a6 16 d4 8e 27 84  |..............'.|
 000001d0  0a 2e 80 01 00 50 94 bc  5f 86 00 00 00 00 00 00  |.....P.._.......|
 000001e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *
 000002a0  00 00 00 00 63 82 53 63  35 01 02 36 04 d4 8e 27  |....c.Sc5..6...'|
 000002b0  84 33 04 00 00 0e 10 01  04 ff ff f0 00 42 0e 32  |.3...........B.2|
 000002c0  31 32 2e 31 34 32 2e 33  39 2e 31 33 32 03 04 0a  |12.142.39.132...|
 000002d0  2e a0 01 02 04 00 00 0e  10 04 04 d4 8e 27 84 07  |.............'..|
 000002e0  04 d4 8e 27 84 00 00 00  00 00 00 00 00 00 00 00  |...'............|
 000002f0  00 00 00 00 00 ff 63 68  65 6c 6c 6f 2e 6e 6c 1c  |......chello.nl.|
 00000300  04 ff ff ff ff 00 00 00  00 00 00 00 00 00 00 00  |................|
 00000310  00 00 00 00 00 ff 00 00  00 00 00 00 00 00 00 00  |................|
 00000320  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 00000330  d8 65 f4 44 eb 33 09 00  38 12 03 00 18 03 00 00  |.e.D.3..8.......|
 00000340  6c 6f 30 00 00 00 00 00  00 00 00 00 00 00 00 00  |lo0.............|
 00000350  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 
 -- 
 greetz Joost
 joost@jodocus.org
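
The hex dump above can be read as two capture records laid end to end. The following is a minimal sketch of that reading, not code from dhclient: the struct layout, the 4-byte alignment, and every `sketch_` name are assumptions made for illustration, based on the i386 system in the report. It decodes the first 18 bytes of the dump and shows that, with each record rounded up to the alignment, the offset after the second record would be 760, which is past the rbuf_len of 758 shown by gdb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the capture header fields as they appear in the
 * dump (32-bit little-endian host); this is not the real struct bpf_hdr. */
struct sketch_bpf_hdr {
	uint32_t tv_sec;   /* capture time, seconds */
	uint32_t tv_usec;  /* capture time, microseconds */
	uint32_t caplen;   /* bytes of packet data captured */
	uint32_t datalen;  /* original packet length */
	uint16_t hdrlen;   /* header length, including padding */
};

/* Read little-endian integers byte by byte so the sketch is host-independent. */
uint32_t sketch_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	    (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

uint16_t sketch_le16(const unsigned char *p)
{
	return (uint16_t)(p[0] | p[1] << 8);
}

struct sketch_bpf_hdr sketch_decode(const unsigned char *p)
{
	struct sketch_bpf_hdr h;

	h.tv_sec = sketch_le32(p);
	h.tv_usec = sketch_le32(p + 4);
	h.caplen = sketch_le32(p + 8);
	h.datalen = sketch_le32(p + 12);
	h.hdrlen = sketch_le16(p + 16);
	return h;
}

/* Round up to a multiple of 4, the alignment assumed here for i386. */
size_t sketch_wordalign(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/* Offset of the next record after consuming the one that starts at off. */
size_t sketch_next_offset(size_t off, const struct sketch_bpf_hdr *h)
{
	return sketch_wordalign(off + h->hdrlen + h->caplen);
}

/* Check the sketch against the first 18 bytes of the dump. */
void sketch_check_dump(void)
{
	static const unsigned char first[18] = {
		0x53, 0x92, 0x0d, 0x45, 0x77, 0x80, 0x06, 0x00,
		0x68, 0x01, 0x00, 0x00, 0x68, 0x01, 0x00, 0x00,
		0x12, 0x00
	};
	struct sketch_bpf_hdr h = sketch_decode(first);
	size_t after_first, after_second;

	assert(h.caplen == 360 && h.datalen == 360 && h.hdrlen == 18);

	/* 18 + 360 = 378, rounded up to 380 (0x17c), where the dump
	 * shows the second header starting. */
	after_first = sketch_next_offset(0, &h);
	assert(after_first == 0x17c);

	/* The second record in the dump has the same lengths. */
	after_second = sketch_next_offset(after_first, &h);
	assert(after_second == 760);
	assert(after_second > 758);	/* 758 is rbuf_len in the gdb output */
}
```

Calling `sketch_check_dump()` from a small test program exercises the arithmetic. Note that the gdb snapshot itself shows rbuf_offset = 522, so this sketch only illustrates how an offset can overshoot the length; it is not a trace of what dhclient was doing when gdb attached.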
State-Changed-From-To: open->patched 
State-Changed-By: brooks 
State-Changed-When: Tue Sep 26 01:02:21 UTC 2006 
State-Changed-Why:  
Patched in HEAD.  Sorry for taking so long to get to this. 


Responsible-Changed-From-To: freebsd-bugs->brooks 
Responsible-Changed-By: brooks 
Responsible-Changed-When: Tue Sep 26 01:02:21 UTC 2006 
Responsible-Changed-Why:  
Patched in HEAD.  Sorry for taking so long to get to this. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=102226 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: bin/102226: commit references a PR
Date: Tue, 26 Sep 2006 01:02:17 +0000 (UTC)

 brooks      2006-09-26 01:02:03 UTC
 
   FreeBSD src repository
 
   Modified files:
     sbin/dhclient        bpf.c 
   Log:
   It is possible for bpf to return a length such that:
   
           length != BPF_WORDALIGN(length)
   
   This means that it is possible for this to be true:
   
           interface->rbuf_offset > interface->rbuf_len
   
   Handle this case in the test for running out of packets.  While
   OpenBSD's solution of setting interface->rbuf_len to
   BPF_WORDALIGN(length) is safe due to the size of the buffer, I think
   this solution results in fewer hidden assumptions.
   
   This should fix the problem of dhclient running away and consuming 100%
   CPU.
   
   PR:             bin/102226
   Submitted by:   Joost Bekkers <joost at jodocus.org>
   MFC after:      3 days
   
   Revision  Changes    Path
   1.7       +1 -1      src/sbin/dhclient/bpf.c
 _______________________________________________
 cvs-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/cvs-all
 To unsubscribe, send any mail to "cvs-all-unsubscribe@freebsd.org"
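
The two approaches named in the commit log can be compared side by side. This is a hedged sketch rather than the committed code: the `sketch_` helpers are hypothetical, the alignment of 4 is assumed from the i386 system in this PR, and the values 758 and 4096 are taken from the gdb output earlier in the audit trail.

```c
#include <assert.h>
#include <stddef.h>

/* Round up to a multiple of 4 (assumed alignment for this sketch). */
size_t sketch_align4(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/* Original test: refill only when the offset lands exactly on the length. */
int sketch_buffer_empty_eq(size_t offset, size_t len)
{
	return offset == len;
}

/* Patched test: also refill when alignment pushed the offset past the length. */
int sketch_buffer_empty_ge(size_t offset, size_t len)
{
	return offset >= len;
}

void sketch_compare(void)
{
	size_t length = 758;			/* rbuf_len from the gdb output */
	size_t offset = sketch_align4(length);	/* where an aligned walk ends */

	assert(offset == 760);

	/* With the original test the buffer never looks empty. */
	assert(!sketch_buffer_empty_eq(offset, length));

	/* The patched test catches the overshoot. */
	assert(sketch_buffer_empty_ge(offset, length));

	/* Alternative described in the log: store the rounded-up length,
	 * after which the equality test works again. */
	assert(sketch_buffer_empty_eq(offset, sketch_align4(length)));

	/* The rounded length stays inside the buffer here because
	 * rbuf_max (4096) is already a multiple of the alignment. */
	assert(sketch_align4(4096) == 4096);
}
```

The last assertion is one reading of the log's remark that rounding the length is "safe due to the size of the buffer"; the `>=` comparison does not rely on that property of the buffer size.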
 
State-Changed-From-To: patched->closed 
State-Changed-By: brooks 
State-Changed-When: Fri Sep 29 03:06:23 UTC 2006 
State-Changed-Why:  
Merged to RELENG_6. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=102226 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: bin/102226: commit references a PR
Date: Fri, 29 Sep 2006 03:07:56 +0000 (UTC)

 brooks      2006-09-29 03:07:41 UTC
 
   FreeBSD src repository
 
   Modified files:        (Branch: RELENG_6)
     sbin/dhclient        bpf.c 
   Log:
   MFC rev 1.7
   
     It is possible for bpf to return a length such that:
   
            length != BPF_WORDALIGN(length)
   
     This means that it is possible for this to be true:
   
            interface->rbuf_offset > interface->rbuf_len
   
     Handle this case in the test for running out of packets.  While
     OpenBSD's solution of setting interface->rbuf_len to
     BPF_WORDALIGN(length) is safe due to the size of the buffer, I think
     this solution results in fewer hidden assumptions.
   
     This should fix the problem of dhclient running away and consuming 100%
     CPU.
   
   PR:             bin/102226
   Submitted by:   Joost Bekkers <joost at jodocus.org>
   Approved by:    re (ken)
   
   Revision  Changes    Path
   1.2.2.4   +1 -1      src/sbin/dhclient/bpf.c
 
>Unformatted:
