From nobody@FreeBSD.org  Tue Nov 23 11:01:50 2010
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A82D81065679
	for <freebsd-gnats-submit@FreeBSD.org>; Tue, 23 Nov 2010 11:01:50 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from red.freebsd.org (unknown [IPv6:2001:4f8:fff6::22])
	by mx1.freebsd.org (Postfix) with ESMTP id 977878FC1A
	for <freebsd-gnats-submit@FreeBSD.org>; Tue, 23 Nov 2010 11:01:50 +0000 (UTC)
Received: from red.freebsd.org (localhost [127.0.0.1])
	by red.freebsd.org (8.14.4/8.14.4) with ESMTP id oANB1oVA096731
	for <freebsd-gnats-submit@FreeBSD.org>; Tue, 23 Nov 2010 11:01:50 GMT
	(envelope-from nobody@red.freebsd.org)
Received: (from nobody@localhost)
	by red.freebsd.org (8.14.4/8.14.4/Submit) id oANB1oaA096730;
	Tue, 23 Nov 2010 11:01:50 GMT
	(envelope-from nobody)
Message-Id: <201011231101.oANB1oaA096730@red.freebsd.org>
Date: Tue, 23 Nov 2010 11:01:50 GMT
From: pluknet <pluknet@gmail.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: ntpd on 8.1 loops on select() with EBADF
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         152525
>Category:       bin
>Synopsis:       ntpd(8) on 8.1 loops on select() with EBADF
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Nov 23 11:10:07 UTC 2010
>Closed-Date:    
>Last-Modified:  Tue Jan 11 11:10:08 UTC 2011
>Originator:     pluknet
>Release:        8.1-RELEASE-p1 amd64
>Organization:
>Environment:
FreeBSD host 8.1-RELEASE-p1 FreeBSD 8.1-RELEASE-p1 #4: Thu Sep 23 08:30:18 UTC 2010     root@host:/usr/obj/usr/src/sys/YYY amd64
>Description:
This has been observed repeatedly: ntpd spins at 100% CPU and ntpq gets no reply.

 2581 root             1 118    0 11736K  3532K CPU6    6  60.7H 100.00% ntpd

# ntpq -p
localhost: timed out, nothing received
***Request timed out

`/etc/rc.d/ntpd restart` makes it work again. What could be the reason? A trace of the looping process:

  2581 ntpd     CALL  select(0x420,0x7fffffffec20,0,0,0)
  2581 ntpd     RET   select -1 errno 9 Bad file descriptor
  2581 ntpd     CALL  stat(0x7fffffffe100,0x7fffffffe080)
  2581 ntpd     NAMI  "/usr/share/nls/C/libc.cat"
  2581 ntpd     RET   stat -1 errno 2 No such file or directory
  2581 ntpd     CALL  stat(0x7fffffffe100,0x7fffffffe080)
  2581 ntpd     NAMI  "/usr/share/nls/libc/C"
  2581 ntpd     RET   stat -1 errno 2 No such file or directory
  2581 ntpd     CALL  stat(0x7fffffffe100,0x7fffffffe080)
  2581 ntpd     NAMI  "/usr/local/share/nls/C/libc.cat"
  2581 ntpd     RET   stat -1 errno 2 No such file or directory
  2581 ntpd     CALL  stat(0x7fffffffe100,0x7fffffffe080)
  2581 ntpd     NAMI  "/usr/local/share/nls/libc/C"
  2581 ntpd     RET   stat -1 errno 2 No such file or directory
  2581 ntpd     CALL  clock_gettime(0xd,0x7fffffffd7e0)
  2581 ntpd     RET   clock_gettime 0
  2581 ntpd     CALL  getpid
  2581 ntpd     RET   getpid 2581/0xa15
  2581 ntpd     CALL  sendto(0x3,0x7fffffffd870,0x43,0,0,0)
  2581 ntpd     GIO   fd 3 wrote 67 bytes
       "<99>Nov 17 18:14:17 ntpd[2581]: select() error: Bad file descriptor"
  2581 ntpd     RET   sendto 67/0x43

>How-To-Repeat:
Wait several days until ntpd starts to loop in select().
>Fix:
Workaround: `/etc/rc.d/ntpd restart` makes it work again.

>Release-Note:
>Audit-Trail:

From: Sergey Kandaurov <pluknet@gmail.com>
To: bug-followup@FreeBSD.org, pluknet@gmail.com
Cc:  
Subject: Re: bin/152525: ntpd(8) on 8.1 loops on select() with EBADF
Date: Tue, 11 Jan 2011 14:03:29 +0300

 Some more details from further investigation.
 
 The problem reproduces iff more than 1000 IP addresses are assigned to an interface.
 I suspect select() misbehaves when the fd_set holds that many listening sockets.
 
 [unmodified ntpd as in 8]# sockstat | grep ntpd | grep '\:123' | wc -l
      999
 
 [unmodified ntpd as in 8]# top -bI | grep ntpd
  1478 root             1 113    0 11736K  3428K CPU3    3 429.7H 75.78% ntpd
 
 One possible fix is to update ntpd to the latest stable version,
 which supports listening on only a specified subset of sockets.
 
 So I've updated ntpd to 4.2.6p2 on one of the affected boxes,
 which reduced the number of listening sockets to 6.
 
 [ntpd 4.2.6]# sockstat | grep ntpd | grep '\:123' | wc -l
        6
 
 That seems to fix the reported issue.
 
 -- 
 wbr,
 pluknet
>Unformatted:
