From pavlin@catarina.usc.edu  Fri Mar 10 21:59:11 2000
Return-Path: <pavlin@catarina.usc.edu>
Received: from catarina.usc.edu (catarina.usc.edu [128.125.51.47])
	by hub.freebsd.org (Postfix) with ESMTP id 73BD637B86E
	for <FreeBSD-gnats-submit@freebsd.org>; Fri, 10 Mar 2000 21:59:07 -0800 (PST)
	(envelope-from pavlin@catarina.usc.edu)
Received: (from pavlin@localhost)
	by catarina.usc.edu (8.9.3/8.9.3) id VAA78295;
	Fri, 10 Mar 2000 21:59:30 -0800 (PST)
Message-Id: <200003110559.VAA78295@catarina.usc.edu>
Date: Fri, 10 Mar 2000 21:59:30 -0800 (PST)
From: Pavlin Ivanov Radoslavov <pavlin@catarina.usc.edu>
Reply-To: pavlin@catarina.usc.edu
To: FreeBSD-gnats-submit@freebsd.org
Cc: pavlin@catarina.usc.edu
Subject: NIS host name resolving may loop forever
X-Send-Pr-Version: 3.2

>Number:         17310
>Category:       kern
>Synopsis:       [nis] [patch] NIS host name resolving may loop forever
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    remko
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Mar 10 22:00:02 PST 2000
>Closed-Date:    Sun Mar 11 19:39:47 GMT 2007
>Last-Modified:  Sun Mar 11 19:39:47 GMT 2007
>Originator:     Pavlin Radoslavov
>Release:        FreeBSD 3.2-RELEASE i386 (but possibly true for -current as well)
>Organization:
USC, Dept. of CS
>Environment:

A FreeBSD NIS client. The NIS server is also FreeBSD.
Observed on FreeBSD 3.2, but I think the same problem exists in -current as well.

>Description:

	In some cases, resolving a host name may loop forever inside
	libc. After a NIS client sends a query to the NIS server, the
	server may issue a DNS query of its own to resolve the name.
	If that DNS query does not return a result promptly (neither
	success nor failure, i.e. it is left to expire), the NIS
	client's query may time out before the NIS server, which is
	still waiting for the DNS query to complete, sends any answer.
	The libc code does not handle this situation properly: it
	enters an infinite loop trying to resolve the host name.

	This is very unpleasant for programs like sendmail: a sendmail
	process blocked in this loop cannot deliver the email to the
	remaining recipients.

>How-To-Repeat:

	Compile and execute the following code. Note that the problem
	can be observed only if the chosen IP address is one for which
	"nslookup" takes at least 30-60 seconds to time out:

	#include <stdio.h>
	#include <stdlib.h>
	#include <sys/types.h>
	#include <sys/socket.h>
	#include <netinet/in.h>
	#include <arpa/inet.h>
	#include <netdb.h>


	int
	main(void)
	{
		struct in_addr in;
		struct hostent *hp;

		/*
		 * Nothing personal regarding address "200.230.88.4". It is
		 * just that my DNS query fails for this address with
		 * "Server failed" after approx. 30-60 seconds:
		 *
		 * pavlin@xanadu[11] nslookup 200.230.88.4
		 * Server:  catarina.usc.edu
		 * Address:  128.125.51.47
		 *
		 * *** catarina.usc.edu can't find 200.230.88.4: Server failed
		 */
		if (inet_aton("200.230.88.4", &in) == 0) {
			fprintf(stderr, "invalid address\n");
			exit(1);
		}

		/* With the bug present, this call never returns. */
		hp = gethostbyaddr((char *)&in, sizeof(in), AF_INET);
		if (hp) {
			printf("OK\n");
		} else {
			printf("FAILURE\n");
		}
		exit(0);
	}

	If the chosen IP address is appropriate, you should see the loop:
	pavlin@catarina[284] ./a.out 
	yp_match: clnt_call: RPC: Timed out
	yp_match: clnt_call: RPC: Timed out
	yp_match: clnt_call: RPC: Timed out
	yp_match: clnt_call: RPC: Timed out
	yp_match: clnt_call: RPC: Timed out
	...


>Fix:
	
	Apply the following patch to src/lib/libc/yp/yplib.c, then
	recompile libc and install it. Note that this only limits
	the infinite loop to 20 iterations. A better solution is to
	fix the NIS server to cache its recent results, so that after
	the client's first request times out, the second request hits
	the negative answer cached at the server and the client
	returns an error immediately instead of looping 20 times.
	Note also that yplib.c has a number of other places with the
	same potential for an infinite loop. Search for "again:" and
	add a similar counter there to break the loop.


--- yplib.c.org	Fri Mar  6 21:06:10 1998
+++ yplib.c	Fri Mar 10 16:17:02 2000
@@ -623,6 +623,7 @@
 	struct timeval tv;
 	struct ypreq_key yprk;
 	int r;
+	int retries = 0;
 
 	*outval = NULL;
 	*outvallen = 0;
@@ -657,6 +658,11 @@
 #endif
 
 again:
+	retries++;
+	if (retries > MAX_RETRIES) {
+	    xdr_free(xdr_ypresp_val, (char *)&yprv);
+	    return YPERR_YPERR;
+	}
 	if( _yp_dobind(indomain, &ysd) != 0)
 		return YPERR_DOMAIN;
 


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->wpaul 
Responsible-Changed-By: dirk 
Responsible-Changed-When: Thu Nov 9 04:21:51 PST 2000 
Responsible-Changed-Why:  
Over to Mr. yp. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=17310 

From: Ulrich Spoerlein <q@uni.de>
To: freebsd-gnats-submit@FreeBSD.org, pavlin@catarina.usc.edu
Cc:  
Subject: Re: misc/17310: NIS host name resolving may loop forever
Date: Thu, 30 Jan 2003 13:30:21 +0100

 I can confirm a similar bug in 4-STABLE. Here is my setup:
 
 coyote - NAT-Gateway, NIS/NFS Server, sendmail with SmartHost=mail.myisp.com, -STABLE
 roadrunner - Workstation, NIS/NFS Client, sendmail with SmartHost=coyote, -STABLE
 
 I was running into this bug when trying to get mutt to send via sendmail. I
 just recently switched to NIS and use NIS for passwd, group and hosts.
 
 Here is a snip from my /etc/hosts on coyote: (132....is the external IF)
 
 192.168.0.146  coyote           coyote.local
 192.168.0.147  roadrunner       roadrunner.local
 132.187.222.7  gb-007  gb-007.galgenberg.net coyote.dnsalias.net
 ..
 
 host.conf on roadrunner _included_ NIS. Trying to send mail with mutt
 results in an infinite loop. /var/log/messages on _coyote_ gets flooded
 with these messages:
 Jan 30 10:00:00 coyote ypserv[103]: res_mkquery failed
 Jan 30 10:00:30 coyote last message repeated 11 times
 Jan 30 10:02:30 coyote last message repeated 36 times
 Jan 30 10:12:32 coyote last message repeated 180 times
 ..
 
 Sending mail with sendmail directly produces these errors:
 yp_match: clnt_call: RPC: Timed out
 yp_match: clnt_call: RPC: Timed out
 yp_match: clnt_call: RPC: Timed out
 ..
 
 and ps aux shows me this line:
 send-mail: ./h0UBv88n011766 localhost.my.domain.: user open (sendmail)
 where localhost.my.domain is 127.0.0.1 of course
 
 Removing "nis" from /etc/host.conf (roadrunner) and restarting sendmail
 "fixes" this problem, but I'd rather use NIS for hosts too.
 
 I stumbled across bin/11666, which says it's a duplicate of bin/5444, but
 I'm not quite sure which PR matches this problem best.
Responsible-Changed-From-To: wpaul->freebsd-bugs 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Mon Oct 24 02:08:44 GMT 2005 
Responsible-Changed-Why:  
With bugmeister hat on, return to the general pool.  I think it is likely 
the current assignee lost interest in this PR a long time ago. 

State-Changed-From-To: open->feedback 
State-Changed-By: remko 
State-Changed-When: Sat Dec 30 16:36:43 UTC 2006 
State-Changed-Why:  
Hello, is this problem still relevant? 


Responsible-Changed-From-To: freebsd-bugs->remko 
Responsible-Changed-By: remko 
Responsible-Changed-When: Sat Dec 30 16:36:43 UTC 2006 
Responsible-Changed-Why:  
grab the pr 

State-Changed-From-To: feedback->closed 
State-Changed-By: remko 
State-Changed-When: Sun Mar 11 19:39:44 UTC 2007 
State-Changed-Why:  
The submitter mentions that he no longer has access to this setup so he 
cannot test this. Close the ticket, please give me feedback if this is 
still relevant. 

>Unformatted:
