From yar@comp.chem.msu.su  Sat May 12 12:24:02 2007
Return-Path: <yar@comp.chem.msu.su>
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 3872416A403
	for <FreeBSD-gnats-submit@freebsd.org>; Sat, 12 May 2007 12:24:02 +0000 (UTC)
	(envelope-from yar@comp.chem.msu.su)
Received: from jujik.ramtel.ru (jujik.ramtel.ru [81.19.64.112])
	by mx1.freebsd.org (Postfix) with ESMTP id B5E6313C46A
	for <FreeBSD-gnats-submit@freebsd.org>; Sat, 12 May 2007 12:24:01 +0000 (UTC)
	(envelope-from yar@comp.chem.msu.su)
Received: from jujik.ramtel.ru (localhost [127.0.0.1])
	by jujik.ramtel.ru (8.14.1/8.13.8) with ESMTP id l4CCO0J7068764
	for <FreeBSD-gnats-submit@freebsd.org>; Sat, 12 May 2007 16:24:00 +0400 (MSD)
	(envelope-from yar@comp.chem.msu.su)
Received: (from yar@localhost)
	by jujik.ramtel.ru (8.14.1/8.13.8/Submit) id l4CCO0uj068763;
	Sat, 12 May 2007 16:24:00 +0400 (MSD)
	(envelope-from yar@comp.chem.msu.su)
Message-Id: <200705121224.l4CCO0uj068763@jujik.ramtel.ru>
Date: Sat, 12 May 2007 16:24:00 +0400 (MSD)
From: Yar Tikhiy <yar@comp.chem.msu.su>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: Traffic via additional lo(4) interface shows up on lo0 in bpf(4)
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         112612
>Category:       kern
>Synopsis:       [lo] Traffic via additional lo(4) interface shows up on lo0 in bpf(4)
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    andre
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat May 12 12:30:15 GMT 2007
>Closed-Date:    
>Last-Modified:  Thu Sep  6 14:10:03 GMT 2007
>Originator:     Yar Tikhiy
>Release:        FreeBSD 7.0-CURRENT i386
>Organization:
none
>Environment:
System: FreeBSD jujik.ramtel.ru 7.0-CURRENT FreeBSD 7.0-CURRENT #0: Sun Apr 22 15:52:48 MSD 2007 root@jujik.ramtel.ru:/usr/src/sys/i386/compile/JTEST i386

>Description:
	tcpdump(1) on an additional loopback interface shows no
	traffic at all.  The traffic can be seen if tcpdump(1) runs
	on the primary loopback interface instead, which is usually
	lo0.

	I believe that the bug was introduced in rev. 1.111 of
	if_loop.c.  Due to that change, if_simloop() explicitly
	passes any loopback traffic to BPF on behalf of loif, i.e.,
	the primary loopback interface.

>How-To-Repeat:
	1. Create an additional lo(4) interface:
		ifconfig lo1 create
		ifconfig lo1 127.0.0.2
	2. Create ipfw rules to see that traffic actually goes via lo1:
		ipfw add 50 count icmp from any to any via lo0
		ipfw add 50 count icmp from any to any via lo1
	3. Start tcpdump on lo1:
		tcpdump -vpn -i lo1 icmp
	4. Ping via lo1:
		ping -c1 127.0.0.2
	5. See the counter increase on lo1, but no traffic in tcpdump.
	6. Repeat #3,4 running tcpdump on lo0 instead, see the missing
	   traffic showing up as though it flows via lo0.

>Fix:
	Not known yet, but a promising approach is to test IFF_LOOPBACK
	on the actual interface, ifp, and not override it with loif in
	the bpf_mtap2() call if the flag is set.
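
A minimal user-space sketch of that idea, assuming a stripped-down
stand-in for struct ifnet.  The names ifnet_model and bpf_tap_ifp are
hypothetical and exist only for illustration; this is not kernel code.

```c
#include <assert.h>
#include <stddef.h>

#define IFF_LOOPBACK	0x8	/* same value as in <net/if.h> */

/* Hypothetical, stripped-down stand-in for struct ifnet. */
struct ifnet_model {
	const char *name;
	int flags;
};

/*
 * Pick the interface whose BPF listeners should see a looped-back
 * packet: the interface itself if it is a loopback interface,
 * otherwise the primary loopback interface (loif).
 */
static const struct ifnet_model *
bpf_tap_ifp(const struct ifnet_model *ifp, const struct ifnet_model *loif)
{
	assert(ifp != NULL && loif != NULL);
	return ((ifp->flags & IFF_LOOPBACK) != 0 ? ifp : loif);
}
```

With lo1 passed as ifp the function returns lo1 itself, while a
non-loopback interface such as an Ethernet card still maps to loif.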
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-net 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Sat May 12 19:48:01 UTC 2007 
Responsible-Changed-Why:  
Over to maintainer(s). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=112612 

From: Cristian KLEIN <cristi@net.utcluj.ro>
To: bug-followup@FreeBSD.org,  yar@comp.chem.msu.su
Cc:  
Subject: Re: kern/112612: [lo] Traffic via additional lo(4) interface shows
 up on lo0 in bpf(4)
Date: Mon, 16 Jul 2007 17:18:33 +0300

 This is a multi-part message in MIME format.
 --------------070102080802050402090903
 Content-Type: text/plain; charset=ISO-8859-1; format=flowed
 Content-Transfer-Encoding: 7bit
 
 The following patch is against -CURRENT:
 
 cd /usr/src
 patch < if_loop.patch
 
 Recompile the kernel and it should work.
 
 --------------070102080802050402090903
 Content-Type: text/x-patch;
  name="if_loop.patch"
 Content-Transfer-Encoding: 7bit
 Content-Disposition: inline;
  filename="if_loop.patch"
 
 --- sys/net/if_loop.c.orig	2007-02-09 02:09:35.000000000 +0200
 +++ sys/net/if_loop.c	2007-07-16 17:02:31.438464106 +0300
 @@ -274,15 +274,15 @@
  			bpf_mtap(ifp->if_bpf, m);
  		}
  	} else {
 -		if (bpf_peers_present(loif->if_bpf)) {
 -			if ((m->m_flags & M_MCAST) == 0 || loif == ifp) {
 +		if (bpf_peers_present(ifp->if_bpf)) {
 +			if ((m->m_flags & M_MCAST) == 0 || (ifp->if_flags == IFF_LOOPBACK)) {
  				/* XXX beware sizeof(af) != 4 */
  				u_int32_t af1 = af;	
  
  				/*
  				 * We need to prepend the address family.
  				 */
 -				bpf_mtap2(loif->if_bpf, &af1, sizeof(af1), m);
 +				bpf_mtap2(ifp->if_bpf, &af1, sizeof(af1), m);
  			}
  		}
  	}
 
 --------------070102080802050402090903--
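
One detail of the patch above may deserve a second look: it compares
if_flags to IFF_LOOPBACK with "==" rather than masking with "&".  The
sketch below, using the flag values from <net/if.h> and two hypothetical
helper names, shows how the two tests can differ once an interface
carries other flags as well.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as defined in <net/if.h>. */
#define IFF_UP		0x1
#define IFF_LOOPBACK	0x8
#define IFF_RUNNING	0x40
#define IFF_MULTICAST	0x8000

/* Equality test: true only if IFF_LOOPBACK is the sole flag set. */
static bool
is_loopback_eq(int if_flags)
{
	return (if_flags == IFF_LOOPBACK);
}

/* Mask test: true whenever IFF_LOOPBACK is among the flags set. */
static bool
is_loopback_mask(int if_flags)
{
	return ((if_flags & IFF_LOOPBACK) != 0);
}
```

For a flag word such as IFF_UP | IFF_LOOPBACK | IFF_RUNNING |
IFF_MULTICAST the equality test is false and the mask test is true.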
Responsible-Changed-From-To: freebsd-net->andre 
Responsible-Changed-By: andre 
Responsible-Changed-When: Sat Jul 28 06:25:45 UTC 2007 
Responsible-Changed-Why:  
Take over. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=112612 

From: Yar Tikhiy <yar@comp.chem.msu.su>
To: Cristian KLEIN <cristi@net.utcluj.ro>
Cc: bug-followup@FreeBSD.org
Subject: Re: kern/112612: [lo] Traffic via additional lo(4) interface shows up on lo0 in bpf(4)
Date: Tue, 7 Aug 2007 13:34:15 +0400

 On Mon, Jul 16, 2007 at 05:18:33PM +0300, Cristian KLEIN wrote:
 > The following patch is against -CURRENT:
 > 
 > cd /usr/src
 > patch < if_loop.patch
 > 
 > Recompile the kernel and it should work.
 
 > --- sys/net/if_loop.c.orig	2007-02-09 02:09:35.000000000 +0200
 > +++ sys/net/if_loop.c	2007-07-16 17:02:31.438464106 +0300
 > @@ -274,15 +274,15 @@
 >  			bpf_mtap(ifp->if_bpf, m);
 >  		}
 >  	} else {
 > -		if (bpf_peers_present(loif->if_bpf)) {
 > -			if ((m->m_flags & M_MCAST) == 0 || loif == ifp) {
 > +		if (bpf_peers_present(ifp->if_bpf)) {
 > +			if ((m->m_flags & M_MCAST) == 0 || (ifp->if_flags == IFF_LOOPBACK)) {
 >  				/* XXX beware sizeof(af) != 4 */
 >  				u_int32_t af1 = af;	
 >  
 >  				/*
 >  				 * We need to prepend the address family.
 >  				 */
 > -				bpf_mtap2(loif->if_bpf, &af1, sizeof(af1), m);
 > +				bpf_mtap2(ifp->if_bpf, &af1, sizeof(af1), m);
 >  			}
 >  		}
 >  	}
 
 Your idea looks very good to me, but I'm afraid this particular
 implementation can break the well-known practice of all local
 non-loopback traffic appearing on lo0.  I.e., if my host has IP
 1.2.3.4 assigned to its Ethernet interface and I ping the local IP
 1.2.3.4, I'll see the ping packets on lo0, not on the Ethernet
 interface, in tcpdump.
 
 I think now I can extend your solution to keep the current practice
 intact.
 
 Thank you!
 
 -- 
 Yar

From: Andre Oppermann <andre@freebsd.org>
To: Yar Tikhiy <yar@comp.chem.msu.su>
Cc: bug-followup@freebsd.org
Subject: Re: kern/112612: [lo] Traffic via additional lo(4) interface shows
 up on lo0 in bpf(4)
Date: Wed, 08 Aug 2007 11:25:46 +0200

 Yar Tikhiy wrote:
 > The following reply was made to PR kern/112612; it has been noted by GNATS.
 > 
 > From: Yar Tikhiy <yar@comp.chem.msu.su>
 > To: Cristian KLEIN <cristi@net.utcluj.ro>
 > Cc: bug-followup@FreeBSD.org
 > Subject: Re: kern/112612: [lo] Traffic via additional lo(4) interface shows up on lo0 in bpf(4)
 > Date: Tue, 7 Aug 2007 13:34:15 +0400
 > 
 >  On Mon, Jul 16, 2007 at 05:18:33PM +0300, Cristian KLEIN wrote:
 >  > The following patch is against -CURRENT:
 >  > 
 >  > cd /usr/src
 >  > patch < if_loop.patch
 >  > 
 >  > Recompile the kernel and it should work.
 >  
 >  > --- sys/net/if_loop.c.orig	2007-02-09 02:09:35.000000000 +0200
 >  > +++ sys/net/if_loop.c	2007-07-16 17:02:31.438464106 +0300
 >  > @@ -274,15 +274,15 @@
 >  >  			bpf_mtap(ifp->if_bpf, m);
 >  >  		}
 >  >  	} else {
 >  > -		if (bpf_peers_present(loif->if_bpf)) {
 >  > -			if ((m->m_flags & M_MCAST) == 0 || loif == ifp) {
 >  > +		if (bpf_peers_present(ifp->if_bpf)) {
 >  > +			if ((m->m_flags & M_MCAST) == 0 || (ifp->if_flags == IFF_LOOPBACK)) {
 >  >  				/* XXX beware sizeof(af) != 4 */
 >  >  				u_int32_t af1 = af;	
 >  >  
 >  >  				/*
 >  >  				 * We need to prepend the address family.
 >  >  				 */
 >  > -				bpf_mtap2(loif->if_bpf, &af1, sizeof(af1), m);
 >  > +				bpf_mtap2(ifp->if_bpf, &af1, sizeof(af1), m);
 >  >  			}
 >  >  		}
 >  >  	}
 >  
 >  Your idea looks very good to me, but I'm afraid this particular
 >  implementation can break the well-known practice of all local
 >  non-loopback traffic appearing on lo0.  I.e., if my host has IP
 >  1.2.3.4 assigned to its Ethernet interface and I ping the local IP
 >  1.2.3.4, I'll see the ping packets on lo0, not on the Ethernet
 >  interface, in tcpdump.
 >  
 >  I think now I can extend your solution to keep the current practice
 >  intact.
 
 This patch works correctly as is.  Any traffic to local IP addresses
 is transited through lo0 and is seen there.  This patch doesn't change
 that.  This is based on loif.  You won't see traffic for lo1 and further
 cloned loopback interfaces on lo0 anymore, instead you'll see it on lo1.
 
 -- 
 Andre
 

From: Yar Tikhiy <yar@comp.chem.msu.su>
To: Andre Oppermann <andre@freebsd.org>
Cc: bug-followup@freebsd.org
Subject: Re: kern/112612: [lo] Traffic via additional lo(4) interface shows up on lo0 in bpf(4)
Date: Thu, 9 Aug 2007 13:00:57 +0400

 On Wed, Aug 08, 2007 at 11:25:46AM +0200, Andre Oppermann wrote:
 > 
 > This patch works correctly as is.  Any traffic to local IP addresses
 > is transited through lo0 and is seen there.  This patch doesn't change
 > that.  This is based on loif.  You won't see traffic for lo1 and further
 > cloned loopback interfaces on lo0 anymore, instead you'll see it on lo1.
 
 I could be wrong.  I used to think that traffic to a local IP gets
 looped back at the if_ethersubr level via if_simloop, but that doesn't
 seem to be true.  Now my guess is that the route to lo0 plays a role
 here, e.g.:
 
 ifconfig fxp0
 ...
         inet 1.2.3.4 ...
 
 netstat -rn
 ...
 1.2.3.4     00:12:34:56:78:9a  UHLW        1      128    lo0
 
 ip_output() follows the route, and therefore the packet to
 1.2.3.4 goes via lo0 instead of fxp0.
 
 Do you think I'm right?
 
 -- 
 Yar
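
If that guess is right, the choice of lo0 is made by the route lookup
alone.  A toy sketch of the idea, assuming a flat table of host routes
(the real kernel uses a radix tree; host_route and route_ifname are
hypothetical names):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy host-route entry. */
struct host_route {
	const char *dst;	/* destination address, as text */
	const char *ifname;	/* interface the route points at */
};

static const struct host_route routes[] = {
	{ "127.0.0.1",	"lo0" },
	{ "1.2.3.4",	"lo0" },	/* local address of fxp0 */
};

/* Return the outgoing interface for dst, or NULL if no host route. */
static const char *
route_ifname(const char *dst)
{
	size_t i;

	assert(dst != NULL);
	for (i = 0; i < sizeof(routes) / sizeof(routes[0]); i++)
		if (strcmp(routes[i].dst, dst) == 0)
			return (routes[i].ifname);
	return (NULL);
}
```

Under this model a packet to 1.2.3.4 leaves via lo0 even though the
address is configured on fxp0.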

From: Yar Tikhiy <yar@comp.chem.msu.su>
To: Andre Oppermann <andre@freebsd.org>
Cc: bug-followup@freebsd.org, Cristian KLEIN <cristi@net.utcluj.ro>
Subject: Re: kern/112612: [lo] Traffic via additional lo(4) interface shows up on lo0 in bpf(4)
Date: Fri, 24 Aug 2007 00:02:44 +0400

 On Wed, Aug 08, 2007 at 11:25:46AM +0200, Andre Oppermann wrote:
 > Yar Tikhiy wrote:
 > > On Mon, Jul 16, 2007 at 05:18:33PM +0300, Cristian KLEIN wrote:
 > > > The following patch is against -CURRENT:
 > > > 
 > > > cd /usr/src
 > > > patch < if_loop.patch
 > > > 
 > > > Recompile the kernel and it should work.
 > > 
 > > > --- sys/net/if_loop.c.orig	2007-02-09 02:09:35.000000000 +0200
 > > > +++ sys/net/if_loop.c	2007-07-16 17:02:31.438464106 +0300
 > > > @@ -274,15 +274,15 @@
 > > >  			bpf_mtap(ifp->if_bpf, m);
 > > >  		}
 > > >  	} else {
 > > > -		if (bpf_peers_present(loif->if_bpf)) {
 > > > -			if ((m->m_flags & M_MCAST) == 0 || loif == ifp) {
 > > > +		if (bpf_peers_present(ifp->if_bpf)) {
 > > > +			if ((m->m_flags & M_MCAST) == 0 || (ifp->if_flags == IFF_LOOPBACK)) {
 > > >  				/* XXX beware sizeof(af) != 4 */
 > > >  				u_int32_t af1 = af;	
 > > >  
 > > >  				/*
 > > >  				 * We need to prepend the address family.
 > > >  				 */
 > > > -				bpf_mtap2(loif->if_bpf, &af1, sizeof(af1), m);
 > > > +				bpf_mtap2(ifp->if_bpf, &af1, sizeof(af1), m);
 > > >  			}
 > > >  		}
 > > >  	}
 > > 
 > > Your idea looks very good to me, but I'm afraid this particular
 > > implementation can break the well-known practice of all local
 > > non-loopback traffic appearing on lo0.  I.e., if my host has IP
 > > 1.2.3.4 assigned to its Ethernet interface and I ping the local IP
 > > 1.2.3.4, I'll see the ping packets on lo0, not on the Ethernet
 > > interface, in tcpdump.
 > > 
 > > I think now I can extend your solution to keep the current practice
 > > intact.
 > 
 > This patch works correctly as is.  Any traffic to local IP addresses
 > is transited through lo0 and is seen there.  This patch doesn't change
 > that.  This is based on loif.  You won't see traffic for lo1 and further
 > cloned loopback interfaces on lo0 anymore, instead you'll see it on lo1.
 
 As I've just found, the patch from Cristian Klein effectively undoes
 if_loop.c#1.111: local IPv6 traffic to a local address on a
 non-loopback, e.g., Ethernet, interface cannot be seen via BPF.  More
 precisely, the local IPv6 packets show up on the Ethernet interface,
 but they appear totally broken due to BPF link-type mismatch, NULL
 vs. EN10MB.  The issue stems from nd6_output() deliberately calling
 if_simloop() with ifp pointing to the Ethernet interface.  I'm not
 ready to tell why nd6_output() does so, but if_simloop() has to
 allow for that for the time being.
 
 -- 
 Yar
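
The link-type mismatch can be illustrated in user space.  The sketch
below frames an IPv6 packet the DLT_NULL way (a 32-bit address family
in host byte order, then the packet) and then reads the same bytes as
if they were an Ethernet frame.  The helper names are hypothetical;
28 is the value of AF_INET6 on FreeBSD.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define AF_INET6_FREEBSD	28	/* AF_INET6 on FreeBSD */
#define ETHERTYPE_IPV6		0x86dd

/* Build a DLT_NULL frame: 32-bit AF in host byte order, then packet. */
static size_t
dlt_null_frame(uint8_t *out, size_t outlen, uint32_t af,
    const uint8_t *pkt, size_t pktlen)
{
	assert(outlen >= sizeof(af) + pktlen);
	memcpy(out, &af, sizeof(af));
	memcpy(out + sizeof(af), pkt, pktlen);
	return (sizeof(af) + pktlen);
}

/* Read the frame as DLT_NULL: the first 4 bytes are the AF. */
static uint32_t
dlt_null_af(const uint8_t *frame)
{
	uint32_t af;

	memcpy(&af, frame, sizeof(af));
	return (af);
}

/* Misread the frame as Ethernet: EtherType sits at offset 12. */
static uint16_t
ether_type_of(const uint8_t *frame)
{
	return ((uint16_t)((frame[12] << 8) | frame[13]));
}
```

Read as DLT_NULL, the frame yields the address family; read as
Ethernet, the "EtherType" is taken from the middle of the IPv6 header
and source address, so a decoder expecting EN10MB sees garbage.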

From: Yar Tikhiy <yar@comp.chem.msu.su>
To: Andre Oppermann <andre@freebsd.org>
Cc: bug-followup@freebsd.org, Cristian KLEIN <cristi@net.utcluj.ro>
Subject: Re: kern/112612: [lo] Traffic via additional lo(4) interface shows up on lo0 in bpf(4)
Date: Sat, 25 Aug 2007 18:48:05 +0400

 After some research, I've worked out the following patch.
 My findings and decisions are documented in the comments.
 Given how many rounds it took to fix this small piece
 of code, I added quite a lot of comments.  Please review.
 Thanks!
 
 -- 
 Yar
 
 --- if_loop.c.orig	2007-03-05 19:08:18.000000000 +0300
 +++ if_loop.c	2007-08-25 17:59:56.000000000 +0400
 @@ -267,23 +267,46 @@
  	 *  - IPv4/v6 multicast packet loopback (netinet(6)/ip(6)_output.c)
  	 *	-> not passes it to any BPF
  	 *  - Normal packet loopback from myself to myself (net/if_loop.c)
 -	 *	-> passes to lo0's BPF (even in case of IPv6, where ifp!=lo0)
 +	 *	-> passes it to ifp's BPF if *ifp is a loopback interface
 +	 *	-> otherwise passes it to lo0's BPF (the IPv6 case)
  	 */
  	if (hlen > 0) {
  		if (bpf_peers_present(ifp->if_bpf)) {
  			bpf_mtap(ifp->if_bpf, m);
  		}
 -	} else {
 -		if (bpf_peers_present(loif->if_bpf)) {
 -			if ((m->m_flags & M_MCAST) == 0 || loif == ifp) {
 -				/* XXX beware sizeof(af) != 4 */
 -				u_int32_t af1 = af;	
 -
 -				/*
 -				 * We need to prepend the address family.
 -				 */
 -				bpf_mtap2(loif->if_bpf, &af1, sizeof(af1), m);
 -			}
 +	} else if ((m->m_flags & M_MCAST) == 0) {
 +		struct ifnet *bifp;
 +
 +		/*
 +		 * IPv6 (nd6_output()) can call looutput() with ifp pointing
 +		 * to the originating interface with link encapsulation
 +		 * different from that of BSD loopback.  IPv6 must distinguish
 +		 * equal numerical addresses from scope zones on different
 +		 * interfaces, so it just passes a pointer to the originating
 +		 * interface here in the hope that it will reach ip6_input().
 +		 *
 +		 * Instead of trying to fake every possible link encapsulation
 +		 * here, we just pretend the packet was received on lo0 if *ifp
 +		 * isn't a loopback interface.  It gives behavior consistent
 +		 * with that of IPv4, where all local non-loopback traffic
 +		 * appears on lo0.
 +		 *
 +		 * The following condition may be somewhat redundant, but it
 +		 * accurately tells what we rely on here.
 +		 */
 +		bifp = (ifp->if_flags & IFF_LOOPBACK) &&
 +		       (ifp->if_bpf->bif_dlt == DLT_NULL) ? ifp : loif;
 +
 +		if (bpf_peers_present(bifp->if_bpf)) {
 +			/*
 +			 * DLT_NULL dictates that a pseudo link header
 +			 * be prepended with a single 32-bit field in it:
 +			 * the address family in host byte order.
 +			 * Note well that int isn't always 32 bits wide.
 +			 */
 +			u_int32_t af32 = af;
 +
 +			bpf_mtap2(bifp->if_bpf, &af32, sizeof(af32), m);
  		}
  	}
  

From: Cristian KLEIN <cristi@net.utcluj.ro>
To: Yar Tikhiy <yar@comp.chem.msu.su>
Cc: Andre Oppermann <andre@freebsd.org>,  bug-followup@freebsd.org
Subject: Re: kern/112612: [lo] Traffic via additional lo(4) interface shows
 up on lo0 in bpf(4)
Date: Thu, 30 Aug 2007 21:36:17 +0300

 Yar Tikhiy wrote:
 > After some research, I've worked out the following patch.
 > My findings and decisions are documented in the comments.
 > Given how many rounds it took to fix this small piece
 > of code, I added quite a lot of comments.  Please review.
 > Thanks!
 > 
 
 Hi,
 
 Sorry for the late reply.
 
 I don't really like the patch. IMHO, using the ifp parameter of looutput
 as nd6_output() does is incorrect.
 
 It is my opinion that, if IPv6 really needs the receiving interface for
 scoped addresses in ip6_input() it should use mbuf_tags(9). nd6_output()
 can tag the packet just before sending it to looutput. ip6_input() can
 first check for that tag, and adjust rcvif accordingly.
 
 I currently don't have a FreeBSD-CURRENT box, but if anybody is
 interested in this solution, I could start working on it.
 

From: Yar Tikhiy <yar@comp.chem.msu.su>
To: Cristian KLEIN <cristi@net.utcluj.ro>
Cc: Andre Oppermann <andre@freebsd.org>, qingli@freebsd.org,
        bug-followup@freebsd.org
Subject: Re: kern/112612: [lo] Traffic via additional lo(4) interface shows up on lo0 in bpf(4)
Date: Fri, 31 Aug 2007 01:22:54 +0400

 On Thu, Aug 30, 2007 at 09:36:17PM +0300, Cristian KLEIN wrote:
 > Yar Tikhiy wrote:
 > > After some research, I've worked out the following patch.
 > > My findings and decisions are documented in the comments.
 > > Given how many rounds it took to fix this small piece
 > > of code, I added quite a lot of comments.  Please review.
 > > Thanks!
 > > 
 > 
 > Hi,
 > 
 > > Sorry for the late reply.
 > 
 > I don't really like the patch. IMHO, using the ifp parameter of looutput
 > as nd6_output() does is incorrect.
 > 
 > It is my opinion that, if IPv6 really needs the receiving interface for
 > scoped addresses in ip6_input() it should use mbuf_tags(9). nd6_output()
 > can tag the packet just before sending it to looutput. ip6_input() can
 > first check for that tag, and adjust rcvif accordingly.
 > 
 > I currently don't have a FreeBSD-CURRENT box, but if anybody is
 > interested in this solution, I could start working on it.
 
 Honestly, I know very little of the IPv6 implementation details,
 just peeked in the IPv6 Core Protocols Implementation book.  One
 of its authors, Qing Li, is a FreeBSD developer, so perhaps we
 should ask for his opinion.  I'm adding him to Cc:.
 
 Qing, would you mind taking a look at the audit trail of this PR?
 Thank you!
 
 From my own experience I can only say that mbuf tags are quite
 expensive to apply on a massive scale.  That's why they were dropped
 from our VLAN code.  On the one hand, nd6_output() can add a tag
 only if ifp != origifp; but, on the other hand, ip6_input() will
 have to invoke m_tag_locate() on each packet received.
 
 -- 
 Yar

From: Cristian KLEIN <cristi@net.utcluj.ro>
To: Yar Tikhiy <yar@comp.chem.msu.su>
Cc: Andre Oppermann <andre@freebsd.org>,  qingli@freebsd.org, 
 bug-followup@freebsd.org
Subject: Re: kern/112612: [lo] Traffic via additional lo(4) interface shows
 up on lo0 in bpf(4)
Date: Thu, 06 Sep 2007 17:03:27 +0300

 This is a multi-part message in MIME format.
 --------------020709080601050408000909
 Content-Type: text/plain; charset=UTF-8
 Content-Transfer-Encoding: 7bit
 
 Yar Tikhiy wrote:
 > On Thu, Aug 30, 2007 at 09:36:17PM +0300, Cristian KLEIN wrote:
 >> Yar Tikhiy wrote:
 >>> After some research, I've worked out the following patch.
 >>> My findings and decisions are documented in the comments.
 >>> Given how many rounds it took to fix this small piece
 >>> of code, I added quite a lot of comments.  Please review.
 >>> Thanks!
 >>>
 >> Hi,
 >>
 >> Sorry for the late reply.
 >>
 >> I don't really like the patch. IMHO, using the ifp parameter of looutput
 >> as nd6_output() does is incorrect.
 >>
 >> It is my opinion that, if IPv6 really needs the receiving interface for
 >> scoped addresses in ip6_input() it should use mbuf_tags(9). nd6_output()
 >> can tag the packet just before sending it to looutput. ip6_input() can
 >> first check for that tag, and adjust rcvif accordingly.
 >>
 >> I currently don't have a FreeBSD-CURRENT box, but if anybody is
 >> interested in this solution, I could start working on it.
 > 
 > Honestly, I know very little of the IPv6 implementation details,
 > just peeked in the IPv6 Core Protocols Implementation book.  One
 > of its authors, Qing Li, is a FreeBSD developer, so perhaps we
 > should ask for his opinion.  I'm adding him to Cc:.
 > 
 > Qing, would you mind taking a look at the audit trail of this PR?
 > Thank you!
 > 
 > From my own experience I can only say that mbuf tags are quite
 > expensive to apply on a massive scale.  That's why they were dropped
 > from our VLAN code.  On the one hand, nd6_output() can add a tag
 > only if ifp != origifp; but, on the other hand, ip6_input() will
 > have to invoke m_tag_locate() on each packet received.
 > 
 
 Hi,
 
 Would you please take a look at the following patches?  They are my idea
 of how the loopback issue should be solved once and for all.
 
 First, I had to patch nd6.c, so the interface route created for loX
 (X>0) does not point towards lo0 but towards the actual loopback
 interface. This bug can be seen on an unpatched kernel by adding an
 address on lo1. 'netstat -rn -f inet6' will then show the route for that
 address pointing to lo0 rather than lo1.
 
 Next, I changed the behaviour of nd6_output, so it will send packets to
 the correct destination interface (according to routing table) no matter
 what. The scope interface will be stored in rcvif, so ip6_input() can
 retrieve it.
 
 Last, I modified if_loop.c, so that it does not change rcvif (unless
 rcvif is not set). The rest of if_loop.c.diff is just a cleanup.
 
 test-lo.sh is the script I used to test the actual outcome of these
 patches (results are in comments). I am fairly happy with the results.
 Only the last two tests did not come out as some people might expect,
 but I don't think they are show-stoppers.
 
 If you don't like the rcvif solution, I can change it to embed the scope
 inside the IPv6 addresses, or use m_tags.
 
 Please tell me what you think of these patches.
 
 Regards,
 Cristi.
 
 --------------020709080601050408000909
 Content-Type: text/x-patch;
  name="nd6.c.diff"
 Content-Transfer-Encoding: 7bit
 Content-Disposition: inline;
  filename="nd6.c.diff"
 
 --- nd6.c.orig	Sun Jun  3 20:54:18 2007
 +++ nd6.c	Thu Sep  6 11:03:08 2007
 @@ -1331,7 +1331,14 @@
  				SDL(gate)->sdl_alen = ifp->if_addrlen;
  			}
  			if (nd6_useloopback) {
 -				rt->rt_ifp = &loif[0];	/* XXX */
 +				/*
 +				 * The user wants us to use the loopback
 +				 * interface for local traffic. However,
 +				 * if the interface is already a loopback
 +				 * interface (maybe lo3) then use that instead.
 +				 */
 +				if ((rt->rt_ifp->if_flags & IFF_LOOPBACK) == 0)
 +					rt->rt_ifp = &loif[0];
  				/*
  				 * Make sure rt_ifa be equal to the ifaddr
  				 * corresponding to the address.
 @@ -2114,9 +2121,12 @@
  #ifdef MAC
  	mac_create_mbuf_linklayer(ifp, m);
  #endif
 +	/*
 +	 * Set rcvif to origifp, so that ip6_input() can
 +	 * see the correct scope.
 +	 */
  	if ((ifp->if_flags & IFF_LOOPBACK) != 0) {
 -		return ((*ifp->if_output)(origifp, m, (struct sockaddr *)dst,
 -		    rt));
 +		m->m_pkthdr.rcvif = origifp;
  	}
  	return ((*ifp->if_output)(ifp, m, (struct sockaddr *)dst, rt));
  
 
 --------------020709080601050408000909
 Content-Type: text/x-patch;
  name="if_loop.c.diff"
 Content-Transfer-Encoding: 7bit
 Content-Disposition: inline;
  filename="if_loop.c.diff"
 
 --- if_loop.c.orig	Fri Feb  9 00:09:35 2007
 +++ if_loop.c	Thu Sep  6 11:23:10 2007
 @@ -244,6 +244,8 @@
   * would normally receive via a hardware loopback.
   *
   * This function expects the packet to include the media header of length hlen.
 + * This function should be called with the interface that does the loopback
 + * as argument.
   */
  
  int
 @@ -257,33 +259,26 @@
  
  	M_ASSERTPKTHDR(m);
  	m_tag_delete_nonpersistent(m);
 -	m->m_pkthdr.rcvif = ifp;
 +	/*
 +	 * IPv6 requires the input interface to remain the same
 +	 * in order to preserve the scope.
 +	 */
 +	if (m->m_pkthdr.rcvif == NULL)
 +		m->m_pkthdr.rcvif = ifp;
  
  	/*
 -	 * Let BPF see incoming packet in the following manner:
 -	 *  - Emulated packet loopback for a simplex interface 
 -	 *    (net/if_ethersubr.c)
 -	 *	-> passes it to ifp's BPF
 -	 *  - IPv4/v6 multicast packet loopback (netinet(6)/ip(6)_output.c)
 -	 *	-> not passes it to any BPF
 -	 *  - Normal packet loopback from myself to myself (net/if_loop.c)
 -	 *	-> passes to lo0's BPF (even in case of IPv6, where ifp!=lo0)
 +	 * Let BPF see packets in the following manner:
 +	 * - IFF_SIMPLEX loopback: send it to that interface's BPF
 +	 * - loX loopback: add AF and send it to that interface's BPF
  	 */
 -	if (hlen > 0) {
 -		if (bpf_peers_present(ifp->if_bpf)) {
 -			bpf_mtap(ifp->if_bpf, m);
 +	if (bpf_peers_present(ifp->if_bpf)) {
 +		if (ifp->if_bpf->bif_dlt == DLT_NULL) {
 +			/* XXX beware sizeof(af) != 4 */
 +			u_int32_t af1 = af;
 +			bpf_mtap2(ifp->if_bpf, &af1, sizeof(af1), m);
  		}
 -	} else {
 -		if (bpf_peers_present(loif->if_bpf)) {
 -			if ((m->m_flags & M_MCAST) == 0 || loif == ifp) {
 -				/* XXX beware sizeof(af) != 4 */
 -				u_int32_t af1 = af;	
 -
 -				/*
 -				 * We need to prepend the address family.
 -				 */
 -				bpf_mtap2(loif->if_bpf, &af1, sizeof(af1), m);
 -			}
 +		else {
 +			bpf_mtap(ifp->if_bpf, m);
  		}
  	}
  
 
 --------------020709080601050408000909
 Content-Type: application/x-shellscript;
  name="test-lo.sh"
 Content-Transfer-Encoding: 7bit
 Content-Disposition: inline;
  filename="test-lo.sh"
 
 #!/bin/sh
 
 ifconfig lo0 inet6 2000::1
 
 ifconfig lo1 create
 ifconfig lo1 127.0.0.2/32
 ifconfig lo1 inet6 fe80::1
 ifconfig lo1 inet6 2000::2
 
 ifconfig em0 inet 127.0.0.3/32
 ifconfig em0 inet6 fe80::3
 ifconfig em0 inet6 2000::3
 
 ping -c 2 127.0.0.1		# visible on lo0
 ping -c 2 127.0.0.2		# visible on lo1
 ping -c 2 127.0.0.3		# visible on lo0
 ping -c 2 www.google.com	# visible on em0
 
 ping6 -c 2 fe80::1%lo0		# visible on lo0
 ping6 -c 2 fe80::1%lo1		# visible on lo1
 ping6 -c 2 fe80::1%em0		# neighsol visible on em0, no response
 ping6 -c 2 fe80::3%em0		# visible on lo0
 
 ping6 -c 2 2000::1		# visible on lo0
 ping6 -c 2 2000::2		# visible on lo1
 ping6 -c 2 2000::3		# visible on lo0
 ping6 -c 2 www.freebsd.org	# visible on em0
 
 ping6 -c 2 ff02::1%lo0		# visible on lo0
 ping6 -c 2 ff02::1%lo1		# visible on lo1
 ping6 -c 2 ff02::1%em0		# request visible on em0, own reply visible on lo0
 
 # This program will send an all-ones UDP broadcast
 ./forcebcast			# visible on em0 twice
 				# once from ether_output
 				# once from if_simloop
 --------------020709080601050408000909--
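
The rcvif handling proposed in nd6.c.diff and if_loop.c.diff above can
be modelled in a few lines of user-space C.  This is only a sketch of
the intended interplay, with hypothetical types and function names; it
does not reproduce the kernel structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct ifnet and an mbuf packet header. */
struct ifnet_model {
	const char *name;
};

struct pkt_model {
	const struct ifnet_model *rcvif;
};

/* Models the nd6_output() change: remember the scope interface. */
static void
nd6_mark_scope(struct pkt_model *m, const struct ifnet_model *origifp)
{
	assert(m != NULL);
	m->rcvif = origifp;
}

/* Models the if_simloop() change: set rcvif only if still unset. */
static void
simloop_set_rcvif(struct pkt_model *m, const struct ifnet_model *ifp)
{
	assert(m != NULL);
	if (m->rcvif == NULL)
		m->rcvif = ifp;
}
```

A packet marked with em0 by nd6_mark_scope() keeps em0 as rcvif after
passing through simloop_set_rcvif() with lo0, while an unmarked packet
gets lo0.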
>Unformatted:
