From nobody@FreeBSD.org  Thu Sep 10 05:52:46 2009
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2A3C2106566B
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 10 Sep 2009 05:52:46 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id 188758FC18
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 10 Sep 2009 05:52:46 +0000 (UTC)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id n8A5qjaZ057882
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 10 Sep 2009 05:52:45 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id n8A5qjjW057881;
	Thu, 10 Sep 2009 05:52:45 GMT
	(envelope-from nobody)
Message-Id: <200909100552.n8A5qjjW057881@www.freebsd.org>
Date: Thu, 10 Sep 2009 05:52:45 GMT
From: Stef Walter <stef@memberwebs.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: [patch] Multicast: IP_DROP_MEMBERSHIP should return EADDRNOTAVAIL for invalid address
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         138689
>Category:       kern
>Synopsis:       [netinet] [patch] Multicast: IP_DROP_MEMBERSHIP should return EADDRNOTAVAIL for invalid address
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    freebsd-net
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Sep 10 06:00:09 UTC 2009
>Closed-Date:    Tue Sep 29 06:14:34 UTC 2009
>Last-Modified:  Tue Sep 29 06:14:34 UTC 2009
>Originator:     Stef Walter
>Release:        8.0-BETA3
>Organization:
>Environment:
FreeBSD portillo-gate.ws.local 8.0-BETA4 FreeBSD 8.0-BETA4 #10: Wed Sep  9 22:49:39 UTC 2009     root@portillo-gate.ws.local:/usr/obj/usr/src/sys/MESHNODE  i386

>Description:
After an interface goes down and its addresses go away, if a caller
calls setsockopt() with IP_DROP_MEMBERSHIP and a plain struct ip_mreq
containing the address that no longer exists, the kernel should return
EADDRNOTAVAIL.


>How-To-Repeat:

However, the current behavior in 8.0-BETA3 is to remove a membership of
the same multicast group from the 'first' interface instead. The
ifmcstat output below shows the result.

Before the northstar1 (tunnel) interface goes away, both bge0 and
northstar1 are members of the 224.0.0.5 (i.e. OSPF-ALL.MCAST.NET) group.

> > bge0:
> > 	inet 172.27.5.18
> > 	igmpv3 flags=0<> rv 2 qi 125 qri 10 uri 3
> > 		group 224.0.0.5 mode exclude
> > 			mcast-macaddr 01:00:5e:00:00:05
> > 		group 224.0.0.1 mode exclude
> > 			mcast-macaddr 01:00:5e:00:00:01
> > rl0:
> > 	inet 192.168.1.70
> > 	igmpv3 flags=0<> rv 2 qi 125 qri 10 uri 3
> > 		group 224.0.0.1 mode exclude
> > 			mcast-macaddr 01:00:5e:00:00:01
> > lo0:
> > 	inet 127.0.0.1
> > 	igmpv3 flags=0<> rv 2 qi 125 qri 10 uri 3
> > 		group 224.0.0.1 mode exclude
> > 	inet6 fe80::1%lo0
> > 	mldv2 flags=0<> rv 2 qi 125 qri 10 uri 3
> > 		group ff01::1%lo0 mode exclude
> > 		group ff02::2:e78c:f513%lo0 mode exclude
> > 		group ff02::1%lo0 mode exclude
> > 		group ff02::1:ff00:1%lo0 mode exclude
> > northstar1:
> > 	inet 172.28.1.66
> > 	igmpv3 flags=0<> rv 2 qi 125 qri 10 uri 3
> > 		group 224.0.0.5 mode exclude
> > 		group 224.0.0.1 mode exclude

After northstar1 goes down, and setsockopt(..., IP_DROP_MEMBERSHIP, ...)
is called for the 172.28.1.66 address, we see that the group has been
dropped from bge0 instead. No error was returned from setsockopt.

> > bge0:
> > 	inet 172.27.5.18
> > 	igmpv3 flags=0<> rv 2 qi 125 qri 10 uri 3
> > 		group 224.0.0.1 mode exclude
> > 			mcast-macaddr 01:00:5e:00:00:01
> > rl0:
> > 	inet 192.168.1.70
> > 	igmpv3 flags=0<> rv 2 qi 125 qri 10 uri 3
> > 		group 224.0.0.1 mode exclude
> > 			mcast-macaddr 01:00:5e:00:00:01
> > lo0:
> > 	inet 127.0.0.1
> > 	igmpv3 flags=0<> rv 2 qi 125 qri 10 uri 3
> > 		group 224.0.0.1 mode exclude
> > 	inet6 fe80::1%lo0
> > 	mldv2 flags=0<> rv 2 qi 125 qri 10 uri 3
> > 		group ff01::1%lo0 mode exclude
> > 		group ff02::2:e78c:f513%lo0 mode exclude
> > 		group ff02::1%lo0 mode exclude
> > 		group ff02::1:ff00:1%lo0 mode exclude
> > northstar1:
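
The output above is what one would expect if the leave path looks up the
membership with an unresolved (NULL) interface pointer and the lookup
treats NULL as a wildcard. The following is a simplified userland model
of that idea, not the kernel code: every toy_* name is hypothetical, and
the `checked` flag merely switches between the behavior reported here
and an up-front NULL check of the kind proposed in this PR.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's ifnet and membership entries. */
struct toy_ifnet {
	const char *name;
};

struct toy_membership {
	const struct toy_ifnet *ifp;	/* interface the group was joined on */
	uint32_t group;			/* group address, host byte order */
};

/*
 * Return the index of the first entry for `group` on `ifp`, or -1.
 * A NULL ifp acts as a wildcard and matches any interface.
 */
int
toy_match_group(const struct toy_membership *m, size_t n,
    const struct toy_ifnet *ifp, uint32_t group)
{
	for (size_t i = 0; i < n; i++) {
		if ((ifp == NULL || m[i].ifp == ifp) && m[i].group == group)
			return ((int)i);
	}
	return (-1);
}

/*
 * Leave-path lookup. With `checked` set, an unresolved interface is
 * rejected before the lookup; otherwise the wildcard match is trusted.
 * On success *idx is the entry that would be dropped.
 */
int
toy_leave_lookup(const struct toy_membership *m, size_t n,
    const struct toy_ifnet *ifp, uint32_t group, int checked, int *idx)
{
	if (checked && ifp == NULL)
		return (EADDRNOTAVAIL);
	*idx = toy_match_group(m, n, ifp, group);
	return (*idx == -1 ? EADDRNOTAVAIL : 0);
}
```

With memberships of 224.0.0.5 recorded for bge0 and then northstar1, an
unchecked lookup with a NULL ifp selects the bge0 entry, which matches
the symptom described above. The checked variant returns EADDRNOTAVAIL
and leaves both entries alone.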

>Fix:

The attached patch fixes the problem. Is this the right approach?

BTW, the behavior of FreeBSD has always been that after northstar1 comes
back up with the same address, it is again a member of the 224.0.0.5
group. Memberships are retained across interfaces and addresses going
away. Not sure if this is the best behavior, but it has been the
historical behavior. One can see people coding against this in routing
software [1].

Besides fixing the problem of dropping membership on the first
interface, the effect of this patch is to restore the previous FreeBSD
behavior.

[1]
http://code.quagga.net/cgi-bin/gitweb.cgi?p=quagga.git;a=blob;f=lib/sockopt.c;h=55c6226b711e6386ef0378eb6def992af281082e;hb=HEAD#l196


Patch attached with submission follows:

--- sys/netinet/in_mcast.c.orig	2009-08-03 08:13:06.000000000 +0000
+++ sys/netinet/in_mcast.c	2009-09-09 01:35:06.000000000 +0000
@@ -2139,6 +2143,9 @@
 		}
 
-		if (!in_nullhost(gsa->sin.sin_addr))
+		if (!in_nullhost(gsa->sin.sin_addr)) {
 			INADDR_TO_IFP(mreqs.imr_interface, ifp);
+			if (ifp == NULL)
+				return (EADDRNOTAVAIL);
+		}
 
 		CTR3(KTR_IGMPV3, "%s: imr_interface = %s, ifp = %p",


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-net 
Responsible-Changed-By: linimon 
Responsible-Changed-When: Sat Sep 12 03:36:23 UTC 2009 
Responsible-Changed-Why:  
Over to maintainer(s). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=138689 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/138689: commit references a PR
Date: Sat, 12 Sep 2009 18:55:27 +0000 (UTC)

 Author: bms
 Date: Sat Sep 12 18:55:15 2009
 New Revision: 197129
 URL: http://svn.freebsd.org/changeset/base/197129
 
 Log:
   Fix an API issue in leave processing for IPv4 multicast groups.
    * Do not assume that the group lookup performed by imo_match_group()
      is valid when ifp is NULL in this case.
    * Instead, return EADDRNOTAVAIL if the ifp cannot be resolved for the
      membership we are being asked to leave.
   
   Caveat user:
    * The way IPv4 multicast memberships are implemented in the inpcb layer
      at the moment, has the side-effect that struct ip_moptions will
      still hold the membership, under the old ifp, until ip_freemoptions()
      is called for the parent inpcb.
    * The underlying issue is: the inpcb layer does not get notification
      of ifp being detached going away in a thread-safe manner.
      This is non-trivial to fix.
   
   But hey, at least the kernel shouldn't panic when you unplug a card.
   
   PR:		138689
   Submitted by:	Stef Walter
   MFC after:	5 days
 
 Modified:
   head/sys/netinet/in_mcast.c
 
 Modified: head/sys/netinet/in_mcast.c
 ==============================================================================
 --- head/sys/netinet/in_mcast.c	Sat Sep 12 18:24:31 2009	(r197128)
 +++ head/sys/netinet/in_mcast.c	Sat Sep 12 18:55:15 2009	(r197129)
 @@ -2189,6 +2189,9 @@ inp_leave_group(struct inpcb *inp, struc
  	if (!IN_MULTICAST(ntohl(gsa->sin.sin_addr.s_addr)))
  		return (EINVAL);
  
 +	if (ifp == NULL)
 +		return (EADDRNOTAVAIL);
 +
  	/*
  	 * Find the membership in the membership array.
  	 */
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
 
State-Changed-From-To: open->patched 
State-Changed-By: bms 
State-Changed-When: Sat 12 Sep 2009 19:00:58 UTC 
State-Changed-Why:  
Committed to HEAD as rev 197129. 
Thanks for your work in tracking down and fixing this issue. 

This fix deals with the POLA violation in the userland API, 
but doesn't fix the underlying issue, which requires a bit more thought. 

The inpcb layer will not learn about interfaces going down, even 
though the netinet stack cleans up its references to link-layer 
structures. 

To be frank: this situation comes about largely because getting the BSD
network stack right for hot-swappable interfaces requires swinging a big
hammer around in a number of APIs and structures.
In short, it's the sort of thing which gets chewed over at developer 
summits. ;-) 

IPv4 multicast structures are keyed by ifp. Should ifp be invalidated 
due to an interface being detached at runtime, and a userland consumer 
later tries to delete a membership, the lookup will fail. 

The membership will however still exist. There isn't really an easy 
way to deal with this, without implementing a full walk of all socket(s) 
with memberships on the ifp, and invalidating the inpcb's reference 
to the group, when in_ifdetach() is actually called on interface detach. 

The membership is eventually cleaned up by the call to inp_freemoptions() 
during the PCB cleanup when the socket is closed. It is a non-trivial 
issue to resolve, because it would involve taking socket-layer locks 
from a lower level path, leading to a lock order violation. 

Coding around issues in the stack is not really the right approach-- 
the better approach is to fix problems at source. Unfortunately, 
the project(s) involved are all separate, and communication hasn't 
really been that great between them in the past. That, and it takes 
a while for fixes to percolate into kernels because of release schedules. 

The code in Quagga looks really ugly, but I suppose this is what 
ends up happening in situations like this, where the development of 
stack components is not that cohesive. 


http://www.freebsd.org/cgi/query-pr.cgi?pr=138689 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/138689: commit references a PR
Date: Thu, 17 Sep 2009 13:42:09 +0000 (UTC)

 Author: bms
 Date: Thu Sep 17 13:41:59 2009
 New Revision: 197280
 URL: http://svn.freebsd.org/changeset/base/197280
 
 Log:
   MFC revs 197129,197130,197132:
    Fixes to mcast userland API.
   --
     Fix an API issue in leave processing for IPv4 multicast groups.
      * Do not assume that the group lookup performed by imo_match_group()
        is valid when ifp is NULL in this case.
      * Instead, return EADDRNOTAVAIL if the ifp cannot be resolved for the
        membership we are being asked to leave.
   
     Caveat user:
      * The way IPv4 multicast memberships are implemented in the inpcb layer
        at the moment, has the side-effect that struct ip_moptions will
        still hold the membership, under the old ifp, until ip_freemoptions()
        is called for the parent inpcb.
      * The underlying issue is: the inpcb layer does not get notification
        of ifp being detached going away in a thread-safe manner.
        This is non-trivial to fix.
   --
     Fix an obvious logic error in the IPv4 multicast leave processing,
     where the filter mode vector was not updated correctly after the leave.
   --
     Tighten input checking in inp_join_group():
      * Don't try to use the source address, when its family is unspecified.
      * If we get a join without a source, on an existing inclusive
        mode group, this is an error, as it would change the filter mode.
   
     Fix a problem with the handling of in_mfilter for new memberships:
      * Do not rely on imf being NULL; it is explicitly initialized to a
        non-NULL pointer when constructing a membership.
      * Explicitly initialize *imf to EX mode when the source address
        is unspecified.
     This fixes a problem with in_mfilter slot recycling in the join path.
   --
     Don't allow joins w/o source on an existing group.
     This is almost always pilot error.
   
     We don't need to check for group filter UNDEFINED state at t1,
     because we only ever allocate filters with their groups, so we
     unconditionally reject such calls with EINVAL.
     Trying to change the active filter mode w/o going through IP_MSFILTER
     is also disallowed.
   
     Deals with the case described in PR 137164 upfront, cumulative
     with the fix in svn rev 197132 which only calls imo_match_source()
     if the source address family was not unspecified.
   --
   
   Revision 197136 has a text conflict, however it is a comment only change.
   
   PR:		137164, 138689, 138690, 138691
   Submitted by:	Stef Walter (with fixups)
   Approved by:	re (kib)
 
 Modified:
   stable/8/sys/   (props changed)
   stable/8/sys/amd64/include/xen/   (props changed)
   stable/8/sys/cddl/contrib/opensolaris/   (props changed)
   stable/8/sys/contrib/dev/acpica/   (props changed)
   stable/8/sys/contrib/pf/   (props changed)
   stable/8/sys/dev/ciss/   (props changed)
   stable/8/sys/dev/xen/xenpci/   (props changed)
   stable/8/sys/netinet/in_mcast.c
 
 Modified: stable/8/sys/netinet/in_mcast.c
 ==============================================================================
 --- stable/8/sys/netinet/in_mcast.c	Thu Sep 17 13:33:40 2009	(r197279)
 +++ stable/8/sys/netinet/in_mcast.c	Thu Sep 17 13:41:59 2009	(r197280)
 @@ -1957,11 +1957,6 @@ inp_join_group(struct inpcb *inp, struct
  	if (ifp == NULL || (ifp->if_flags & IFF_MULTICAST) == 0)
  		return (EADDRNOTAVAIL);
  
 -	/*
 -	 * MCAST_JOIN_SOURCE on an exclusive membership is an error.
 -	 * On an existing inclusive membership, it just adds the
 -	 * source to the filter list.
 -	 */
  	imo = inp_findmoptions(inp);
  	idx = imo_match_group(imo, ifp, &gsa->sa);
  	if (idx == -1) {
 @@ -1969,15 +1964,33 @@ inp_join_group(struct inpcb *inp, struct
  	} else {
  		inm = imo->imo_membership[idx];
  		imf = &imo->imo_mfilters[idx];
 -		if (ssa->ss.ss_family != AF_UNSPEC &&
 -		    imf->imf_st[1] != MCAST_INCLUDE) {
 -			error = EINVAL;
 -			goto out_inp_locked;
 -		}
 -		lims = imo_match_source(imo, idx, &ssa->sa);
 -		if (lims != NULL) {
 -			error = EADDRNOTAVAIL;
 -			goto out_inp_locked;
 +		if (ssa->ss.ss_family != AF_UNSPEC) {
 +			/*
 +			 * MCAST_JOIN_SOURCE on an exclusive membership
 +			 * is an error. On an existing inclusive membership,
 +			 * it just adds the source to the filter list.
 +			 */
 +			if (imf->imf_st[1] != MCAST_INCLUDE) {
 +				error = EINVAL;
 +				goto out_inp_locked;
 +			}
 +			/* Throw out duplicates. */
 +			lims = imo_match_source(imo, idx, &ssa->sa);
 +			if (lims != NULL) {
 +				error = EADDRNOTAVAIL;
 +				goto out_inp_locked;
 +			}
 +		} else {
 +			/*
 +			 * MCAST_JOIN_GROUP on an existing inclusive
 +			 * membership is an error; if you want to change
 +			 * filter mode, you must use the userland API
 +			 * setsourcefilter().
 +			 */
 +			if (imf->imf_st[1] == MCAST_INCLUDE) {
 +				error = EINVAL;
 +				goto out_inp_locked;
 +			}
  		}
  	}
  
 @@ -2010,7 +2023,8 @@ inp_join_group(struct inpcb *inp, struct
  	/*
  	 * Graft new source into filter list for this inpcb's
  	 * membership of the group. The in_multi may not have
 -	 * been allocated yet if this is a new membership.
 +	 * been allocated yet if this is a new membership, however,
 +	 * the in_mfilter slot will be allocated and must be initialized.
  	 */
  	if (ssa->ss.ss_family != AF_UNSPEC) {
  		/* Membership starts in IN mode */
 @@ -2027,6 +2041,12 @@ inp_join_group(struct inpcb *inp, struct
  			error = ENOMEM;
  			goto out_imo_free;
  		}
 +	} else {
 +		/* No address specified; Membership starts in EX mode */
 +		if (is_new) {
 +			CTR1(KTR_IGMPV3, "%s: new join w/o source", __func__);
 +			imf_init(imf, MCAST_UNDEFINED, MCAST_EXCLUDE);
 +		}
  	}
  
  	/*
 @@ -2189,6 +2209,9 @@ inp_leave_group(struct inpcb *inp, struc
  	if (!IN_MULTICAST(ntohl(gsa->sin.sin_addr.s_addr)))
  		return (EINVAL);
  
 +	if (ifp == NULL)
 +		return (EADDRNOTAVAIL);
 +
  	/*
  	 * Find the membership in the membership array.
  	 */
 @@ -2275,9 +2298,11 @@ out_imf_rollback:
  	imf_reap(imf);
  
  	if (is_final) {
 -		/* Remove the gap in the membership array. */
 -		for (++idx; idx < imo->imo_num_memberships; ++idx)
 +		/* Remove the gap in the membership and filter array. */
 +		for (++idx; idx < imo->imo_num_memberships; ++idx) {
  			imo->imo_membership[idx-1] = imo->imo_membership[idx];
 +			imo->imo_mfilters[idx-1] = imo->imo_mfilters[idx];
 +		}
  		imo->imo_num_memberships--;
  	}
  
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
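
The last hunk above shifts imo_mfilters[] together with imo_membership[]
when a slot is removed, so that each remaining membership keeps its own
filter. A small standalone sketch of that parallel-array compaction
follows. struct toy_moptions and toy_remove_slot() are hypothetical
stand-ins with plain ints in place of the kernel's pointers and filter
structures, and the caller is assumed to pass idx < num_memberships.

```c
#include <stddef.h>

#define TOY_MAX 8

/* Hypothetical, simplified stand-in for struct ip_moptions. */
struct toy_moptions {
	int membership[TOY_MAX];	/* stand-in for imo_membership[] */
	int mfilters[TOY_MAX];		/* stand-in for imo_mfilters[] */
	size_t num_memberships;
};

/*
 * Remove slot idx by shifting every later slot down by one,
 * moving both arrays in step so the pairs stay aligned.
 */
void
toy_remove_slot(struct toy_moptions *imo, size_t idx)
{
	for (++idx; idx < imo->num_memberships; ++idx) {
		imo->membership[idx - 1] = imo->membership[idx];
		imo->mfilters[idx - 1] = imo->mfilters[idx];
	}
	imo->num_memberships--;
}
```

If only membership[] were shifted, the entry that moves into the freed
slot would inherit the filter state of the entry that was just removed.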
 
State-Changed-From-To: patched->closed 
State-Changed-By: bms 
State-Changed-When: Tue 29 Sep 2009 06:14:24 UTC 
State-Changed-Why:  
MFCed and in 8.0-RC1 


http://www.freebsd.org/cgi/query-pr.cgi?pr=138689 
>Unformatted:
