From nobody@FreeBSD.org  Thu Mar 13 16:44:46 2008
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id ADD5D1065671
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 13 Mar 2008 16:44:46 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id 976B58FC1D
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 13 Mar 2008 16:44:46 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.2/8.14.2) with ESMTP id m2DGfRFo027337
	for <freebsd-gnats-submit@FreeBSD.org>; Thu, 13 Mar 2008 16:41:27 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.2/8.14.1/Submit) id m2DGfRqr027336;
	Thu, 13 Mar 2008 16:41:27 GMT
	(envelope-from nobody)
Message-Id: <200803131641.m2DGfRqr027336@www.freebsd.org>
Date: Thu, 13 Mar 2008 16:41:27 GMT
From: Laurent Frigault <lfrigault@agneau.org>
To: freebsd-gnats-submit@FreeBSD.org
Subject: connect randomly fails with EPERM with some pf rules
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         121668
>Category:       kern
>Synopsis:       [pf] connect randomly fails with EPERM with some pf rules
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-pf
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Mar 13 16:50:01 UTC 2008
>Closed-Date:    Fri Mar 14 11:34:52 UTC 2008
>Last-Modified:  Fri Mar 14 11:34:52 UTC 2008
>Originator:     Laurent Frigault
>Release:        6.2-RELEASE-p10 , 7.0-RELEASE
>Organization:
>Environment:
FreeBSD troll.free.org 6.2-RELEASE-p10 FreeBSD 6.2-RELEASE-p10 #0: Wed Jan 16 14:22:17 CET 2008     lolo@troll.free.org:/usr/src/sys/i386/compile/SMP  i386

FreeBSD surt.free.org 7.0-RELEASE FreeBSD 7.0-RELEASE #1: Wed Feb 27 18:29:25 CET 2008     root@surt.free.org:/usr/src/sys/amd64/compile/GENERIC  amd64

>Description:
From time to time, connect(2) fails with EPERM when pf is used statefully.

I discovered this problem while investigating unexpected MySQL connection failures between a PHP web script and a MySQL server running on another server. This led me to connect(2) failing for no apparent reason with EPERM (the connect manual page lists no EPERM failure cause).

ruleset 1 ("no state" was the default before 7.0):
==============================================
scrub in all fragment reassemble
 
pass out quick on lo0 all no state
pass in quick on lo0 all no state
..
==============================================
ruleset 2:
==============================================
scrub in all fragment reassemble

pass out quick on lo0 proto tcp from any to any port 9 flags S/SA keep state
pass out quick on lo0 all no state
pass in quick on lo0 all no state
==============================================

With ruleset 1 => no problem.
With ruleset 2 => connect sometimes fails with EPERM.

Nothing is logged as rejected in the pf logs, which makes sense because the pf rules allow these connections.

>How-To-Repeat:
sysctl net.inet.tcp.nolocaltimewait=1
This is not needed, but it helps to reproduce the problem with the client and server on the same computer.

Start inetd with the discard/tcp service enabled:

inetd_enable="YES"
inetd_flags="-wl -R 0"

% grep ^discard /etc/inetd.conf 
discard stream  tcp     nowait  root    internal

pf rules:
scrub in all fragment reassemble

pass out quick on lo0 proto tcp from any to any port 9 flags S/SA keep state
pass out quick on lo0 all no state
pass in quick on lo0 all no state

Launch the following Perl script.

Sometimes, connect will wrongly fail with EPERM.
==============================================================
#!/usr/bin/perl -w

use strict;

use Socket;
use Errno;

$|=1;

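# Open a TCP connection to $sin, send one line, and report the result.
# Prints "ok\t" on success, or the connect(2) error text on failure.
sub con($$$)
{
        my ($sin,$port,$proto) = @_;

        socket(Socket_Handle, PF_INET, SOCK_STREAM, $proto)
                or die "socket: $!\n";
        if(connect(Socket_Handle,$sin))
        {
                print "ok\t";
                print Socket_Handle "hello\n";
        }
        else
        {
                print "$!\n";
        }
        close (Socket_Handle);
}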
  
my $proto =  getprotobyname('tcp');
my $port = getservbyname('discard', 'tcp');
my $sin = sockaddr_in($port,inet_aton("127.1"));
   
for (my $cpt=0;$cpt<=2000;++$cpt)
{
        print "$cpt\t";
        con($sin,$port,$proto);
};
==============================================================
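
To tally the failures from one run, the script output can be filtered as in the sketch below. It assumes the errno text for EPERM is "Operation not permitted" and that the script has been saved as repro.pl (a hypothetical file name). Each failed connect ends its output line with a newline, so one matching line corresponds to one failure. The total can then be compared with the state-mismatch counter reported by pfctl -si.

```shell
#!/bin/sh
# Count failed connects in the test script's output read from stdin.
# Assumption: EPERM is rendered as "Operation not permitted".
count_eperm() {
    grep -c 'Operation not permitted'
}

# Hypothetical usage (repro.pl is an assumed name for the script above):
#   perl repro.pl | count_eperm
#   pfctl -si | grep state-mismatch
```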

>Fix:


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->freebsd-pf 
Responsible-Changed-By: remko 
Responsible-Changed-When: Thu Mar 13 16:53:53 UTC 2008 
Responsible-Changed-Why:  
reassign to pf team. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=121668 

From: Kian Mohageri <kian@restek.wwu.edu>
To: bug-followup@FreeBSD.org, lfrigault@agneau.org
Cc:  
Subject: Re: kern/121668: connect randomly fails with EPERM with some pf rules
Date: Thu, 13 Mar 2008 11:29:52 -0700

 Does state-mismatch counter increase when this happens (pfctl -si)?
 
 I remember similar behavior and it was caused by source port reuse on
 the client (so the new connection caused a state mismatch on an old
 state).

From: Laurent Frigault <lfrigault@agneau.org>
To: Kian Mohageri <kian@restek.wwu.edu>
Cc: bug-followup@FreeBSD.org
Subject: Re: kern/121668: connect randomly fails with EPERM with some pf rules
Date: Thu, 13 Mar 2008 20:16:58 +0100

 On Thu, Mar 13, 2008 at 11:29:52AM -0700, Kian Mohageri wrote:
 > Does state-mismatch counter increase when this happens (pfctl -si)?
 
 I re-ran the test and yes, the state-mismatch counter increases by
 exactly the number of connect calls failing with EPERM.
 
 > I remember similar behavior and it was caused by source port reuse on
 > the client (so the new connection caused a state mismatch on an old
 > state).
 
 The previous connections are closed.
 If the source port can't be reused yet, then the kernel should use
 another one for the new connection. If it can, then pf should allow it.
 
 If the connect (SYN) does not match an existing state, the pf rule
 should create a new state.
 
 Am I wrong?
 
 I don't fix the source port in my sample, and the mysql client doesn't
 either.
 
 How can I work around this?
 
 Regards,
 -- 
 Laurent Frigault | <url:http://www.agneau.org/>

From: Max Laier <max@love2party.net>
To: bug-followup@freebsd.org,
 lfrigault@agneau.org
Cc:  
Subject: Re: kern/121668: connect randomly fails with EPERM with some pf rules
Date: Thu, 13 Mar 2008 20:26:39 +0100

 > sysctl net.inet.tcp.nolocaltimewait=1
 > not needed, but helps to reproduce the problem with client and server
 > on the same computer.
 
 Okay, now this is just asking for trouble.  pf does thorough checks on TCP
 states, one of which is to enforce the 2MSL quiet time before port reuse.
 If you set the above sysctl you specifically ask FreeBSD to break that rule
 and thus cause pf to bark.
 
 You can also hit the issue if you have a large number of (consecutive)
 connections between two hosts (e.g. [poorly configured] squid ->
 www-backends, mysql, ...).  The solution is to:
 
  1) Reduce the connection spree and use one permanent connection
  2) Increase the ephemeral port range net.inet.ip.portrange.hi{first,last}
  3) Decrease the pf state timeouts tcp.{closing,closed} in order to relax
     the check.  You can do this globally and on a per-rule basis.
 
 -- Max
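
As a rough illustration of why options 2 and 3 help, the sketch below estimates how many new connections per second one client address can open to a single destination address and port before a source port has to be reused while its old pf state still lingers. It is back-of-the-envelope arithmetic only, not pf's actual logic. The port range 49152-65535 and the 90-second linger time are assumed values, and random port selection can collide well before this bound is reached.

```shell
#!/bin/sh
# Rough estimate only: if each source port in [first, last] can be used
# once per "linger" seconds (the time pf keeps the finished state), the
# sustained rate to one destination is bounded by ports / linger.
max_rate() {
    first=$1
    last=$2
    linger=$3
    ports=$((last - first + 1))
    echo $((ports / linger))
}

# Assumed values, for illustration only.
max_rate 49152 65535 90    # 16384 ports, 90 s linger
max_rate 49152 65535 1     # same range, linger lowered to 1 s
max_rate 10000 65535 90    # wider port range, 90 s linger
```

With these assumed inputs the three calls print 182, 16384 and 617 connections per second, which shows that shortening the timeout has a much larger effect than widening the port range.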

From: Kian Mohageri <kian@restek.wwu.edu>
To: Laurent Frigault <lfrigault@agneau.org>
Cc: bug-followup@FreeBSD.org
Subject: Re: kern/121668: connect randomly fails with EPERM with some pf rules
Date: Thu, 13 Mar 2008 12:44:48 -0700

 Laurent Frigault wrote:
 > On Thu, Mar 13, 2008 at 11:29:52AM -0700, Kian Mohageri wrote:
 >> Does state-mismatch counter increase when this happens (pfctl -si)?
 >
 > I re-ran the test and yes, the state-mismatch counter increases by
 > exactly the number of connect calls failing with EPERM.
 >
 >> I remember similar behavior and it was caused by source port reuse on
 >> the client (so the new connection caused a state mismatch on an old
 >> state).
 >
 > The previous connections are closed.
 > If the source port can't be reused yet, then the kernel should use
 > another one for the new connection. If it can, then pf should allow it.
 >
 > If the connect (SYN) does not match an existing state, the pf rule
 > should create a new state.
 
 It does "match" a state (source/dest is same), which is the problem.
 Even though the connection is closed, the state hasn't yet been purged.
  Refer to pf.conf(5) for how to adjust tcp.closed so the state is purged
 sooner, or adjust the available dynamic port range (sysctl
 net.inet.ip.portrange).
 
 I don't know if this is intended behavior or not.  I've never run into
 it on OpenBSD, but pf is integrated much more tightly into their system
 obviously and I'm guessing their port reuse code is pretty different too.

From: Laurent Frigault <lfrigault@agneau.org>
To: Kian Mohageri <kian@restek.wwu.edu>
Cc: bug-followup@FreeBSD.org
Subject: Re: kern/121668: connect randomly fails with EPERM with some pf rules
Date: Thu, 13 Mar 2008 23:49:43 +0100

 On Thu, Mar 13, 2008 at 12:44:48PM -0700, Kian Mohageri wrote:
 > >> I remember similar behavior and it was caused by source port reuse
 > >> on the client (so the new connection caused a state mismatch on an
 > >> old state).
 > > 
 > > The previous connections are closed.
 > > If the source port can't be reused yet, then the kernel should use
 > > another one for the new connection. If it can, then pf should allow it.
 > > 
 > > If the connect (SYN) does not match an existing state, the pf rule
 > > should create a new state.
 > 
 > It does "match" a state (source/dest is same), which is the problem.
 
 ok.
 
 > Even though the connection is closed, the state hasn't yet been purged.
 > Refer to pf.conf(5) for how to adjust tcp.closed so the state is purged
 > sooner, or adjust the available dynamic port range (sysctl
 > net.inet.ip.portrange).
 
 I tried disabling net.inet.ip.portrange.randomized and setting the
 tcp.closed timeout to 0.
 
 That seems to work around the problem in most cases.
 
 Is there any risk in setting the timeout to 0?
 
 > I don't know if this is intended behavior or not.  I've never run into
 > it on OpenBSD, but pf is integrated much more tightly into their
 > system obviously and I'm guessing their port reuse code is pretty
 > different too.
 
 Maybe the port randomization is different too.
 
 -- 
 Laurent Frigault | <url:http://www.agneau.org/>

From: Laurent Frigault <lfrigault@agneau.org>
To: Max Laier <max@love2party.net>
Cc: bug-followup@freebsd.org
Subject: Re: kern/121668: connect randomly fails with EPERM with some pf rules
Date: Fri, 14 Mar 2008 00:20:00 +0100

 On Thu, Mar 13, 2008 at 08:26:39PM +0100, Max Laier wrote:
 > > sysctl net.inet.tcp.nolocaltimewait=1
 > > not needed, but helps to reproduce the problem with client and server
 > > on the same computer.
 > 
 > Okay, now this is just asking for trouble.  pf does thorough checks on TCP
 > states, one of which is to enforce the 2MSL quiet time before port reuse.
 > If you set the above sysctl you specifically ask FreeBSD to break that rule
 > and thus cause pf to bark.
 
 The nolocaltimewait=1 setting was only there to help reproduce the problem.
 
 > You can also hit the issue if you have a large number of (consecutive) 
 > connections between two hosts (e.g. [poorly configured] squid -> 
 > www-backends, mysql, ...).  The solution is to:
 
 I discovered this problem with connections between CGI scripts and a mysql
 server.
 
 >  1) Reduce the connection spree and use one permanent connection
 
 Not always possible with CGI.
 
 >  2) Increase the ephemeral port range net.inet.ip.portrange.hi{first,last}
 
 Interesting point. Lowering first seems to help. Disabling
 net.inet.ip.portrange.randomized helps a lot too.
 
 >  3) Decrease the pf state timeout tcp.{closing,closed} in order to relax 
 > the check.  You can do this globally and on a per-rule basis.
 
 I've set closed to 1 and closing to 30.
 
 That helps too.
 
 It does not seem possible to set tcp.closed to 0 on a per-rule basis.
 This is accepted:
 pass out quick on lo0 proto tcp from any to any port 9 flags S/SA keep state ( tcp.closing 30 , tcp.closed 0 )
 
 But pfctl -srules -vvv prints:
 @0 pass out quick on lo0 proto tcp from any to any port = discard flags S/SA keep state (tcp.closing 30)
   [ Evaluations: 1         Packets: 0         Bytes: 0           States: 0     ]
   [ Inserted: uid 0 pid 51151 ]
 
 The tcp.closed option seems to be ignored.
 
 It works with tcp.closed set to 1.
 
 Regards,
 -- 
 Laurent Frigault | <url:http://www.agneau.org/>
State-Changed-From-To: open->closed 
State-Changed-By: mlaier 
State-Changed-When: Fri Mar 14 11:33:49 UTC 2008 
State-Changed-Why:  
Further discussion belongs to freebsd-pf@, thanks. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=121668 
>Unformatted:
