From youngh@brak.caida.org  Thu Jul 27 22:06:43 2006
Return-Path: <youngh@brak.caida.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id CD27616A4DA
	for <FreeBSD-gnats-submit@freebsd.org>; Thu, 27 Jul 2006 22:06:43 +0000 (UTC)
	(envelope-from youngh@brak.caida.org)
Received: from brak.caida.org (brak.caida.org [192.172.226.88])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 7C37843D46
	for <FreeBSD-gnats-submit@freebsd.org>; Thu, 27 Jul 2006 22:06:43 +0000 (GMT)
	(envelope-from youngh@brak.caida.org)
Received: from brak.caida.org (localhost.caida.org [127.0.0.1])
	by brak.caida.org (8.13.6/8.13.6) with ESMTP id k6RM6eDR001617
	for <FreeBSD-gnats-submit@freebsd.org>; Thu, 27 Jul 2006 15:06:40 -0700 (PDT)
	(envelope-from youngh@brak.caida.org)
Received: (from youngh@localhost)
	by brak.caida.org (8.13.6/8.13.6/Submit) id k6RM6euW001616;
	Thu, 27 Jul 2006 15:06:40 -0700 (PDT)
	(envelope-from youngh)
Message-Id: <200607272206.k6RM6euW001616@brak.caida.org>
Date: Thu, 27 Jul 2006 15:06:40 -0700 (PDT)
From: Young Hyun <youngh@caida.org>
Reply-To: Young Hyun <youngh@caida.org>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: passing file descriptor over datagram UNIX domain socket crashes kernel
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         100940
>Category:       kern
>Synopsis:       passing file descriptor over datagram UNIX domain socket crashes kernel
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    rwatson
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Jul 27 22:10:16 GMT 2006
>Closed-Date:    Thu Mar 01 11:13:30 GMT 2007
>Last-Modified:  Thu Mar 01 11:13:30 GMT 2007
>Originator:     Young Hyun
>Release:        FreeBSD 6.1-STABLE-200607 i386
>Organization:
CAIDA/UCSD
>Environment:
System: FreeBSD brak.caida.org 6.1-STABLE-200607 FreeBSD 6.1-STABLE-200607 #0: Wed Jul 26 18:32:47 PDT 2006 root@brak.caida.org:/usr/src/sys/i386/compile/SMP i386


	
>Description:

Passing file descriptors over a SOCK_DGRAM UNIX domain socket crashes
the kernel with "Fatal trap 12: page fault while in kernel mode".
This bug is probably closely related to, but not identical to,
PR kern/93914.  In PR 93914, the sample code passes file descriptors
over a SOCK_STREAM socket, and it no longer crashes the kernel under
6.1-STABLE.  My own SOCK_STREAM-based test program, which crashes the 5.4
kernel, also fails to crash the 6.1 kernel.  In the present case, the
kernel crashes in code specific to SOCK_DGRAM sockets: as the kgdb
session below shows, unp->unp_conn is NULL when uipc_send() dereferences
it.  (That does not rule out a single underlying problem behind both this
PR and PR 93914, one that merely became masked for the SOCK_STREAM case
in 6.1-STABLE.)

#0  doadump () at pcpu.h:165
#1  0xc06632ca in boot (howto=260) at ../../../kern/kern_shutdown.c:409
#2  0xc0663621 in panic (fmt=0xc08913da "%s")
    at ../../../kern/kern_shutdown.c:565
#3  0xc085bfd6 in trap_fatal (frame=0xde946b54, eva=8)
    at ../../../i386/i386/trap.c:836
#4  0xc085bcdf in trap_pfault (frame=0xde946b54, usermode=0, eva=8)
    at ../../../i386/i386/trap.c:744
#5  0xc085b8d5 in trap (frame=
      {tf_fs = -1017839608, tf_es = -1014300632, tf_ds = 40, tf_edi = -1016513388, tf_esi = -1017187284, tf_ebp = -560698452, tf_isp = -560698496, tf_ebx = 0, tf_edx = 0, tf_ecx = 0, tf_eax = 0, tf_trapno = 12, tf_err = 0, tf_eip = -1066768486, tf_cs = 32, tf_eflags = 66118, tf_esp = -560698440, tf_ss = -1065577831})
    at ../../../i386/i386/trap.c:434
#6  0xc08484da in calltrap () at ../../../i386/i386/exception.s:139
#7  0xc06a679a in uipc_send (so=0xc35ef42c, flags=0, m=0xc38f3500, 
    nam=0xc35474f0, control=0xc351b900, td=0xc3552900)
    at ../../../kern/uipc_usrreq.c:432
#8  0xc069dc8f in sosend (so=0xc35ef42c, addr=0xc35474f0, uio=0xde946c3c, 
    top=0xc38f3500, control=0xc351d600, flags=0, td=0xc3552900)
    at ../../../kern/uipc_socket.c:836
#9  0xc06a36a5 in kern_sendit (td=0xc3552900, s=3, mp=0xde946cb4, flags=0, 
    control=0xc351d600, segflg=UIO_USERSPACE)
    at ../../../kern/uipc_syscalls.c:772
#10 0xc06a355f in sendit (td=0xc3552900, s=3, mp=0xde946cb4, flags=0)
    at ../../../kern/uipc_syscalls.c:712
#11 0xc06a3976 in sendmsg (td=0xc3552900, uap=0xde946d04)
    at ../../../kern/uipc_syscalls.c:920
#12 0xc085c31b in syscall (frame=
      {tf_fs = 671350843, tf_es = -1078001605, tf_ds = -1078001605, tf_edi = 671410632, tf_esi = -1077940984, tf_ebp = -1077941096, tf_isp = -560698012, tf_ebx = 1, tf_edx = 17, tf_ecx = 17, tf_eax = 28, tf_trapno = 32, tf_err = 2, tf_eip = 672073267, tf_cs = 51, tf_eflags = 658, tf_esp = -1077941300, tf_ss = 59})
    at ../../../i386/i386/trap.c:981
#13 0xc084852f in Xint0x80_syscall () at ../../../i386/i386/exception.s:200
#14 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) up 7
#7  0xc06a679a in uipc_send (so=0xc35ef42c, flags=0, m=0xc38f3500, 
    nam=0xc35474f0, control=0xc351b900, td=0xc3552900)
    at ../../../kern/uipc_usrreq.c:432
432                     so2 = unp->unp_conn->unp_socket;
(kgdb) p unp->unp_conn
$1 = (struct unpcb *) 0x0
(kgdb) p *unp
$2 = {unp_link = {le_next = 0xc3693a64, le_prev = 0xc09a7098}, 
  unp_socket = 0xc35ef42c, unp_vnode = 0xc38d2660, unp_ino = 0, 
  unp_conn = 0x0, unp_refs = {lh_first = 0x0}, unp_reflink = {le_next = 0x0, 
    le_prev = 0xc36938d8}, unp_addr = 0xc3547390, unp_cc = 0, unp_mbcnt = 0, 
  unp_gencnt = 100, unp_flags = 0, unp_peercred = {cr_version = 0, cr_uid = 0, 
    cr_ngroups = 0, cr_groups = {0 <repeats 16 times>}, _cr_unused1 = 0x0}}
(kgdb)

>How-To-Repeat:

Execute the attached sample code, and let it run for a few minutes.  The
sample code is the same one supplied for PR kern/91224, except there it
caused a crash in a different part of the kernel (PR 91224 is closed).

>Fix:

	

--- AF_UNIX-FreeBSD-crash.c begins here ---
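#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define PATH "/tmp/123"
#define PATH_TMP "/tmp/123.tmp"
#define SOME_FILE "/etc/passwd"

/* A control message header followed directly by one file descriptor. */
struct mycmsghdr {
	struct cmsghdr hdr;
	int fd;
};

void server(void);
void client(void);

int main(void)
{
	switch (fork()) {
	case -1:
		printf("fork error %d\n", errno);
		break;
	case 0:
		/* Child: request descriptors from the server forever. */
		for (;;)
			client();
	default:
		/* Parent: answer every request with a descriptor. */
		server();
	}
	return 0;
}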

void server(void)
{
	struct sockaddr_un addr;
	struct msghdr mymsghdr;
	struct mycmsghdr ancdbuf;
	char data[1];
	int sockfd, fd;
	socklen_t len;

	/* Remove any stale socket file left by a previous run. */
	if (unlink(PATH) == -1)
		printf("unlink error %d\n", errno);

	if ((sockfd = socket(AF_UNIX, SOCK_DGRAM, 0)) == -1)
		printf("socket error %d\n", errno);

	strcpy(addr.sun_path, PATH);
	addr.sun_len = sizeof(addr.sun_len) + sizeof(addr.sun_family)
	    + strlen(addr.sun_path);
	addr.sun_family = AF_UNIX;

	if (bind(sockfd, (struct sockaddr *)&addr, addr.sun_len) == -1)
		printf("bind error %d\n", errno);

	for (;;) {
		if ((fd = open(SOME_FILE, O_RDONLY)) == -1)
			printf("open file error %d\n", errno);

		/* Wait for a one-byte request and record the sender's address. */
		len = sizeof(addr.sun_path);
		if (recvfrom(sockfd, &data, sizeof(data), 0,
		    (struct sockaddr *)&addr, &len) == -1)
			printf("recvfrom error %d\n", errno);

		/* Build an SCM_RIGHTS control message carrying fd. */
		ancdbuf.hdr.cmsg_len = sizeof(ancdbuf);
		ancdbuf.hdr.cmsg_level = SOL_SOCKET;
		ancdbuf.hdr.cmsg_type = SCM_RIGHTS;
		ancdbuf.fd = fd;

		/* No data payload: the reply consists of control data only. */
		mymsghdr.msg_name = (caddr_t)&addr;
		mymsghdr.msg_namelen = len;
		mymsghdr.msg_iov = NULL;
		mymsghdr.msg_iovlen = 0;
		mymsghdr.msg_control = (caddr_t)&ancdbuf;
		mymsghdr.msg_controllen = ancdbuf.hdr.cmsg_len;
		mymsghdr.msg_flags = 0;

		/*
		 * Reply to the sender's address.  This sendmsg() is the call
		 * that appears in the kernel backtrace.
		 */
		if (sendmsg(sockfd, &mymsghdr, 0) == -1)
			printf("sendmsg error %d\n", errno);

		close(fd);
	}
}

void client(void)
{
	struct sockaddr_un addr_s, addr_c;
	struct mycmsghdr ancdbuf;
	struct msghdr mymsghdr;
	char data[1];
	int sockfd, fd;

	if ((sockfd = socket(AF_UNIX, SOCK_DGRAM, 0)) == -1)
		printf("socket error %d\n", errno);

	/* Remove the socket file left by the previous iteration. */
	if (unlink(PATH_TMP) == -1)
		printf("unlink error %d\n", errno);

	strcpy(addr_c.sun_path, PATH_TMP);
	addr_c.sun_len = sizeof(addr_c.sun_len) + sizeof(addr_c.sun_family)
	    + strlen(addr_c.sun_path);
	addr_c.sun_family = AF_UNIX;

	strcpy(addr_s.sun_path, PATH);
	addr_s.sun_len = sizeof(addr_s.sun_len) + sizeof(addr_s.sun_family)
	    + strlen(addr_s.sun_path);
	addr_s.sun_family = AF_UNIX;

	/* Bind to our own path so the server has an address to reply to. */
	if (bind(sockfd, (struct sockaddr *)&addr_c, addr_c.sun_len) == -1)
		printf("bind error %d\n", errno);

	/* Send a one-byte request; its content is irrelevant. */
	if (sendto(sockfd, &data, sizeof(data), 0, (struct sockaddr *)&addr_s,
	    addr_s.sun_len) == -1)
		printf("sendto error %d\n", errno);

	/* Prepare room for one SCM_RIGHTS control message. */
	ancdbuf.hdr.cmsg_len = sizeof(ancdbuf);
	ancdbuf.hdr.cmsg_level = SOL_SOCKET;
	ancdbuf.hdr.cmsg_type = SCM_RIGHTS;

	mymsghdr.msg_name = NULL;
	mymsghdr.msg_namelen = 0;
	mymsghdr.msg_iov = NULL;
	mymsghdr.msg_iovlen = 0;
	mymsghdr.msg_control = (caddr_t)&ancdbuf;
	mymsghdr.msg_controllen = ancdbuf.hdr.cmsg_len;
	mymsghdr.msg_flags = 0;

	if (recvmsg(sockfd, &mymsghdr, 0) == -1)
		printf("recvmsg error %d\n", errno);

	/* Close the passed descriptor and the socket, then start over. */
	fd = ancdbuf.fd;

	close(fd);
	close(sockfd);
}
--- AF_UNIX-FreeBSD-crash.c ends here ---
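
For comparison, here is a minimal sketch of the same SCM_RIGHTS exchange
written with the CMSG_SPACE, CMSG_LEN and CMSG_DATA macros instead of a
hand-built struct, so that the control buffer is sized and aligned by the
system headers.  The names send_fd, recv_fd and demo_fd_roundtrip are
hypothetical and are not part of the submitted program.  The sketch uses a
connected socketpair(), so it does not exercise the connect-and-send path
that panics; it only illustrates the descriptor-passing mechanism.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer for exactly one descriptor; the union forces alignment. */
union fd_ctl {
    struct cmsghdr hdr;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send fd as SCM_RIGHTS ancillary data with a one-byte payload. */
static int send_fd(int sock, int fd)
{
    union fd_ctl ctl;
    char byte = 0;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    struct msghdr msg;
    struct cmsghdr *cm;

    memset(&ctl, 0, sizeof(ctl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    cm = CMSG_FIRSTHDR(&msg);
    assert(cm != NULL);              /* the buffer holds one header */
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;
    memcpy(CMSG_DATA(cm), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == -1 ? -1 : 0;
}

/* Receive one descriptor; returns it, or -1 on error or if none arrived. */
static int recv_fd(int sock)
{
    union fd_ctl ctl;
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    struct msghdr msg;
    struct cmsghdr *cm;
    int fd = -1;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    if (recvmsg(sock, &msg, 0) == -1)
        return -1;
    cm = CMSG_FIRSTHDR(&msg);
    if (cm == NULL || cm->cmsg_level != SOL_SOCKET ||
        cm->cmsg_type != SCM_RIGHTS || cm->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cm), sizeof(int));
    return fd;
}

/* Pass a pipe's read end across a socketpair and read through the copy. */
static int demo_fd_roundtrip(void)
{
    int sv[2], p[2], got;
    char c = 0;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == -1 || pipe(p) == -1)
        return -1;
    if (send_fd(sv[0], p[0]) == -1 || (got = recv_fd(sv[1])) == -1)
        return -1;
    /* The received descriptor refers to the same pipe as p[0]. */
    if (write(p[1], "x", 1) != 1 || read(got, &c, 1) != 1)
        return -1;
    close(got); close(p[0]); close(p[1]); close(sv[0]); close(sv[1]);
    return c == 'x' ? 0 : -1;
}
```

Calling demo_fd_roundtrip() from a main() returns 0 when the descriptor
made the round trip and -1 otherwise.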


>Release-Note:
>Audit-Trail:

From: Maxim Konovalov <maxim@macomnet.ru>
To: Young Hyun <youngh@caida.org>
Cc: bug-followup@freebsd.org
Subject: Re: kern/100940: passing file descriptor over datagram UNIX domain
 socket crashes kernel
Date: Sun, 30 Jul 2006 00:21:32 +0400 (MSD)

 Hello,
 
 [...]
 
 While 6.1-RELEASE panics on your test case, I could not reproduce the
 panic on today's 6.1-STABLE.  Could you please upgrade your system and
 test again?  Thanks!
 
 -- 
 Maxim Konovalov
Responsible-Changed-From-To: freebsd-bugs->rwatson 
Responsible-Changed-By: rwatson 
Responsible-Changed-When: Mon Jul 31 17:27:10 UTC 2006 
Responsible-Changed-Why:  
Grab ownership of this PR, since I have a strong interest in the UNIX 
domain socket code.  The problem as seen here with WITNESS in place on 
a 7.x kernel is: 

tiger-1# ./fd_passing  
Kernel page fault with the following non-sleepable locks held: 
exclusive sleep mutex unp r = 0 (0xc0a56714) locked @ kern/uipc_usrreq.c:999 
KDB: stack backtrace: 
kdb_backtrace(1,c7193b04,c,c7164a20,e974fad4,...) at kdb_backtrace+0x29 
witness_warn(5,0,c094452a) at witness_warn+0x192 
trap(c7160008,c76d0028,28,0,c7497578,...) at trap+0x108 
calltrap() at calltrap+0x5 
--- trap 0xc, eip = 0xc06e38ff, esp = 0xe974fb1c, ebp = 0xe974fb44 --- 
uipc_send(c76f4530,0,c73e0200,c7121160,c73e0500,c7164a20) at uipc_send+0xdb 
sosend_generic(c76f4530,c7121160,e974fbe4,c73e0200,c73e0300,...) at sosend_generic+0x3e5 
sosend(c76f4530,c7121160,e974fbe4,0,c73e0300,0,c7164a20) at sosend+0x3c 
kern_sendit(c7164a20,3,e974fc5c,0,c73e0300,0) at kern_sendit+0x101 
sendit(c7164a20,3,e974fc5c,0,c7121170,...) at sendit+0x87 
sendmsg(c7164a20,e974fd04) at sendmsg+0x53 
syscall(3b,3b,3b,bfbfed5c,bfbfed54,...) at syscall+0x256 
Xint0x80_syscall() at Xint0x80_syscall+0x1f 
--- syscall (28, FreeBSD ELF32, sendmsg), eip = 0x2812fab3, esp = 0xbfbfec1c, ebp = 0xbfbfece8 --- 


Fatal trap 12: page fault while in kernel mode 
cpuid = 2; apic id = 06 
fault virtual address   = 0x8 
fault code              = supervisor read, page not present 
instruction pointer     = 0x20:0xc06e38ff 
stack pointer           = 0x28:0xe974fb1c 
frame pointer           = 0x28:0xe974fb44 
code segment            = base 0x0, limit 0xfffff, type 0x1b 
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0 
current process         = 901 (fd_passing) 
[thread pid 901 tid 100075 ] 
Stopped at      uipc_send+0xdb: movl    0x8(%ecx),%edi 
db> bt 
Tracing pid 901 tid 100075 td 0xc7164a20 
uipc_send(c76f4530,0,c73e0200,c7121160,c73e0500,c7164a20) at uipc_send+0xdb 
sosend_generic(c76f4530,c7121160,e974fbe4,c73e0200,c73e0300,...) at sosend_generic+0x3e5 
sosend(c76f4530,c7121160,e974fbe4,0,c73e0300,0,c7164a20) at sosend+0x3c 
kern_sendit(c7164a20,3,e974fc5c,0,c73e0300,0) at kern_sendit+0x101 
sendit(c7164a20,3,e974fc5c,0,c7121170,...) at sendit+0x87 
sendmsg(c7164a20,e974fd04) at sendmsg+0x53 
syscall(3b,3b,3b,bfbfed5c,bfbfed54,...) at syscall+0x256 
Xint0x80_syscall() at Xint0x80_syscall+0x1f 
--- syscall (28, FreeBSD ELF32, sendmsg), eip = 0x2812fab3, esp = 0xbfbfec1c, ebp = 0xbfbfece8 --- 
db> show alllocks 
Process 901 (fd_passing) thread 0xc7164a20 (100075) 
exclusive sleep mutex unp r = 0 (0xc0a56714) locked @ kern/uipc_usrreq.c:999 

995         if (vp != NULL) 
996                 vput(vp); 
997         mtx_unlock(&Giant); 
998         free(sa, M_SONAME); 
999         UNP_LOCK(); 
1000         unp->unp_flags &= ~UNP_CONNECTING; 
1001         return (error); 
1002 } 
1003 
1004 static int 
1005 unp_connect2(struct socket *so, struct socket *so2, int req) 
1006 { 

(gdb) l *0xc06e38ff 
0xc06e38ff is in uipc_send (../../../kern/uipc_usrreq.c:609). 
604                                     error = ENOTCONN; 
605                                     break; 
606                             } 
607                     } 
608                     unp2 = unp->unp_conn; 
609                     so2 = unp2->unp_socket; 
610                     if (unp->unp_addr != NULL) 
611                             from = (struct sockaddr *)unp->unp_addr; 
612                     else 
613                             from = &sun_noname; 
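
The fault address is consistent with that source line.  The sketch below
is only an illustration under stated assumptions: model_unpcb is a
hypothetical stand-in that copies just the leading field order kgdb
printed for *unp in the original report (the unp_link pair, then
unp_socket); it is not the kernel's struct unpcb.  Reading unp_socket
through a NULL pointer touches the address equal to the member's offset,
which is 8 when pointers are 4 bytes wide, as on i386.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical model of the first fields of struct unpcb, in the order
 * shown by "p *unp": a two-pointer list entry, then the socket pointer.
 */
struct model_unpcb {
    struct {
        void *le_next;
        void *le_prev;
    } unp_link;
    void *unp_socket;
};

/* Address that is read when unp_socket is fetched through "base". */
static size_t member_fault_address(size_t base)
{
    /* The list entry is the first member, so it sits at offset 0. */
    assert(offsetof(struct model_unpcb, unp_link) == 0);
    return base + offsetof(struct model_unpcb, unp_socket);
}

/* Print the address a NULL base pointer would lead to. */
static void print_fault_address(void)
{
    printf("NULL->unp_socket reads address 0x%zx\n",
        member_fault_address(0));
}
```

On a build with 4-byte pointers the offset is 8, which matches both
"fault virtual address = 0x8" and the faulting instruction
"movl 0x8(%ecx),%edi"; with 8-byte pointers the same model gives 16.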

The problem appears to be that unp_connect() drops the UNIX domain socket
subsystem lock while discarding the vnode reference for the remote socket
and reacquires it only at line 999 above, so the peer can disconnect the
socket before the send proceeds.  uipc_send() then dereferences a NULL
unp->unp_conn at line 609.  Probably the answer is to add a check for a
NULL unp->unp_conn pointer and return an appropriate error, as the
connect() and send() cannot be performed atomically.  I will follow up
with a patch.
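
The window described above can be modelled in user space.  This is a toy
sketch, not kernel code: toy_pcb, toy_sock, toy_connect, toy_disconnect,
toy_send and toy_sendto are all hypothetical names, and the
peer_closes_in_window flag stands in for another thread closing the peer
while the lock is dropped.  It shows only the proposed revalidation: the
send step re-reads the connection pointer and reports ENOTCONN instead of
dereferencing NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_sock {
    int id;
};

/* Hypothetical protocol control block: a peer link and the owning socket. */
struct toy_pcb {
    struct toy_pcb *conn;
    struct toy_sock *sock;
};

/* Connect step: link the two control blocks to each other. */
static void toy_connect(struct toy_pcb *a, struct toy_pcb *b)
{
    a->conn = b;
    b->conn = a;
}

/* Disconnect step: clear both directions of the link, if any. */
static void toy_disconnect(struct toy_pcb *a)
{
    if (a->conn != NULL) {
        a->conn->conn = NULL;
        a->conn = NULL;
    }
}

/* Send step: revalidate the peer pointer before using it. */
static int toy_send(struct toy_pcb *pcb, struct toy_sock **dst)
{
    struct toy_pcb *peer = pcb->conn;

    assert(dst != NULL);
    if (peer == NULL)
        return ENOTCONN;    /* peer vanished after the connect step */
    *dst = peer->sock;
    return 0;
}

/* A sendto()-like operation: connect, a window without the lock, send. */
static int toy_sendto(struct toy_pcb *from, struct toy_pcb *to,
    int peer_closes_in_window, struct toy_sock **dst)
{
    int error;

    toy_connect(from, to);
    if (peer_closes_in_window)
        toy_disconnect(to); /* models the racing close() */
    error = toy_send(from, dst);
    toy_disconnect(from);   /* the temporary connection is torn down */
    return error;
}
```

Without the NULL check in toy_send(), the second case would dereference a
NULL peer, which is the user-space analogue of the page fault at
uipc_usrreq.c:609.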

http://www.freebsd.org/cgi/query-pr.cgi?pr=100940 
State-Changed-From-To: open->patched 
State-Changed-By: rwatson 
State-Changed-When: Mon Jul 31 17:58:06 UTC 2006 
State-Changed-Why:  
The attached patch is for 7-CURRENT, but with minor tweaking should be
applicable to 6-STABLE.  The essence is that sendto() is non-atomic, so
the connection may be closed between the connect() portion of sendto()
and the send() portion.  This patch detects that case and returns a
not-connected error (ENOTCONN).  I'm not entirely convinced this is the
right errno; possibly it should be ECONNRESET, as the connection has
closed.  Feedback as to whether this patch (modified as necessary to run
on 6.x) prevents the panic would be most welcome.


State-Changed-From-To: patched->feedback 
State-Changed-By: rwatson 
State-Changed-When: Mon Jul 31 18:00:36 UTC 2006 
State-Changed-Why:  
Change to feedback to request feedback; attach patch previously 
omitted. 

Index: uipc_usrreq.c 
=================================================================== 
RCS file: /zoo/cvsup/FreeBSD-CVS/src/sys/kern/uipc_usrreq.c,v 
retrieving revision 1.182 
diff -u -r1.182 uipc_usrreq.c 
--- uipc_usrreq.c	26 Jul 2006 19:16:34 -0000	1.182 
+++ uipc_usrreq.c	31 Jul 2006 17:51:45 -0000 
@@ -605,7 +605,18 @@
 				break;
 			}
 		}
+		/*
+		 * Because connect() and send() are non-atomic in a sendto()
+		 * with a target address, it's possible that the socket will
+		 * have disconnected before the send() can run.  In that case
+		 * return the slightly counter-intuitive but otherwise
+		 * correct error that the socket is not connected.
+		 */
 		unp2 = unp->unp_conn;
+		if (unp2 == NULL) {
+			error = ENOTCONN;
+			break;
+		}
 		so2 = unp2->unp_socket;
 		if (unp->unp_addr != NULL)
 			from = (struct sockaddr *)unp->unp_addr;
@@ -650,9 +661,18 @@
 			error = EPIPE;
 			break;
 		}
+		/*
+		 * Because connect() and send() are non-atomic in a sendto()
+		 * with a target address, it's possible that the socket will
+		 * have disconnected before the send() can run.  In that case
+		 * return the slightly counter-intuitive but otherwise
+		 * correct error that the socket is not connected.
+		 */
 		unp2 = unp->unp_conn;
-		if (unp2 == NULL)
-			panic("uipc_send connected but no connection?");
+		if (unp2 == NULL) {
+			error = ENOTCONN;
+			break;
+		}
 		so2 = unp2->unp_socket;
 		SOCKBUF_LOCK(&so2->so_rcv);
 		if (unp2->unp_flags & UNP_WANTCRED) {


This patch can also be downloaded from: 

http://www.watson.org/~robert/freebsd/netperf/20060731-uipc-con-disconn-race.diff 



From: Robert Watson <rwatson@FreeBSD.org>
To: Young Hyun <youngh@caida.org>
Cc: FreeBSD-gnats-submit@freebsd.org
Subject: Re: kern/100940: passing file descriptor over datagram UNIX domain
 socket crashes kernel
Date: Fri, 10 Nov 2006 17:02:54 +0000 (GMT)

 On Mon, 31 Jul 2006, Young Hyun wrote:
 
 > Thank you very much for looking into this problem and fixing it so quickly! 
 > I'm currently designing a secure distributed network measurement 
 > infrastructure in which file descriptor passing forms the foundation of the 
 > security architecture, so I'm very dependent on this feature working 
 > properly.  I'll try your patch on 6.1-STABLE and follow up on the PR. 
 > Thanks again.
 
 Young,
 
 Any luck checking up on this bug?  I'd like to MFC something to RELENG_6.
 
 Thanks,
 
 Robert N M Watson
 Computer Laboratory
 University of Cambridge
 
 >
 > --Young
 >
 > On Jul 31, 2006, at 6:00 PM, Robert Watson wrote:
 >
 >> Synopsis: passing file descriptor over datagram UNIX domain socket crashes 
 >> kernel
 >> 
 >> State-Changed-From-To: open->patched
 >> State-Changed-By: rwatson
 >> State-Changed-When: Mon Jul 31 17:58:06 UTC 2006
 >> State-Changed-Why:
 >> The attached patch is for 7-CURRENT, but with minor tweaking should be
 >> applicable to 6-STABLE.  The essence is that sendto() is non-atomic, so
 >> the connection may be closed between the connect() portion of sendto()
 >> and the send() portion.  This patch detects that case and returns a
 >> not-connected error.  I'm not entirely convinced this is the right
 >> errno, possibly it should be ECONNRESET as the connection has closed.
 >> Feedback as to whether this patch (modified as necessary to run on 6.x)
 >> prevents the panic would be most welcome.
 >> 
 >> 
 >> http://www.freebsd.org/cgi/query-pr.cgi?pr=100940
 >

From: Young Hyun <youngh@caida.org>
To: bug-followup@FreeBSD.org,
 Young Hyun <youngh@caida.org>
Cc:  
Subject: Re: kern/100940: passing file descriptor over datagram UNIX domain socket crashes kernel
Date: Tue, 14 Nov 2006 10:41:31 -0800

 I tried Robert Watson's patch on FreeBSD 6.1-STABLE-200607 (the  
 kernel for which the PR was submitted), and it seems to have fixed  
 the crash for my test program.  I ran the test program overnight for  
 good measure, and as expected, I got the occasional ENOTCONN which  
 the patched kernel now returns but no crash.
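
A caller that replies to short-lived peers can treat the new error as
"the peer went away" rather than as a fatal failure.  The following is a
minimal sketch under assumptions: send_reply is a hypothetical helper,
ENOTCONN is what the patched kernel returns, ECONNRESET is the
alternative discussed in this PR, and whether a dropped reply is
acceptable depends on the application.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>    /* struct iovec, for callers building a msghdr */

/*
 * Hypothetical helper: send one reply datagram.
 * Returns 1 if it was sent, 0 if the peer disappeared before delivery
 * (the reply is simply dropped), and -1 on any other error.
 */
static int send_reply(int sock, const struct msghdr *msg)
{
    assert(msg != NULL);
    if (sendmsg(sock, msg, 0) != -1)
        return 1;
    switch (errno) {
    case ENOTCONN:      /* returned by the patched kernel for this race */
    case ECONNRESET:    /* the alternative errno mentioned in the PR */
        return 0;
    default:
        return -1;
    }
}
```

In the test program's server loop, a return value of 0 would mean
closing the descriptor that was about to be passed and waiting for the
next request.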
 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/100940: commit references a PR
Date: Thu,  1 Mar 2007 11:07:29 +0000 (UTC)

 rwatson     2007-03-01 11:07:18 UTC
 
   FreeBSD src repository
 
   Modified files:        (Branch: RELENG_6)
     sys/kern             uipc_usrreq.c 
   Log:
   Merge uipc_usrreq.c:1.183 from HEAD to RELENG_6:
   
     Close a race that occurs when using sendto() to connect and send on a
     UNIX domain socket at the same time as the remote end is closing the
     new connections as quickly as they open.  Since the connect() and
     send() paths are non-atomic with respect to one another, it is possible
     for the second thread's close() call to disconnect the two sockets
     as connect() returns, leaving the consumer (which plans to send())
     with a NULL kernel pointer to its proposed peer.  As a result, after
     acquiring the UNIX domain socket subsystem lock, we need to revalidate
     the connection pointers even though connect() has technically succeeded,
     and return an error to say that there's no connection on which to
     perform the send.
   
     We might want to rethink the specific errno number, perhaps ECONNRESET
     would be better.
   
     Reported by:    Young Hyun <youngh at caida dot org>
   
   PR:             100940
   
   Revision   Changes    Path
   1.155.2.9  +23 -2     src/sys/kern/uipc_usrreq.c
 
State-Changed-From-To: feedback->closed 
State-Changed-By: rwatson 
State-Changed-When: Thu Mar 1 11:12:44 UTC 2007 
State-Changed-Why:  
I've now merged the fix to RELENG_6 as uipc_usrreq.c:1.155.2.9.  If you 
experience any further difficulties of this sort, please let me know. 


>Unformatted:
