Message-Id: <201304040232.r342WFTC020054@red.freebsd.org>
Date: Thu, 4 Apr 2013 02:32:15 GMT
From: Brian Demsky <bdemsky@uci.edu>
To: freebsd-gnats-submit@FreeBSD.org
Subject: Swapcontext can get compiled incorrectly
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         177624
>Category:       kern
>Synopsis:       [libc] [patch] Swapcontext can get compiled incorrectly
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Apr 04 02:40:00 UTC 2013
>Closed-Date:    
>Last-Modified:  Wed Apr 17 21:11:56 UTC 2013
>Originator:     Brian Demsky
>Release:        OS X distribution of libc
>Organization:
UCI
>Environment:
>Description:
Here is the code for swapcontext():

int
swapcontext(ucontext_t *oucp, const ucontext_t *ucp)
{
      int ret;

      if ((oucp == NULL) || (ucp == NULL)) {
              errno = EINVAL;
              return (-1);
      }
      oucp->uc_flags &= ~UCF_SWAPPED;
      ret = getcontext(oucp);
      if ((ret == 0) && !(oucp->uc_flags & UCF_SWAPPED)) {
              oucp->uc_flags |= UCF_SWAPPED;
              ret = setcontext(ucp);
      }
      return (ret);
}

On the OS X port of libc in Mac OSX 10.7.5, this gets compiled as:

0x00007fff901e86b2 <swapcontext+0>:     push   %r14
0x00007fff901e86b4 <swapcontext+2>:     push   %rbx
0x00007fff901e86b5 <swapcontext+3>:     sub    $0x8,%rsp
0x00007fff901e86b9 <swapcontext+7>:     test   %rdi,%rdi
0x00007fff901e86bc <swapcontext+10>:    je     0x7fff901e86c6 <swapcontext+20>
0x00007fff901e86be <swapcontext+12>:    mov    %rsi,%rbx
0x00007fff901e86c1 <swapcontext+15>:    test   %rbx,%rbx
0x00007fff901e86c4 <swapcontext+18>:    jne    0x7fff901e86d8 <swapcontext+38>
0x00007fff901e86c6 <swapcontext+20>:    callq  0x7fff90262c88 <__error>
0x00007fff901e86cb <swapcontext+25>:    movl   $0x16,(%rax)
0x00007fff901e86d1 <swapcontext+31>:    mov    $0xffffffff,%eax
0x00007fff901e86d6 <swapcontext+36>:    jmp    0x7fff901e86f3 <swapcontext+65>
0x00007fff901e86d8 <swapcontext+38>:    mov    %rdi,%r14
0x00007fff901e86db <swapcontext+41>:    andb   $0x7f,0x3(%r14)
0x00007fff901e86e0 <swapcontext+46>:    mov    %r14,%rdi
0x00007fff901e86e3 <swapcontext+49>:    callq  0x7fff901e87af <getcontext>
0x00007fff901e86e8 <swapcontext+54>:    test   %eax,%eax
0x00007fff901e86ea <swapcontext+56>:    jne    0x7fff901e86f3 <swapcontext+65>
0x00007fff901e86ec <swapcontext+58>:    mov    (%r14),%ecx
0x00007fff901e86ef <swapcontext+61>:    test   %ecx,%ecx
0x00007fff901e86f1 <swapcontext+63>:    jns    0x7fff901e86fb <swapcontext+73>
0x00007fff901e86f3 <swapcontext+65>:    add    $0x8,%rsp
0x00007fff901e86f7 <swapcontext+69>:    pop    %rbx
0x00007fff901e86f8 <swapcontext+70>:    pop    %r14
0x00007fff901e86fa <swapcontext+72>:    retq   
0x00007fff901e86fb <swapcontext+73>:    or     $0x80000000,%ecx
0x00007fff901e8701 <swapcontext+79>:    mov    %ecx,(%r14)
0x00007fff901e8704 <swapcontext+82>:    mov    %rbx,%rdi
0x00007fff901e8707 <swapcontext+85>:    add    $0x8,%rsp
0x00007fff901e870b <swapcontext+89>:    pop    %rbx
0x00007fff901e870c <swapcontext+90>:    pop    %r14
0x00007fff901e870e <swapcontext+92>:    jmpq   0x7fff90262855 <setcontext>

The problem is that rbx is callee-saved by the compiled version of swapcontext and then reused before getcontext is called.  getcontext then stores the wrong value for rbx, and setcontext later restores the wrong value for rbx.  If the caller had any value in rbx, it has been trashed at this point.

Brian
>How-To-Repeat:

>Fix:


>Release-Note:
>Audit-Trail:

From: Brian Demsky <bdemsky@uci.edu>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: misc/177624: Swapcontext can get compiled incorrectly
Date: Wed, 03 Apr 2013 20:01:14 -0700

 A quick note that the tail call to set context is what actually trashes
 the copies of rbx and r14 on the stack.
 

From: Eitan Adler <lists@eitanadler.com>
To: bug-followup <bug-followup@freebsd.org>
Cc:  
Subject: Re: misc/177624: Swapcontext can get compiled incorrectly
Date: Thu, 4 Apr 2013 08:11:34 -0400

 ---------- Forwarded message ----------
 From: Brian Demsky <bdemsky@uci.edu>
 Date: 4 April 2013 01:25
 Subject: Re: misc/177624: Swapcontext can get compiled incorrectly
 To: Eitan Adler <lists@eitanadler.com>
 Cc: freebsd-bugs@freebsd.org
 
 
 > The analysis is a little wrong about the problem.  Ultimately, the tail
 > call to set context trashes the copies of bx and r14 on the stack….
 
 -- 
 Eitan Adler

From: Bruce Evans <brde@optusnet.com.au>
To: Brian Demsky <bdemsky@uci.edu>
Cc: freebsd-gnats-submit@FreeBSD.org, freebsd-bugs@FreeBSD.org
Subject: Re: misc/177624: Swapcontext can get compiled incorrectly
Date: Fri, 5 Apr 2013 00:46:32 +1100 (EST)

 On Thu, 4 Apr 2013, Brian Demsky wrote:
 
 >> Description:
 > Here is the code for swap context:
 >
 > int
 > swapcontext(ucontext_t *oucp, const ucontext_t *ucp)
 > {
 >      int ret;
 >
 >      if ((oucp == NULL) || (ucp == NULL)) {
 >              errno = EINVAL;
 >              return (-1);
 >      }
 >      oucp->uc_flags &= ~UCF_SWAPPED;
 >      ret = getcontext(oucp);
 >      if ((ret == 0) && !(oucp->uc_flags & UCF_SWAPPED)) {
 >              oucp->uc_flags |= UCF_SWAPPED;
 >              ret = setcontext(ucp);
 >      }
 >      return (ret);
 > }
 
 > On the OS X port of libc in Mac OSX 10.7.5, this gets compiled as:
 
 > ...
 > 0x00007fff901e870b <swapcontext+89>:    pop    %rbx
 > 0x00007fff901e870c <swapcontext+90>:    pop    %r14
 > 0x00007fff901e870e <swapcontext+92>:    jmpq   0x7fff90262855 <setcontext>
 >
 > The problem is that rbx is callee saved by compiled version of swapcontext
 > and then reused before getcontext is called.  Getcontext then stores the
 > wrong value for rbx and setcontext later restores the wrong value for rbx.
 > If the caller had any value in rbx, it has been trashed at this point.
 
 Later you wrote:
 
 > The analysis is a little wrong about the problem.  Ultimately, the tail
 > call to set context trashes the copies of bx and r14 on the stack….
 
 The bug seems to be in setcontext().  It must preserve the callee-saved
 registers, not restore them.  This would happen automatically if more
 were written in C.  But setcontext() can't be written entirely in C,
 since it must save all callee-saved registers including ones not used
 and therefore not normally saved by any C function that it might be in,
 and possibly also including callee-saved registers for nonstandard or
 non-C ABIs.  In FreeBSD, it is apparently always a syscall.
 
 In FreeBSD, this bug doesn't occur on at least amd64 or i386 because the
 C version of swapcontext() has never been used on these arches.
 swapcontext() is a syscall too.
 
 If setcontext() is a syscall, then it has a minor problem even knowing
 what the ABI's callee-saved registers are.  At least the FreeBSD amd64
 version doesn't know anything about this.  It uses much the same code
 as for asynchronous signal handling, so it just restores all registers,
 including scratch ones that don't need to be preserved.  It even
 restores the return register to the trap frame, although it can't
 return this to userland.  This can probably be fixed by a library wrapper.
 
 In FreeBSD on amd64, getcontext(), setcontext() and swapcontext() are
 all syscalls, but their documentation is misplaced in a section 3 man
 page.  swapcontext() is misplaced together with makecontext(), which
 actually is a library function.  Oops, not quite.  getcontext() is
 actually a small wrapper around an undocumented getcontext() syscall
 (this is needed to adjust the instruction pointer register).  Only
 makecontext() is in C, and not a wrapper.  I don't count the wrappers
 that just make a syscall as library functions, since the corresponding
 syscalls can be made easily without using the C library and this should
 be documented.
 
 Bruce
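
For readers less familiar with the interfaces discussed above, the following is a minimal C sketch of how getcontext(), makecontext() and swapcontext() are typically combined by an application.  It assumes a system that still provides <ucontext.h>; run_coroutine_demo(), worker() and the 64 KiB stack size are hypothetical names and values chosen for this example, and nothing here reflects how any particular libc or kernel implements the calls.

```c
#include <stdlib.h>
#include <ucontext.h>

static ucontext_t main_ctx;	/* where the demo resumes after each switch */
static ucontext_t worker_ctx;	/* context prepared with makecontext() */
static int steps;		/* counts how far the worker has run */

/* Hypothetical worker: runs on its own stack and yields back once. */
static void
worker(void)
{
	steps++;
	/* Save the worker's state and resume main_ctx. */
	swapcontext(&worker_ctx, &main_ctx);
	steps++;
	/* Returning from here resumes uc_link, i.e. main_ctx. */
}

/* Returns the number of worker steps observed, or -1 on failure. */
int
run_coroutine_demo(void)
{
	size_t stack_size = 64 * 1024;	/* assumed to be large enough */
	void *stack = malloc(stack_size);

	if (stack == NULL)
		return (-1);
	steps = 0;
	if (getcontext(&worker_ctx) == -1) {
		free(stack);
		return (-1);
	}
	worker_ctx.uc_stack.ss_sp = stack;
	worker_ctx.uc_stack.ss_size = stack_size;
	worker_ctx.uc_link = &main_ctx;
	makecontext(&worker_ctx, worker, 0);

	/* First switch: the worker runs until it swaps back to us. */
	if (swapcontext(&main_ctx, &worker_ctx) == -1) {
		free(stack);
		return (-1);
	}
	/* Second switch: the worker finishes and uc_link resumes us. */
	if (swapcontext(&main_ctx, &worker_ctx) == -1) {
		free(stack);
		return (-1);
	}
	free(stack);
	return (steps);
}
```

If both switches succeed, run_coroutine_demo() should return 2: one step before the worker yields and one after it is resumed.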

From: Bruce Evans <brde@optusnet.com.au>
To: Bruce Evans <brde@optusnet.com.au>
Cc: Brian Demsky <bdemsky@uci.edu>, freebsd-bugs@freebsd.org,
        freebsd-gnats-submit@freebsd.org
Subject: Re: misc/177624: Swapcontext can get compiled incorrectly
Date: Fri, 5 Apr 2013 02:16:02 +1100 (EST)

 On Fri, 5 Apr 2013, Bruce Evans wrote:
 
 > The bug seems to be in setcontext().  It must preserve the callee-saved
 > registers, not restore them.  This would happen automatically if more
 > were written in C.  But setcontext() can't be written entirely in C,
 > since it must save all callee-saved registers including ones not used
 > and therefore not normally saved by any C function that it might be in,
 > and possibly also including callee-saved registers for nonstandard or
 > non-C ABIs.  In FreeBSD, it is apparently always a syscall.
 
 This is more than a little wrong.  When setcontext() succeeds, it
 doesn't return here.  Then it acts like longjmp() and must restore all
 the callee-saved registers to whatever they were when getcontext() was
 called.  Otherwise, it must not clobber any callee-saved registers (then
 it differs from longjmp(); longjmp() just can't fail).
 
 Now I don't see any bug here.  If the saved state is returned to, then
 it is as if getcontext() returned, and the intermediately-saved %rbx
 is correct (we will restore the original %rbx if we return).  If
 setcontext() fails, then it should preserve all callee-saved registers.
 In the tail-call case, we have already restored the original %rbx and
 the failing setcontext() should preserve that.
 
 Bruce
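
The "as if getcontext() returned" behaviour described above can be observed from user code.  The sketch below is only an illustration, assuming a platform with a working <ucontext.h>; count_getcontext_returns() is a made-up name.  A volatile counter kept in memory plays the same role as the UCF_SWAPPED flag in swapcontext(): it tells the first return of getcontext() apart from the second one that follows a successful setcontext().

```c
#include <ucontext.h>

/*
 * Returns how many times getcontext() appeared to return: once
 * directly and once more after setcontext() resumed the saved state.
 * Returns -1 if either call fails.
 */
int
count_getcontext_returns(void)
{
	ucontext_t ctx;
	/*
	 * volatile keeps the counter in memory, so it is not rolled
	 * back together with the registers saved in ctx.
	 */
	volatile int returns = 0;

	if (getcontext(&ctx) == -1)
		return (-1);
	returns++;
	if (returns == 1) {
		/* On success this does not return; execution continues
		 * after the getcontext() call above. */
		setcontext(&ctx);
		return (-1);	/* reached only if setcontext() failed */
	}
	return (returns);
}
```

This works only because the frame of count_getcontext_returns() is still live when setcontext() is called; the counter should reach 2.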

From: Brian Demsky <bdemsky@uci.edu>
To: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-bugs@freebsd.org, freebsd-gnats-submit@freebsd.org
Subject: Re: misc/177624: Swapcontext can get compiled incorrectly
Date: Thu, 4 Apr 2013 09:43:06 -0700

 On Apr 4, 2013, at 8:16 AM, Bruce Evans wrote:
 
 > This is more than a little wrong.  When setcontext() succeeds, it
 > doesn't return here.  Then it acts like longjmp() and must restore all
 > the callee-saved registers to whatever they were when getcontext() was
 > called.  Otherwise, it must not clobber any callee-saved registers (then
 > it differs from longjmp(); longjmp() just can't fail).
 >
 > Now I don't see any bug here.  If the saved state is returned to, then
 > it is as if getcontext() returned, and the intermediately-saved %rbx
 > is correct (we will restore the original %rbx if we return).  If
 > setcontext() fails, then it should preserve all callee-saved registers.
 > In the tail-call case, we have already restored the original %rbx and
 > the failing setcontext() should preserve that.
 >
 > Bruce
 
 Take a look at setcontext:
 
 (gdb) disassemble setcontext
 Dump of assembler code for function setcontext:
 0x00007fff90262855 <setcontext+0>:      push   %rbx
 0x00007fff90262856 <setcontext+1>:      lea    0x38(%rdi),%rbx
 0x00007fff9026285a <setcontext+5>:      cmp    0x30(%rdi),%rbx
 0x00007fff9026285e <setcontext+9>:      je     0x7fff90262864 <setcontext+15>
 0x00007fff90262860 <setcontext+11>:     mov    %rbx,0x30(%rdi)
 0x00007fff90262864 <setcontext+15>:     mov    0x4(%rdi),%edi
 0x00007fff90262867 <setcontext+18>:     callq  0x7fff90262998 <sigsetmask>
 0x00007fff9026286c <setcontext+23>:     mov    %rbx,%rdi
 0x00007fff9026286f <setcontext+26>:     pop    %rbx
 0x00007fff90262870 <setcontext+27>:     jmpq   0x7fff90262875 <_setcontext>
 End of assembler dump.
 
 The stack from swapcontext had rbx and r14 popped after getcontext
 stored everything.  Now we push rbx and then later call sigsetmask.
 Those two actions guarantee that the memory locations where rbx and r14
 were on the stack have been overwritten.  When we later return to the
 saved context, it will start up swapcontext and pop the wrong values off
 of the stack for rbx and r14.
 
 Brian
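
The overwrite described above can be replayed on paper.  The following C sketch is a toy model only: it simulates a downward-growing stack as an array and mimics the pushes and pops from the two disassembly listings.  struct toy_cpu, toy_round_trip(), FAKE_RETURN_ADDR and the register values are all invented for illustration, and the non-tail-call branch is a hypothetical alternative code shape, not something taken from the report.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS	 16
#define FAKE_RETURN_ADDR 0xdeadULL	/* stands in for a return address */

struct toy_cpu {
	uint64_t stack[STACK_WORDS];
	int sp;			/* index of the top word; grows downward */
	uint64_t rbx, r14;
};

static void
push(struct toy_cpu *c, uint64_t v)
{
	assert(c->sp > 0);
	c->stack[--c->sp] = v;
}

static uint64_t
pop(struct toy_cpu *c)
{
	assert(c->sp < STACK_WORDS);
	return (c->stack[c->sp++]);
}

/*
 * Returns 1 if the caller's rbx and r14 survive being resumed from the
 * context saved inside swapcontext, 0 if they come back wrong.
 */
int
toy_round_trip(int tail_call)
{
	struct toy_cpu c = { .sp = STACK_WORDS, .rbx = 0x1111, .r14 = 0x2222 };
	const uint64_t caller_rbx = c.rbx, caller_r14 = c.r14;
	int saved_sp;

	/* swapcontext prologue: push %r14; push %rbx; sub $0x8,%rsp */
	push(&c, c.r14);
	push(&c, c.rbx);
	c.sp--;
	c.rbx = 0xaaaa;		/* rbx now holds ucp */
	c.r14 = 0xbbbb;		/* r14 now holds oucp */
	saved_sp = c.sp;	/* getcontext records this stack pointer */

	if (tail_call) {
		/* add $0x8,%rsp; pop %rbx; pop %r14; jmpq setcontext */
		c.sp++;
		c.rbx = pop(&c);
		c.r14 = pop(&c);
	} else {
		/* hypothetical: an ordinary callq keeps the frame in place */
		push(&c, FAKE_RETURN_ADDR);
	}
	/* setcontext: push %rbx; callq sigsetmask */
	push(&c, c.rbx);
	push(&c, FAKE_RETURN_ADDR);

	/* Later the saved context is resumed in swapcontext's epilogue. */
	c.sp = saved_sp;
	c.sp++;			/* add $0x8,%rsp */
	c.rbx = pop(&c);
	c.r14 = pop(&c);
	return (c.rbx == caller_rbx && c.r14 == caller_r14);
}
```

In this model toy_round_trip(1) returns 0, because the push and the call in setcontext land exactly on the two slots that held the saved registers, while toy_round_trip(0) returns 1 because the frame is never released.  The model ignores everything else that may touch the stack, so it shows the mechanism rather than proving anything about a real libc.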
 
 
 

From: Bruce Evans <brde@optusnet.com.au>
To: Brian Demsky <bdemsky@uci.edu>
Cc: Bruce Evans <brde@optusnet.com.au>, freebsd-bugs@FreeBSD.org,
        freebsd-gnats-submit@FreeBSD.org
Subject: Re: misc/177624: Swapcontext can get compiled incorrectly
Date: Fri, 5 Apr 2013 06:38:51 +1100 (EST)

 On Thu, 4 Apr 2013, Brian Demsky wrote:
 
 > Take a look at setcontext:
 >
 > (gdb) disassemble setcontext
 > Dump of assembler code for function setcontext:
 > 0x00007fff90262855 <setcontext+0>:      push   %rbx
 > 0x00007fff90262856 <setcontext+1>:      lea    0x38(%rdi),%rbx
 > 0x00007fff9026285a <setcontext+5>:      cmp    0x30(%rdi),%rbx
 > 0x00007fff9026285e <setcontext+9>:      je     0x7fff90262864 <setcontext+15>
 > 0x00007fff90262860 <setcontext+11>:     mov    %rbx,0x30(%rdi)
 > 0x00007fff90262864 <setcontext+15>:     mov    0x4(%rdi),%edi
 > 0x00007fff90262867 <setcontext+18>:     callq  0x7fff90262998 <sigsetmask>
 > 0x00007fff9026286c <setcontext+23>:     mov    %rbx,%rdi
 > 0x00007fff9026286f <setcontext+26>:     pop    %rbx
 > 0x00007fff90262870 <setcontext+27>:     jmpq   0x7fff90262875 <_setcontext>
 > End of assembler dump.
 >
 > The stack from swapcontext had rbx and r14 popped after getcontext
 > stored everything.  Now we push rbx and then later call sigsetmask.
 > Those two actions guarantee that the memory locations where rbx and r14
 > were on the stack have been overwritten.  When we later return to the
 > saved context, it will start up swapcontext and pop the wrong values off
 > of the stack for rbx and r14.
 
 Ah, it is not really rbx and r14, but rsp and the whole stack frame of
 swapcontext() that are mishandled.  Even returning from swapcontext()
 leaves the saved rsp pointing to garbage.  The stack frame could have
 had almost anything on it before it became invalid, but here it has mainly
 the saved rbx and r14 (not rbp; however, when compiled by clang on FreeBSD,
 it also has the saved rbp, and when compiled with -O0 it also has the
 local variable).
 
 Now I think swapcontext() can't be written in C, for the same reason that
 setjmp() can't be written in C -- the stack frame cannot be controlled in
 C, and if it has anything at all on it (even the return address requires
 special handling), then the stack pointer saved in the context becomes
 invalid when the function returns, or even earlier for tail calls and
 other optimizations.  Also, if the C function calls another function like
 the library getcontext(), then there are 2 stack frames below the saved
 stack pointer that are hard to control.  In FreeBSD on at least x86,
 getcontext() is a special non-automatically generated asm function to
 control this.  The automatically generated asm function would have
 only the return address in its stack frame, but even this is too much,
 and there is even more to control.  The comment about this is incomplete/
 partly wrong, but gives a good hint about the problem (*).
 
 This problem is avoided in setjmp() and longjmp() by not leaving anything
 on stack frames except the return address for setjmp() at the point where
 the stack pointer is saved.  The saved stack pointer is still technically
 invalid, since it points to the return address which goes away when
 setjmp() returns.  This is handled by not returning in the usual way in
 longjmp().  Instead, setjmp() saves the return address and longjmp()
 restores it; the restored stack pointer becomes valid only at the end
 of setjmp() when the top word on it is restored.  FreeBSD getcontext()
 uses a more hackish method (*).
 
 The saved stack pointer becomes more than technically invalid if the
 function that called setjmp() or swapcontext() returns.  Then the
 caller's frame becomes invalid.  Such returns are invalid.  This
 restriction is clearly documented for setjmp() but not for swapcontext()
 in FreeBSD man pages and old C99 and POSIX specs.  The C99 restriction
 is only that longjmp() must not be invoked with the saved state after
 the function that saved the state using setjmp() returns.  Compilers
 must know about this and not do optimizations (like tail calls?)
 that would invalidate the saved stack pointer before the function
 returns.
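
The C99 restriction mentioned above can be illustrated with a small sketch of the permitted pattern: longjmp() is invoked from a callee while the function that called setjmp() has not yet returned.  This is only an example under that assumption; escaped_via_longjmp() and bail_out() are hypothetical names.

```c
#include <setjmp.h>

static jmp_buf env;

/* Hypothetical helper that unwinds to the setjmp() point. */
static void
bail_out(void)
{
	longjmp(env, 1);
}

/*
 * Returns 1 if control came back through longjmp(), 0 if the
 * function ran straight through.
 */
int
escaped_via_longjmp(int should_jump)
{
	if (setjmp(env) != 0)
		return (1);	/* second return, reached via longjmp() */
	if (should_jump)
		bail_out();	/* our frame is still live here, so this is valid */
	return (0);
}
```

Calling longjmp(env, 1) after escaped_via_longjmp() had already returned would be the invalid case: the saved stack pointer would then refer to a frame that no longer exists.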
 
 (*) Here is the FreeBSD i386 getcontext():
 
 @ /*
 @  * This has to be magic to handle the multiple returns.
 
 Multiple = just 2.
 
 @  * Otherwise, the setcontext() syscall will return here and we'll
 @  * pop off the return address and go to the *setcontext* call.
 @  */
 
 Actually, we would pop off garbage and go to neverland, with a lower
 chance of going to setcontext than most places.
 
 @ 	.weak	_getcontext
 @ 	.set	_getcontext,__sys_getcontext
 @ 	.weak	getcontext
 @ 	.set	getcontext,__sys_getcontext
 @ ENTRY(__sys_getcontext)
 @ 	movl	(%esp),%ecx	/* save getcontext return address */
 @ 	mov	$SYS_getcontext,%eax
 @ 	KERNCALL
 @ 	jb	HIDENAME(cerror)
 
 When getcontext() fails, the return address on the stack is still valid
 and cerror depends on that.
 
 @ 	addl	$4,%esp		/* remove stale (setcontext) return address */
 
 Actually, remove the non-stale (our getcontext) return address if the
 return is not from setcontext, and remove stack garbage if the
 return is from setcontext.  The stack contents are not related to
 setcontext in either case.  In the garbage case, it started as the
 return address for another getcontext, but became garbage when that
 returned.
 
 @ 	jmp	*%ecx		/* restore return address */
 
 We want our return address in both cases.  We don't know which case
 applies and use same code for both.  The comment is imprecise.  We
 don't restore the return address.  What we do is return.
 
 The code should be changed to match the comment (don't adjust %esp,
 but actually restore the return address to it).  This method is used
 without comment by FreeBSD x86 longjmp().  Optimizing this for speed
 is unimportant, but this is probably faster as well as cleaner.  The
 jmp may have been faster 20 years ago, but now it unbalances call/return
 branch prediction.
 
 The libc swapcontext() can probably be fixed by copying
 setjmp()/longjmp().  Except setjmp()/longjmp() has fundamentally broken
 stack handling too, at least when setjmp() is actually sigsetjmp().
 Then it is necessary to restore the stack pointer atomically with
 restoring the signal mask, and this seems to be impossible without using
 a single syscall that does both.  The 2 different orders of restoring
 them give different problems.  In the above, you seem to have
 setcontext() in userland, with some signal unmasking, so I think it
 has the same races as setjmp()/longjmp().  Perhaps an atomic syscall
 for setcontext() is enough.
 
 Bruce

From: David Xu <davidxu@freebsd.org>
To: Bruce Evans <brde@optusnet.com.au>
Cc: Brian Demsky <bdemsky@uci.edu>, freebsd-bugs@FreeBSD.org,
        freebsd-gnats-submit@FreeBSD.org
Subject: Re: misc/177624: Swapcontext can get compiled incorrectly
Date: Sun, 07 Apr 2013 10:41:29 +0800

 On 2013/04/05 03:38, Bruce Evans wrote:
 > Now I think swapcontext() can't be written in C, for the same reason that
 > setjmp() can't be written in C -- the stack frame cannot be controlled in
 > C, and if it has anything at all on it (even the return address requires
 > special handling), then the stack pointer saved in the context becomes
 > invalid when the function returns, or even earlier for tail calls and
 > other optimizations.
 
 This reminds me that I cannot override swapcontext in libthr.  I had
 put a wrapper for swapcontext in libthr; I am considering removing it
 now ...
 
 

From: David Xu <davidxu@freebsd.org>
To: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-gnats-submit@FreeBSD.org, freebsd-bugs@FreeBSD.org,
        Brian Demsky <bdemsky@uci.edu>
Subject: Re: misc/177624: Swapcontext can get compiled incorrectly
Date: Wed, 17 Apr 2013 17:38:02 +0800

 I have worked out a patch:
 
 http://people.freebsd.org/~davidxu/patch/libc_swapcontext.diff
 
>Unformatted:
