From nobody@FreeBSD.org  Tue Jul 17 05:36:24 2012
Return-Path: <nobody@FreeBSD.org>
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id E3A88106566B
	for <freebsd-gnats-submit@FreeBSD.org>; Tue, 17 Jul 2012 05:36:24 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from red.freebsd.org (red.freebsd.org [IPv6:2001:4f8:fff6::22])
	by mx1.freebsd.org (Postfix) with ESMTP id C41A28FC0C
	for <freebsd-gnats-submit@FreeBSD.org>; Tue, 17 Jul 2012 05:36:24 +0000 (UTC)
Received: from red.freebsd.org (localhost [127.0.0.1])
	by red.freebsd.org (8.14.4/8.14.4) with ESMTP id q6H5aOUs053863
	for <freebsd-gnats-submit@FreeBSD.org>; Tue, 17 Jul 2012 05:36:24 GMT
	(envelope-from nobody@red.freebsd.org)
Received: (from nobody@localhost)
	by red.freebsd.org (8.14.4/8.14.4/Submit) id q6H5aOjI053829;
	Tue, 17 Jul 2012 05:36:24 GMT
	(envelope-from nobody)
Message-Id: <201207170536.q6H5aOjI053829@red.freebsd.org>
Date: Tue, 17 Jul 2012 05:36:24 GMT
From: Ed Alley <wea@llnl.gov>
To: freebsd-gnats-submit@FreeBSD.org
Subject: siginfo, si_code for fpe errors when error occurs using the SSE math processor
X-Send-Pr-Version: www-3.1
X-GNATS-Notify:

>Number:         169927
>Category:       amd64
>Synopsis:       siginfo, si_code for fpe errors when error occurs using the SSE math processor
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    kib
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          update
>Submitter-Id:   current-users
>Arrival-Date:   Tue Jul 17 05:40:10 UTC 2012
>Closed-Date:    Mon Jan 28 16:49:24 UTC 2013
>Last-Modified:  Mon Jan 28 16:49:24 UTC 2013
>Originator:     Ed Alley
>Release:        8.2-RELEASE amd64
>Organization:
>Environment:
System: FreeBSD epos.domos.org 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Mon Jun 25 00:07:01 PDT 2012 wea@epos.domos.org:/usr/src/sys/amd64/compile/EPOS.6 amd64
    machine is an Intel i5 x86-64

>Description:
     According to sigaction(2), setting SA_SIGINFO in sa_flags lets a
 handler for SIGFPE receive a siginfo_t structure, as described in
 siginfo(3). Within that structure, the si_code entry gives the error
 code as defined in the siginfo(3) man page. This is useful when
 debugging a large code, because one can retrieve not only the actual
 FPE error (divide by zero, overflow, etc.) but also the location of
 the error.
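
The mechanism described above can be sketched in user space as follows.
This is a minimal illustration, not code from the report: fpe_code_name(),
fpe_handler() and install_fpe_handler() are hypothetical names, while
sigaction(2), SA_SIGINFO and the FPE_* si_code constants are the standard
POSIX interfaces.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Map a SIGFPE si_code value to the name of the POSIX constant. */
static const char *
fpe_code_name(int code)
{
	switch (code) {
	case FPE_INTDIV: return ("FPE_INTDIV");	/* integer divide by zero */
	case FPE_INTOVF: return ("FPE_INTOVF");	/* integer overflow */
	case FPE_FLTDIV: return ("FPE_FLTDIV");	/* FP divide by zero */
	case FPE_FLTOVF: return ("FPE_FLTOVF");	/* FP overflow */
	case FPE_FLTUND: return ("FPE_FLTUND");	/* FP underflow */
	case FPE_FLTRES: return ("FPE_FLTRES");	/* FP inexact result */
	case FPE_FLTINV: return ("FPE_FLTINV");	/* invalid FP operation */
	case FPE_FLTSUB: return ("FPE_FLTSUB");	/* subscript out of range */
	default:	 return ("unknown");	/* e.g. the 0 seen with SSE */
	}
}

/* Last si_code seen by the handler; -1 until a SIGFPE arrives. */
static volatile sig_atomic_t last_fpe_code = -1;

/*
 * Three-argument handler selected by SA_SIGINFO.  info->si_code holds
 * the decoded error and info->si_addr the faulting instruction address.
 */
static void
fpe_handler(int sig, siginfo_t *info, void *ucontext)
{
	(void)sig;
	(void)ucontext;
	last_fpe_code = info->si_code;
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int
install_fpe_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = fpe_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return (sigaction(SIGFPE, &sa, NULL));
}
```

After install_fpe_handler() succeeds, a program can pass last_fpe_code to
fpe_code_name() once it has escaped from the fault. A handler for a real
FP trap should not simply return, since the faulting instruction may be
retried.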

     For FPE errors using the x87 everything works fine, but when
SSE is used for floating-point calculations the si_code that
is returned is always zero. I have a fix for this, which I have
included as a patch for FreeBSD 8.2-RELEASE. I have been applying
this fix since I got my 64-bit box, starting with FreeBSD 7.x. I had
not sent this patch in because I had assumed that the problem would get
fixed in later releases. However, that has not been the case, so
here is the patch (updated for 8.2) that I have been using.

  To apply the patch, just cd into /usr/src/sys/amd64 and apply the
patch. It will operate on two files in the directory amd64:
trap.c and fpu.c.

  In trap.c a single line will be replaced as can be seen in the patch.
This line occurs in the user trap switch for the case: T_XMMFLT. The
line ucode = 0; is replaced with ucode = fputrap(); This then will call
the fputrap() code similarly to the T_ARITHTRAP case.

  The function fputrap() is found in the file fpu.c, which is where the
rest of the patch operates. In fputrap() I added additional
code to access the mxcsr status bits. These are then ORed into
the status word before the index into fpetable[] is calculated.
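
The index calculation described above can be sketched in isolation as
follows. This is a user-space illustration rather than the kernel code:
fpetable_index() and the two macro names are hypothetical, while the
masking expression is the one fputrap() uses in the patch below. Note
that in this scheme the SSE flags are filtered through the x87 control
word, not through the mask bits held in the mxcsr itself.

```c
#include <assert.h>
#include <stdint.h>

#define	FP_EXC_FLAGS	0x3f	/* IE, DE, ZE, OE, UE, PE (same bits in both) */
#define	X87_STK_FLAG	0x40	/* x87 stack-fault flag in the status word */

/*
 * Merge the mxcsr exception flags into the x87 status word, then keep
 * only the exceptions that the x87 control word leaves unmasked, plus
 * the stack-fault flag.  The result is the index into fpetable[].
 */
static unsigned
fpetable_index(uint16_t control, uint16_t status, uint32_t mxcsr)
{
	status |= (uint16_t)(mxcsr & FP_EXC_FLAGS);
	return ((unsigned)(status & ((~control & FP_EXC_FLAGS) | X87_STK_FLAG)));
}
```

For example, with a control word of 0x037b (divide-by-zero unmasked) and
the ZE flag set only in the mxcsr, the index is 0x04; with the usual
control word 0x037f (everything masked) the same flags give index 0.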

  Following the x87 case, before I return, I zero out the error flags
in the mxcsr register. I have not found an instruction equivalent to
fnclex (which zeros out the x87 error flags) for zeroing out the mxcsr
error flags, so I have resorted to ANDing them out of a memory copy of
the mxcsr that I stored earlier and then loading it back into the
register. Let me know if this is useful.

  With these changes in place, my kernel now handles SIMD fpe errors
(trap code 29) and returns the mxcsr decoded error in the si_code entry of the
siginfo_t structure.

>How-To-Repeat:

>Fix:



Patch attached with submission follows:

diff -Naur amd64-orig/fpu.c amd64/fpu.c
--- amd64-orig/fpu.c	2012-06-24 18:59:36.000000000 -0700
+++ amd64/fpu.c	2012-07-16 22:07:19.000000000 -0700
@@ -72,7 +72,8 @@
 #define	fnstsw(addr)		__asm __volatile("fnstsw %0" : "=am" (*(addr)))
 #define	fxrstor(addr)		__asm __volatile("fxrstor %0" : : "m" (*(addr)))
 #define	fxsave(addr)		__asm __volatile("fxsave %0" : "=m" (*(addr)))
-#define	ldmxcsr(csr)		__asm __volatile("ldmxcsr %0" : : "m" (csr))
+#define	ldmxcsr(addr)		__asm __volatile("ldmxcsr %0" : : "m" (*(addr)))
+#define stmxcsr(addr)		__asm __volatile("stmxcsr %0" : "=m" (*(addr)))
 #define	start_emulating()	__asm __volatile( \
 				    "smsw %%ax; orb %0,%%al; lmsw %%ax" \
 				    : : "n" (CR0_TS) : "ax")
@@ -87,7 +88,8 @@
 void	fnstsw(caddr_t addr);
 void	fxsave(caddr_t addr);
 void	fxrstor(caddr_t addr);
-void	ldmxcsr(u_int csr);
+void	ldmxcsr(caddr_t addr);
+void	stmxcsr(caddr_t addr);
 void	start_emulating(void);
 void	stop_emulating(void);
 
@@ -95,6 +97,7 @@
 
 #define GET_FPU_CW(thread) ((thread)->td_pcb->pcb_save->sv_env.en_cw)
 #define GET_FPU_SW(thread) ((thread)->td_pcb->pcb_save->sv_env.en_sw)
+#define GET_MXCSR(thread)  ((thread)->td_pcb->pcb_save->sv_env.en_mxcsr)
 
 typedef u_char bool_t;
 
@@ -126,7 +129,7 @@
 	control = __INITIAL_FPUCW__;
 	fldcw(control);
 	mxcsr = __INITIAL_MXCSR__;
-	ldmxcsr(mxcsr);
+	ldmxcsr(&mxcsr);
 	if (PCPU_GET(cpuid) == 0) {
 		fxsave(&fpu_initialstate);
 		if (fpu_initialstate.sv_env.en_mxcsr_mask)
@@ -356,6 +359,7 @@
 fputrap()
 {
 	u_short control, status;
+        u_int mxcsr;
 
 	critical_enter();
 
@@ -367,13 +371,18 @@
 	if (PCPU_GET(fpcurthread) != curthread) {
 		control = GET_FPU_CW(curthread);
 		status = GET_FPU_SW(curthread);
+                mxcsr   = GET_MXCSR(curthread);
+                status |= (mxcsr & 0x3f);
 	} else {
 		fnstcw(&control);
 		fnstsw(&status);
+                stmxcsr(&mxcsr);
+                status |= (mxcsr & 0x3f);
+                fnclex();        /* Clear the x87 error bits */
+                mxcsr &= ~0x3f;  /* Clear the mxcsr error bits */
+                ldmxcsr(&mxcsr);
 	}
 
-	if (PCPU_GET(fpcurthread) == curthread)
-		fnclex();
 	critical_exit();
 	return (fpetable[status & ((~control & 0x3f) | 0x40)]);
 }
diff -Naur amd64-orig/trap.c amd64/trap.c
--- amd64-orig/trap.c	2012-06-24 23:58:01.000000000 -0700
+++ amd64/trap.c	2012-07-16 21:51:56.000000000 -0700
@@ -435,7 +435,9 @@
 			break;
 
 		case T_XMMFLT:		/* SIMD floating-point exception */
-			ucode = 0; /* XXX */
+                        ucode = fputrap();
+                        if (ucode == -1)
+                             goto userout;
 			i = SIGFPE;
 			break;
 		}


>Release-Note:
>Audit-Trail:

From: Bruce Evans <brde@optusnet.com.au>
To: Ed Alley <wea@llnl.gov>
Cc: freebsd-gnats-submit@freebsd.org, freebsd-amd64@freebsd.org
Subject: Re: amd64/169927: siginfo, si_code for fpe errors when error occurs
 using the SSE math processor
Date: Tue, 17 Jul 2012 23:25:55 +1000 (EST)

 On Tue, 17 Jul 2012, Ed Alley wrote:
 
 >> Description:
 >     According to sigaction(2) by choosing SA_SIGINFO as one of the sa_flags
 > one can catch sigfpe signals. What is returned to the signal handler
 > for the sigfpe is a structure defined in siginfo(3). Within that
 > structure: the si_code entry gives the error code as defined in siginfo(3)
 > man page. This is useful when de-bugging a large code,
 > because one can retrieve not only the actual fpe error: divide by zero,
 > overflow or etc, but also the location of the error is also returned.
 
 It's interesting that anyone even enables SIGFPE for FP exceptions
 (I tried to keep them as the default for critical exceptions, but
 was OBE.  SIGFPE now normally means (!F)PE for integer division by 0.)
 
 >     For FPE errors using the x87 everything works fine, but when
 > the SSE is used for floating point calculations the si_code that
 > is returned is always zero. I have a fix for this which I have
 > included as a patch for FreeBSD 8.2 release. I have been applying
 > this fix since I got my 64-bit box since FreeBSD 7.x. I have not
 > sent this patch in, since I had assumed that the problem would get
 > fixed in later releases. However, that has not been the case so
 > here is the patch (upgraded to version 8.2) that I have been using.
 >
 >  To apply the patch, just cd into /usr/src/sys/amd64 and apply the
 > patch. It will operate on two files in the directory amd64:
 > trap.c and fpu.c.
 
 i386 has the same bug.
 
 >  In trap.c a single line will be replaced as can be seen in the patch.
 > This line occurs in the user trap switch for the case: T_XMMFLT. The
 > line ucode = 0; is replaced with ucode = fputrap(); This then will call
 > the fputrap() code similarly to the T_ARITHTRAP case.
 
 I didn't know that it had a different exception number.
 
 >  The process fputrap() is found in file fpu.c which is where the
 > rest of the patch operates. In function fputrap() I added additional
 > code to access the mxcsr status bits. These are then ORed into
 > the status code before the argument to the fpetable[] is calculated.
 >
 >  Following the x87 case, before I return, I zero out the error flags
 > in the mxcsr register. Let me know if this is useful, also I have
 > not found an equivalent instruction to the fnclex (that zeros out
 > the x87 error flags) for easily zeroing out the mxcsr error flags,
 > so I have resorted to anding them out of a memory copy of the mxcsr
 > that I loaded earlier and then storing it back into the register.
 
 This shouldn't be done.  The fnclex is both a bug and a feature (mostly
 a bug).  The corresponding clearing of the mxcsr bits is just a bug.
 
 Reason for the fnclex: it clears the fault condition, so that buggy
 SIGFPE handlers can return and have the main code not immediately
 fault again.  But the behaviour is still undefined.  In particular,
 the i387 FP stack may be corrupt (unless the SIGFPE handler actually
 understands FP and has fixed up the stack, but if it understands FP
 then it won't depend on the fnclex, and it won't simply return).  It
 is better for the fault to repeat endlessly.  Most faults including
 SIGFPE for integer division by 0 repeat endlessly, which tells you
 that you have a buggy fault handler that returns.
 
 History of the fnclex and of other FP bugs in signal handling:
 - in FreeBSD-[1-4], the i387 exception handling was:
    - save the status word in the "saved exception status word" in the PCB
    - fnclex
    - starting in about FreeBSD-4, encode the saved status word in the
      signal code.  This loses some info.
    - call the signal handler with the same FP context as the normal process,
      except for the saved exception status word.  This was bad, but it was
      easy for a SIGFPE handler to fix up the state.  Fixing up the status
      word was most complicated.  After about FreeBSD-4, the signal code
      might have been enough (convert it back to a status word).  If not,
      the saved exception status word was recoverable via the sigcontext
      pointer.
    - return to the normal process with the same FP context as left by the
      signal handler.   This was OK for SIGFPE handlers (they could fix up
      everything except the status word without using the complications of
      the sigcontext pointer or the different unportabilities given by the
      signal code and other siginfo things (siginfo was mostly unavailable
      then), but very bad for non-SIGFPE signal handlers, since any FP in
      the signal handler would corrupt the FP context for the normal process.
    - gdb never understood the sigcontext pointer, but it used to understand
      the saved exception status word after I made it do this in FreeBSD 1 or
      2.  It printed status for both the normal status word and the saved one.
 - most of this was broken in FreeBSD-5.  Now, the i387 exception handling
    is (and similarly for amd64):
    - save the status word in a local variable.  It never reaches the PCB.
    - fnclex, as before
    - encode the old status word in the signal code, as before
    - call the signal handler with an independent context.  This is bad,
      but it makes it harder for a SIGFPE handler to fix up the state.
      Especially the status word, since that was clobbered after not
      saving it except in the local variable.  The handler can retrieve
      the FP state using either the sigcontext pointer or siginfo, but
      that gives the clobbered status word.  Now the signal code gives
      the only trace of the exception bits in the old status word.
    - switch back to the normal FP context after returning from the signal
      handler.
    - current gdb no longer understands the saved exception status word,
      on old kernels that have it.  It is much further from understanding
      the full switched state.  In FreeBSD-[1-4], you could easily see the
      normal process's FP state in a signal handler since it wasn't switched,
      and you could fix it up manually by editing it.  Now you have the same
      complications as a SIGFPE handler that does fixups -- it is hard to
      even see this state.  I think you can still do it manually by looking
      at it in the stack.  Perhaps this can be done using gdb macros.  An
      average signal handler doesn't declare sigcontext or siginfo pointers
      (or even the signal code), so to debug it you might need to add the
      declarations and recompile, or possibly fake them using macros.  I
      haven't actually tried this.
 
 On why clobbering the mxcsr status is even less useful:
 - for the i387, the trap happens on the next non-control FP instruction
    after the one that caused the exception.  Not repeating the fault on
    this next (hopefully non-problematic) instruction almost makes sense.
 - for SSE, the trap happens on the one that causes the exception.  It's
    good to repeat this fault.
 
 >  With these changes in place, my kernel now handles SIMD fpe errors
 > (trap code 29) and returns the mxcsr decoded error in the si_code entry of the
 > siginfo_t structure.
 
 > diff -Naur amd64-orig/fpu.c amd64/fpu.c
 > --- amd64-orig/fpu.c	2012-06-24 18:59:36.000000000 -0700
 > +++ amd64/fpu.c	2012-07-16 22:07:19.000000000 -0700
 > ...
 > @@ -356,6 +359,7 @@
 > fputrap()
 > {
 > 	u_short control, status;
 > +        u_int mxcsr;
 >
 > 	critical_enter();
 >
 > @@ -367,13 +371,18 @@
 > 	if (PCPU_GET(fpcurthread) != curthread) {
 > 		control = GET_FPU_CW(curthread);
 > 		status = GET_FPU_SW(curthread);
 > +                mxcsr   = GET_MXCSR(curthread);
 > +                status |= (mxcsr & 0x3f);
 
 Lots of tab and other whitespace lossage.
 
 This change and the one to call fputrap() for T_XMMFLT, and similarly
 for i386, might be enough.
 
 > 	} else {
 > 		fnstcw(&control);
 > 		fnstsw(&status);
 > +                stmxcsr(&mxcsr);
 > +                status |= (mxcsr & 0x3f);
 > +                fnclex();        /* Clear the x87 error bits */
 > +                mxcsr &= ~0x3f;  /* Clear the mxcsr error bits */
 > +                ldmxcsr(&mxcsr);
 
 Best not to touch either.
 
 > 	}
 >
 > -	if (PCPU_GET(fpcurthread) == curthread)
 > -		fnclex();
 
 Try removing this too.
 
 > 	critical_exit();
 > 	return (fpetable[status & ((~control & 0x3f) | 0x40)]);
 > }
 
 Here are some of my old changes to npxtrap().  They mainly ifdef out the
 fnclex:
 
 % Index: npx.c
 % ===================================================================
 % RCS file: /home/ncvs/src/sys/i386/isa/npx.c,v
 % retrieving revision 1.152
 % diff -u -2 -r1.152 npx.c
 % --- npx.c	19 Jun 2004 22:24:16 -0000	1.152
 % +++ npx.c	22 Apr 2006 11:58:31 -0000
 % @@ -578,5 +624,5 @@
 %   */
 %  static char fpetable[128] = {
 % -	0,
 % +	-1,		/*  0 - no unmasked exception (probably bogus IRQ13) */
 %  	FPE_FLTINV,	/*  1 - INV */
 %  	FPE_FLTUND,	/*  2 - DNML */
 % @@ -642,5 +688,5 @@
 %  	FPE_FLTDIV,	/* 3E - DNML | DZ | OFL | UFL | IMP */
 %  	FPE_FLTINV,	/* 3F - INV | DNML | DZ | OFL | UFL | IMP */
 % -	FPE_FLTSUB,	/* 40 - STK */
 % +	-1,		/* 40 - STK, but no unmasked exception so no trap */
 %  	FPE_FLTSUB,	/* 41 - INV | STK */
 %  	FPE_FLTUND,	/* 42 - DNML | STK */
 
 These -1's are supposed to give a unique error for cases that shouldn't happen.
 
 % @@ -751,7 +806,16 @@
 %  	}
 % 
 % +	/* Ignore some spurious traps. */
 % +	if ((status & ~control & 0x3f) == 0) {
 % +		intr_restore(saveintr);
 % +		return (-1);
 % +	}
 
 IIRC, this is mainly for old IRQ13 exception handling.
 
 % +
 % +#if 0
 % +	/* XXX this clobbers the status. */
 %  	if (PCPU_GET(fpcurthread) == curthread)
 %  		fnclex();
 % -	intr_restore(savecrit);
 % +#endif
 % +	intr_restore(saveintr);
 
 Kill the fnclex.  Unrelated renaming of savecrit (savecrit is a bogus name,
 since there are no critical sections here).
 
 %  	return (fpetable[status & ((~control & 0x3f) | 0x40)]);
 %  }
 
 The only effect that I noticed from killing the fnclex was that some of
 my old test programs with buggy SIGFPE handling appear to hang (they
 actually fault endlessly).  They were depending on the signal handler to
 return and then fixed up the FP state after it returned.  The signal
 handler was too standards-conforming and just set a flag of type
 sig_atomic_t.  The quick fix was to use a FreeBSD-4-compat signal handler
 so that the signal handler can corrupt the normal process state; then
 just add an fnclex to it (it is now very non-standards-conforming).
 Configuring to use a FreeBSD-4-compat signal handler in FreeBSD-5+ is
 nontrivial, but I force this configuration anyway for portability.
 With FreeBSD-5+ signal handlers, the signal handler would need an enormous
 amount of code to apply the fnclex to the normal process state.
 
 Note that with SSE, everyone has had the endless-faulting behaviour
 for most FP unmasked exceptions on amd64, since T_XMMFLT wasn't
 connected to fputrap() and fputrap() didn't clobber the status anyway.
 So everyone must have gotten used to this.  Except, hardly anyone
 unmasks FP exceptions.  I only unmask them for debugging.
 
 Bruce
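
The point above that a SIGFPE handler should not simply return can be
illustrated with a handler that escapes through siglongjmp(3) instead.
This is a sketch under stated assumptions: run_guarded(), fake_fault()
and no_fault() are hypothetical names, and raise(SIGFPE) stands in for a
real unmasked FP exception so that the example behaves the same on any
platform. It does not repair the FP state, which a real handler would
still have to consider.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf fpe_escape;

/* Leave the handler by jumping out, never by returning to the fault. */
static void
fpe_escape_handler(int sig)
{
	(void)sig;
	siglongjmp(fpe_escape, 1);
}

/*
 * Run fn() with SIGFPE trapped.  Returns 1 if fn() raised SIGFPE,
 * 0 if it completed normally, -1 if the handler could not be installed.
 */
static int
run_guarded(void (*fn)(void))
{
	struct sigaction sa, old;
	int caught;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = fpe_escape_handler;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGFPE, &sa, &old) != 0)
		return (-1);

	/* savemask = 1 so that the signal mask is restored on the jump. */
	if (sigsetjmp(fpe_escape, 1) == 0) {
		fn();
		caught = 0;
	} else {
		caught = 1;
	}

	sigaction(SIGFPE, &old, NULL);
	return (caught);
}

/* Stand-in for code that triggers an unmasked FP exception. */
static void
fake_fault(void)
{
	raise(SIGFPE);
}

/* Code that completes without any exception. */
static void
no_fault(void)
{
}
```

Because control never returns to the faulting instruction, this pattern
does not depend on the kernel having cleared the exception flags.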

From: Konstantin Belousov <kostikbel@gmail.com>
To: Bruce Evans <brde@optusnet.com.au>
Cc: Ed Alley <wea@llnl.gov>, freebsd-gnats-submit@freebsd.org,
        freebsd-amd64@freebsd.org
Subject: Re: amd64/169927: siginfo, si_code for fpe errors when error occurs using the SSE math processor
Date: Tue, 17 Jul 2012 16:44:26 +0300

 --IJFRpmOek+ZRSQoz
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline
 Content-Transfer-Encoding: quoted-printable
 
 It was on my TODO list for a long time. Let's handle amd64 first, both for
 native and compat32.
 
 I think the following should be a somewhat better variant. I do leave
 the fnclex there for x87.
 
 diff --git a/sys/amd64/amd64/fpu.c b/sys/amd64/amd64/fpu.c
 index a7812b7..34cf8d4 100644
 --- a/sys/amd64/amd64/fpu.c
 +++ b/sys/amd64/amd64/fpu.c
 @@ -73,6 +73,7 @@ __FBSDID("$FreeBSD$");
  #define	fxrstor(addr)		__asm __volatile("fxrstor %0" : : "m" (*(addr)))
  #define	fxsave(addr)		__asm __volatile("fxsave %0" : "=m" (*(addr)))
  #define	ldmxcsr(csr)		__asm __volatile("ldmxcsr %0" : : "m" (csr))
 +#define	stmxcsr(addr)		__asm __volatile("stmxcsr %0" : : "m" (*(addr)))
  
  static __inline void
  xrstor(char *addr, uint64_t mask)
 @@ -105,6 +106,7 @@ void	fnstsw(caddr_t addr);
  void	fxsave(caddr_t addr);
  void	fxrstor(caddr_t addr);
  void	ldmxcsr(u_int csr);
 +void	stmxcsr(u_int csr);
  void	xrstor(char *addr, uint64_t mask);
  void	xsave(char *addr, uint64_t mask);
  
 @@ -113,9 +115,6 @@ void	xsave(char *addr, uint64_t mask);
  #define	start_emulating()	load_cr0(rcr0() | CR0_TS)
  #define	stop_emulating()	clts()
  
 -#define GET_FPU_CW(thread) ((thread)->td_pcb->pcb_save->sv_env.en_cw)
 -#define GET_FPU_SW(thread) ((thread)->td_pcb->pcb_save->sv_env.en_sw)
 -
  CTASSERT(sizeof(struct savefpu) == 512);
  CTASSERT(sizeof(struct xstate_hdr) == 64);
  CTASSERT(sizeof(struct savefpu_ymm) == 832);
 @@ -514,11 +513,15 @@ static char fpetable[128] = {
  };
  
  /*
 - * Preserve the FP status word, clear FP exceptions, then generate a SIGFPE.
 + * Preserve the FP status word, clear FP exceptions for x87, then
 + * generate a SIGFPE.
 + *
 + * Clearing exceptions was necessary mainly to avoid IRQ13 bugs and is
 + * engraved in our i386 ABI.  We now depend on longjmp() restoring a
 + * usable state.  Restoring the state or examining it might fail if we
 + * didn't clear exceptions.
   *
 - * Clearing exceptions is necessary mainly to avoid IRQ13 bugs.  We now
 - * depend on longjmp() restoring a usable state.  Restoring the state
 - * or examining it might fail if we didn't clear exceptions.
 + * For SSE exceptions, the exceptions are not cleared.
   *
   * The error code chosen will be one of the FPE_... macros. It will be
   * sent as the second argument to old BSD-style signal handlers and as
 @@ -531,8 +534,9 @@ static char fpetable[128] =3D {
   * solution for signals other than SIGFPE.
   */
  int
 -fputrap()
 +fputrap_x87(void)
  {
 +	struct savefpu *pcb_save;
  	u_short control, status;
  
  	critical_enter();
 @@ -543,19 +547,43 @@ fputrap()
  	 * wherever they are.
  	 */
  	if (PCPU_GET(fpcurthread) != curthread) {
 -		control = GET_FPU_CW(curthread);
 -		status = GET_FPU_SW(curthread);
 +		pcb_save = curthread->td_pcb->pcb_save;
 +		control = pcb_save->sv_env.en_cw;
 +		status = pcb_save->sv_env.en_sw;
  	} else {
  		fnstcw(&control);
  		fnstsw(&status);
 +		fnclex();
  	}
  
 -	if (PCPU_GET(fpcurthread) == curthread)
 -		fnclex();
  	critical_exit();
  	return (fpetable[status & ((~control & 0x3f) | 0x40)]);
  }
  
 +int
 +fputrap_sse(void)
 +{
 +	u_int mxcsr;
 +	u_short control, status;
 +
 +	critical_enter();
 +
 +	/*
 +	 * Comparing with the x87 #MF handler, we do not clear
 +	 * exceptions from the mxcsr.
 +	 */
 +	if (PCPU_GET(fpcurthread) != curthread)
 +		mxcsr = curthread->td_pcb->pcb_save->sv_env.en_mxcsr;
 +	else
 +		stmxcsr(&mxcsr);
 +
 +	critical_exit();
 +
 +	status = mxcsr & 0x3f;
 +	control = (mxcsr >> 16) & 0x3f;
 +	return (fpetable[status & (~control | 0x40)]);
 +}
 +
  /*
   * Implement device not available (DNA) exception
   *
 diff --git a/sys/amd64/amd64/trap.c b/sys/amd64/amd64/trap.c
 index 75e15e0..57d1cc2 100644
 --- a/sys/amd64/amd64/trap.c
 +++ b/sys/amd64/amd64/trap.c
 @@ -328,7 +328,7 @@ trap(struct trapframe *frame)
  			break;
  
  		case T_ARITHTRAP:	/* arithmetic trap */
 -			ucode = fputrap();
 +			ucode = fputrap_x87();
  			if (ucode == -1)
  				goto userout;
  			i =3D SIGFPE;
 @@ -442,7 +442,9 @@ trap(struct trapframe *frame)
  			break;
  
  		case T_XMMFLT:		/* SIMD floating-point exception */
 -			ucode = 0; /* XXX */
 +			ucode = fputrap_sse();
 +			if (ucode == -1)
 +				goto userout;
  			i =3D SIGFPE;
  			break;
  		}
 diff --git a/sys/amd64/include/fpu.h b/sys/amd64/include/fpu.h
 index 98a016b..7d0f0ea 100644
 --- a/sys/amd64/include/fpu.h
 +++ b/sys/amd64/include/fpu.h
 @@ -62,7 +62,8 @@ int	fpusetregs(struct thread *td, struct savefpu *addr,
  	    char *xfpustate, size_t xfpustate_size);
  int	fpusetxstate(struct thread *td, char *xfpustate,
  	    size_t xfpustate_size);
 -int	fputrap(void);
 +int	fputrap_sse(void);
 +int	fputrap_x87(void);
  void	fpuuserinited(struct thread *td);
  struct fpu_kern_ctx *fpu_kern_alloc_ctx(u_int flags);
  void	fpu_kern_free_ctx(struct fpu_kern_ctx *ctx);
 
 --IJFRpmOek+ZRSQoz
 Content-Type: application/pgp-signature
 Content-Disposition: inline
 
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (FreeBSD)
 
 iEYEARECAAYFAlAFbDkACgkQC3+MBN1Mb4gPfgCeI7OF9u6tfuHgPoVp/bUfG1kc
 iksAn1q9GtduJNGtll0dZd2X336LRijE
 =kkdY
 -----END PGP SIGNATURE-----
 
 --IJFRpmOek+ZRSQoz--
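
For reference when reading fputrap_sse() above: in the MXCSR layout
documented by Intel, the exception flags occupy bits 0-5 and the
corresponding mask bits occupy bits 7-12, where a set mask bit means the
exception is masked. Under that layout, the pending unmasked exceptions
would be computed as in the sketch below, so the shift by 16 in the patch
above may be worth double-checking against it. The function and macro
names here are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

#define	MXCSR_EXC_FLAGS		0x3fu	/* bits 0-5: IE, DE, ZE, OE, UE, PE */
#define	MXCSR_MASK_SHIFT	7	/* bits 7-12: IM, DM, ZM, OM, UM, PM */

/*
 * Return the exception flags that are set and not masked.  Inverting
 * the register turns "masked" bits into 0, so after the shift only the
 * unmasked exceptions line up with the flag bits.
 */
static uint32_t
mxcsr_unmasked_pending(uint32_t mxcsr)
{
	return (mxcsr & (~mxcsr >> MXCSR_MASK_SHIFT) & MXCSR_EXC_FLAGS);
}
```

With the power-on value 0x1f80 (all exceptions masked) the helper returns
0 whatever flags are set; clearing ZM (bit 9) lets a set ZE flag through,
giving 0x04, which is the kind of value fpetable[] expects as an index.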

From: Mark Linimon <linimon@lonesome.com>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: amd64/169927: siginfo, si_code for fpe errors when error
 occurs using the SSE math processor
Date: Tue, 17 Jul 2012 16:49:45 -0500

 ----- Forwarded message from Konstantin Belousov <kostikbel@gmail.com> -----
 
 Date: Tue, 17 Jul 2012 20:09:15 +0300
 From: Konstantin Belousov <kostikbel@gmail.com>
 To: Bruce Evans <brde@optusnet.com.au>
 Cc: freebsd-amd64@freebsd.org
 Subject: Re: amd64/169927: siginfo,
 	si_code for fpe errors when error occurs using the SSE math
 	processor
 User-Agent: Mutt/1.4.2.3i
 
 On Wed, Jul 18, 2012 at 02:03:58AM +1000, Bruce Evans wrote:
 > Apart from doing the bogus fnclex for T_XMMFLT and the delayed effect of
 > i387 status bits, merging or not merging the statuses makes little
 > difference, since if a status bit is set and is not masked according
 > to its control word, then it will generate a trap soon if it didn't
 > generate the current one.
 
 The trap number is available to SA_SIGINFO handlers through the
 si_trapno member of siginfo_t. I think this is the final argument for
 having separate fputrap_{x87,sse} functions.
 
 For amd64, SSE hardware is FPU, so I do not see much wrong with the name.
 
 I changed fputrap_sse() according to your suggestion.
 
 diff --git a/sys/amd64/amd64/fpu.c b/sys/amd64/amd64/fpu.c
 index a7812b7..356b3ac 100644
 --- a/sys/amd64/amd64/fpu.c
 +++ b/sys/amd64/amd64/fpu.c
 @@ -73,6 +73,7 @@ __FBSDID("$FreeBSD$");
  #define	fxrstor(addr)		__asm __volatile("fxrstor %0" : : "m" (*(addr)))
  #define	fxsave(addr)		__asm __volatile("fxsave %0" : "=m" (*(addr)))
  #define	ldmxcsr(csr)		__asm __volatile("ldmxcsr %0" : : "m" (csr))
 +#define	stmxcsr(addr)		__asm __volatile("stmxcsr %0" : : "m" (*(addr)))
  
  static __inline void
  xrstor(char *addr, uint64_t mask)
 @@ -105,6 +106,7 @@ void	fnstsw(caddr_t addr);
  void	fxsave(caddr_t addr);
  void	fxrstor(caddr_t addr);
  void	ldmxcsr(u_int csr);
 +void	stmxcsr(u_int csr);
  void	xrstor(char *addr, uint64_t mask);
  void	xsave(char *addr, uint64_t mask);
  
 @@ -113,9 +115,6 @@ void	xsave(char *addr, uint64_t mask);
  #define	start_emulating()	load_cr0(rcr0() | CR0_TS)
  #define	stop_emulating()	clts()
  
 -#define GET_FPU_CW(thread) ((thread)->td_pcb->pcb_save->sv_env.en_cw)
 -#define GET_FPU_SW(thread) ((thread)->td_pcb->pcb_save->sv_env.en_sw)
 -
  CTASSERT(sizeof(struct savefpu) == 512);
  CTASSERT(sizeof(struct xstate_hdr) == 64);
  CTASSERT(sizeof(struct savefpu_ymm) == 832);
 @@ -514,11 +513,15 @@ static char fpetable[128] = {
  };
  
  /*
 - * Preserve the FP status word, clear FP exceptions, then generate a SIGFPE.
 + * Preserve the FP status word, clear FP exceptions for x87, then
 + * generate a SIGFPE.
 + *
 + * Clearing exceptions was necessary mainly to avoid IRQ13 bugs and is
 + * engraved in our i386 ABI.  We now depend on longjmp() restoring a
 + * usable state.  Restoring the state or examining it might fail if we
 + * didn't clear exceptions.
   *
 - * Clearing exceptions is necessary mainly to avoid IRQ13 bugs.  We now
 - * depend on longjmp() restoring a usable state.  Restoring the state
 - * or examining it might fail if we didn't clear exceptions.
 + * For SSE exceptions, the exceptions are not cleared.
   *
   * The error code chosen will be one of the FPE_... macros. It will be
   * sent as the second argument to old BSD-style signal handlers and as
 @@ -531,8 +534,9 @@ static char fpetable[128] = {
   * solution for signals other than SIGFPE.
   */
  int
 -fputrap()
 +fputrap_x87(void)
  {
 +	struct savefpu *pcb_save;
  	u_short control, status;
  
  	critical_enter();
 @@ -543,19 +547,40 @@ fputrap()
  	 * wherever they are.
  	 */
  	if (PCPU_GET(fpcurthread) != curthread) {
 -		control = GET_FPU_CW(curthread);
 -		status = GET_FPU_SW(curthread);
 +		pcb_save = curthread->td_pcb->pcb_save;
 +		control = pcb_save->sv_env.en_cw;
 +		status = pcb_save->sv_env.en_sw;
  	} else {
  		fnstcw(&control);
  		fnstsw(&status);
 +		fnclex();
  	}
  
 -	if (PCPU_GET(fpcurthread) == curthread)
 -		fnclex();
  	critical_exit();
  	return (fpetable[status & ((~control & 0x3f) | 0x40)]);
  }
  
 +int
 +fputrap_sse(void)
 +{
 +	u_int mxcsr;
 +
 +	critical_enter();
 +
 +	/*
 +	 * Comparing with the x87 #MF handler, we do not clear
 +	 * exceptions from the mxcsr.
 +	 */
 +	if (PCPU_GET(fpcurthread) != curthread)
 +		mxcsr = curthread->td_pcb->pcb_save->sv_env.en_mxcsr;
 +	else
 +		stmxcsr(&mxcsr);
 +
 +	critical_exit();
 +
 +	return (fpetable[(mxcsr & (mxcsr >> 16)) & 0x3f]);
 +}
 +
  /*
   * Implement device not available (DNA) exception
   *
 diff --git a/sys/amd64/amd64/trap.c b/sys/amd64/amd64/trap.c
 index 75e15e0..57d1cc2 100644
 --- a/sys/amd64/amd64/trap.c
 +++ b/sys/amd64/amd64/trap.c
 @@ -328,7 +328,7 @@ trap(struct trapframe *frame)
  			break;
  
  		case T_ARITHTRAP:	/* arithmetic trap */
 -			ucode = fputrap();
 +			ucode = fputrap_x87();
  			if (ucode == -1)
  				goto userout;
  			i = SIGFPE;
 @@ -442,7 +442,9 @@ trap(struct trapframe *frame)
  			break;
  
  		case T_XMMFLT:		/* SIMD floating-point exception */
 -			ucode = 0; /* XXX */
 +			ucode = fputrap_sse();
 +			if (ucode == -1)
 +				goto userout;
  			i = SIGFPE;
  			break;
  		}
 diff --git a/sys/amd64/include/fpu.h b/sys/amd64/include/fpu.h
 index 98a016b..7d0f0ea 100644
 --- a/sys/amd64/include/fpu.h
 +++ b/sys/amd64/include/fpu.h
 @@ -62,7 +62,8 @@ int	fpusetregs(struct thread *td, struct savefpu *addr,
  	    char *xfpustate, size_t xfpustate_size);
  int	fpusetxstate(struct thread *td, char *xfpustate,
  	    size_t xfpustate_size);
 -int	fputrap(void);
 +int	fputrap_sse(void);
 +int	fputrap_x87(void);
  void	fpuuserinited(struct thread *td);
  struct fpu_kern_ctx *fpu_kern_alloc_ctx(u_int flags);
  void	fpu_kern_free_ctx(struct fpu_kern_ctx *ctx);
 
 
 
 ----- End forwarded message -----
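
A handler could use si_trapno along the following lines to tell the two
trap sources apart. This is a sketch only: fpe_trap_source() is a
hypothetical helper, and the fallback value 6 for T_ARITHTRAP is an
assumption about FreeBSD's machine/trap.h (29 for T_XMMFLT is the trap
code quoted in the report). si_trapno is a FreeBSD siginfo_t member and
is not available on every platform, so the handler call is shown only in
a comment.

```c
#include <assert.h>
#include <string.h>

#if defined(__FreeBSD__) && (defined(__amd64__) || defined(__i386__))
#include <machine/trap.h>	/* T_ARITHTRAP, T_XMMFLT */
#else
#define	T_ARITHTRAP	6	/* x87 arithmetic trap (assumed value) */
#define	T_XMMFLT	29	/* SIMD FP exception, per the report */
#endif

/* Classify the hardware trap that produced a SIGFPE. */
static const char *
fpe_trap_source(int trapno)
{
	switch (trapno) {
	case T_ARITHTRAP:
		return ("x87");
	case T_XMMFLT:
		return ("sse");
	default:
		return ("other");
	}
}

/*
 * Inside a SA_SIGINFO handler on FreeBSD one would call, for example:
 *
 *	const char *src = fpe_trap_source(info->si_trapno);
 */
```

Keeping the classification in a plain function makes it usable both from
a handler and from later diagnostic code that only has the saved trap
number.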

From: Mark Linimon <linimon@lonesome.com>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: amd64/169927: siginfo, si_code for fpe errors when error
 occurs using the SSE math processor
Date: Tue, 17 Jul 2012 16:50:18 -0500

 ----- Forwarded message from Konstantin Belousov <kostikbel@gmail.com> -----
 
 Date: Tue, 17 Jul 2012 21:26:19 +0300
 From: Konstantin Belousov <kostikbel@gmail.com>
 To: Bruce Evans <brde@optusnet.com.au>
 Cc: freebsd-amd64@freebsd.org
 Subject: Re: amd64/169927: siginfo,
 	si_code for fpe errors when error occurs using the SSE math
 	processor
 
 On Tue, Jul 17, 2012 at 08:09:15PM +0300, Konstantin Belousov wrote:
 > The trap number is available to SA_SIGINFO handlers through the
 > si_trapno member of siginfo_t. I think this is the final argument for
 > having separate fputrap_{x87,sse} functions.
 > 
 > For amd64, SSE hardware is FPU, so I do not see much wrong with the name.
 > 
 > I changed fputrap_sse() according to your suggestion.
 
 Below is the actually tested patch. After it, I put the test program.
 
 diff --git a/sys/amd64/amd64/fpu.c b/sys/amd64/amd64/fpu.c
 index a7812b7..ae241ce 100644
 --- a/sys/amd64/amd64/fpu.c
 +++ b/sys/amd64/amd64/fpu.c
 @@ -73,6 +73,7 @@ __FBSDID("$FreeBSD$");
  #define	fxrstor(addr)		__asm __volatile("fxrstor %0" : : "m" (*(addr)))
  #define	fxsave(addr)		__asm __volatile("fxsave %0" : "=m" (*(addr)))
  #define	ldmxcsr(csr)		__asm __volatile("ldmxcsr %0" : : "m" (csr))
 +#define	stmxcsr(addr)		__asm __volatile("stmxcsr %0" : : "m" (*(addr)))
  
  static __inline void
  xrstor(char *addr, uint64_t mask)
 @@ -105,6 +106,7 @@ void	fnstsw(caddr_t addr);
  void	fxsave(caddr_t addr);
  void	fxrstor(caddr_t addr);
  void	ldmxcsr(u_int csr);
 +void	stmxcsr(u_int csr);
  void	xrstor(char *addr, uint64_t mask);
  void	xsave(char *addr, uint64_t mask);
  
 @@ -113,9 +115,6 @@ void	xsave(char *addr, uint64_t mask);
  #define	start_emulating()	load_cr0(rcr0() | CR0_TS)
  #define	stop_emulating()	clts()
  
 -#define GET_FPU_CW(thread) ((thread)->td_pcb->pcb_save->sv_env.en_cw)
 -#define GET_FPU_SW(thread) ((thread)->td_pcb->pcb_save->sv_env.en_sw)
 -
  CTASSERT(sizeof(struct savefpu) == 512);
  CTASSERT(sizeof(struct xstate_hdr) == 64);
  CTASSERT(sizeof(struct savefpu_ymm) == 832);
 @@ -514,11 +513,15 @@ static char fpetable[128] = {
  };
  
  /*
 - * Preserve the FP status word, clear FP exceptions, then generate a SIGFPE.
 + * Preserve the FP status word, clear FP exceptions for x87, then
 + * generate a SIGFPE.
 + *
 + * Clearing exceptions was necessary mainly to avoid IRQ13 bugs and is
 + * engraved in our i386 ABI.  We now depend on longjmp() restoring a
 + * usable state.  Restoring the state or examining it might fail if we
 + * didn't clear exceptions.
   *
 - * Clearing exceptions is necessary mainly to avoid IRQ13 bugs.  We now
 - * depend on longjmp() restoring a usable state.  Restoring the state
 - * or examining it might fail if we didn't clear exceptions.
 + * For SSE exceptions, the exceptions are not cleared.
   *
   * The error code chosen will be one of the FPE_... macros. It will be
   * sent as the second argument to old BSD-style signal handlers and as
 @@ -531,8 +534,9 @@ static char fpetable[128] = {
   * solution for signals other than SIGFPE.
   */
  int
 -fputrap()
 +fputrap_x87(void)
  {
 +	struct savefpu *pcb_save;
  	u_short control, status;
  
  	critical_enter();
 @@ -543,19 +547,40 @@ fputrap()
  	 * wherever they are.
  	 */
  	if (PCPU_GET(fpcurthread) != curthread) {
 -		control = GET_FPU_CW(curthread);
 -		status = GET_FPU_SW(curthread);
 +		pcb_save = curthread->td_pcb->pcb_save;
 +		control = pcb_save->sv_env.en_cw;
 +		status = pcb_save->sv_env.en_sw;
  	} else {
  		fnstcw(&control);
  		fnstsw(&status);
 +		fnclex();
  	}
  
 -	if (PCPU_GET(fpcurthread) == curthread)
 -		fnclex();
  	critical_exit();
  	return (fpetable[status & ((~control & 0x3f) | 0x40)]);
  }
  
 +int
 +fputrap_sse(void)
 +{
 +	u_int mxcsr;
 +
 +	critical_enter();
 +
 +	/*
 +	 * Unlike the x87 #MF handler, we do not clear
 +	 * exceptions from the mxcsr.
 +	 */
 +	if (PCPU_GET(fpcurthread) != curthread)
 +		mxcsr = curthread->td_pcb->pcb_save->sv_env.en_mxcsr;
 +	else
 +		stmxcsr(&mxcsr);
 +
 +	critical_exit();
 +
 +	return (fpetable[(mxcsr & (~mxcsr >> 7)) & 0x3f]);
 +}
 +
  /*
   * Implement device not available (DNA) exception
   *
 diff --git a/sys/amd64/amd64/trap.c b/sys/amd64/amd64/trap.c
 index 75e15e0..57d1cc2 100644
 --- a/sys/amd64/amd64/trap.c
 +++ b/sys/amd64/amd64/trap.c
 @@ -328,7 +328,7 @@ trap(struct trapframe *frame)
  			break;
  
  		case T_ARITHTRAP:	/* arithmetic trap */
 -			ucode = fputrap();
 +			ucode = fputrap_x87();
  			if (ucode == -1)
  				goto userout;
  			i = SIGFPE;
 @@ -442,7 +442,9 @@ trap(struct trapframe *frame)
  			break;
  
  		case T_XMMFLT:		/* SIMD floating-point exception */
 -			ucode = 0; /* XXX */
 +			ucode = fputrap_sse();
 +			if (ucode == -1)
 +				goto userout;
  			i = SIGFPE;
  			break;
  		}
 diff --git a/sys/amd64/include/fpu.h b/sys/amd64/include/fpu.h
 index 98a016b..7d0f0ea 100644
 --- a/sys/amd64/include/fpu.h
 +++ b/sys/amd64/include/fpu.h
 @@ -62,7 +62,8 @@ int	fpusetregs(struct thread *td, struct savefpu *addr,
  	    char *xfpustate, size_t xfpustate_size);
  int	fpusetxstate(struct thread *td, char *xfpustate,
  	    size_t xfpustate_size);
 -int	fputrap(void);
 +int	fputrap_sse(void);
 +int	fputrap_x87(void);
  void	fpuuserinited(struct thread *td);
  struct fpu_kern_ctx *fpu_kern_alloc_ctx(u_int flags);
  void	fpu_kern_free_ctx(struct fpu_kern_ctx *ctx);
 
 fpuex.c:
 
 /* $Id: fpuex.c,v 1.2 2012/07/17 18:22:54 kostik Exp kostik $ */
 #include <sys/types.h>
 #include <err.h>
 #include <errno.h>
 #include <signal.h>
 #include <stdint.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <ucontext.h>
 #if defined(__amd64__)
 #include <machine/fpu.h>
 #elif defined(__i386__)
 #include <machine/npx.h>
 #endif
 
 static void
 handler(int signo __unused, siginfo_t *info, void *v)
 {
 	ucontext_t *uap;
 	mcontext_t *mc;
 #if defined(__amd64__)
 	struct savefpu *sf;
 #elif defined(__i386__)
 	struct savexmm *sf;
 #endif
 
 	uap = v;
 	mc = &uap->uc_mcontext;
 	/* mc_fpstate is an opaque array; cast it to the save-area layout. */
 	sf = (void *)&mc->mc_fpstate;
 	printf("intr handler: trapno %d code %d cw 0x%04x sw 0x%04x mxcsr 0x%08x ip 0x%lx\n",
 	    info->si_trapno, info->si_code,
 #if defined(__amd64__)
 	    sf->sv_env.en_cw, sf->sv_env.en_sw, sf->sv_env.en_mxcsr,
 	    sf->sv_env.en_rip
 #elif defined(__i386__)
 	    sf->sv_env.en_cw, sf->sv_env.en_sw,
 	    sf->sv_env.en_mxcsr, (unsigned long)sf->sv_env.en_fip
 #endif
 	    );
 	exit(0);
 }
 
 double a[3];
 
 double x(void) __attribute__((noinline));
 double
 x(void)
 {
 
 	/* a[] has three elements, so the result goes in a[2]. */
 	a[2] = a[0] / a[1];
 	return (a[2]);
 }
 
 int
 main(void)
 {
 	struct sigaction sa;
 	uint32_t mxcsr;
 	uint16_t cw;
 
 	bzero(&sa, sizeof(sa));
 	sa.sa_sigaction = handler;
 	sa.sa_flags = SA_SIGINFO;
 	if (sigaction(SIGFPE, &sa, NULL) == -1)
 		err(1, "sigaction SIGFPE");
 
 	mxcsr = 0;
 	__asm __volatile("ldmxcsr %0" : : "m" (mxcsr));
 #ifdef __i386__
 	cw = 0;
 	__asm __volatile("fldcw %0" : : "m" (cw));
 #endif
 	a[0] = 1.0;
 	a[1] = 0.0;
 	x();
 	return (0);
 }
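
 The test program prints si_code as a bare number. A minimal sketch of a
 helper that turns it into the name of the matching POSIX FPE_* constant
 follows. The function name fpe_code_name() is hypothetical and is not
 part of fpuex.c; only the FPE_* constants from <signal.h> are assumed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/*
 * Hypothetical helper (not part of fpuex.c): translate the si_code
 * delivered with SIGFPE into the name of the POSIX FPE_* constant.
 */
static const char *
fpe_code_name(int code)
{
	switch (code) {
	case FPE_INTDIV:	/* integer divide by zero */
		return ("FPE_INTDIV");
	case FPE_INTOVF:	/* integer overflow */
		return ("FPE_INTOVF");
	case FPE_FLTDIV:	/* floating-point divide by zero */
		return ("FPE_FLTDIV");
	case FPE_FLTOVF:	/* floating-point overflow */
		return ("FPE_FLTOVF");
	case FPE_FLTUND:	/* floating-point underflow */
		return ("FPE_FLTUND");
	case FPE_FLTRES:	/* floating-point inexact result */
		return ("FPE_FLTRES");
	case FPE_FLTINV:	/* invalid floating-point operation */
		return ("FPE_FLTINV");
	case FPE_FLTSUB:	/* subscript out of range */
		return ("FPE_FLTSUB");
	default:		/* includes 0, the pre-patch SSE value */
		return ("unknown");
	}
}
```

 In the handler above, fpe_code_name(info->si_code) could be printed next
 to the numeric value. For the 1.0 / 0.0 division in x(), one would expect
 FPE_FLTDIV once the kernel patch is applied.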
 
 
 
 ----- End forwarded message -----
Responsible-Changed-From-To: freebsd-amd64->take 
Responsible-Changed-By: kib 
Responsible-Changed-When: Wed Jul 18 14:57:00 UTC 2012 
Responsible-Changed-Why:  


http://www.freebsd.org/cgi/query-pr.cgi?pr=169927 
Responsible-Changed-From-To: take->kib 
Responsible-Changed-By: kib 
Responsible-Changed-When: Wed Jul 18 14:57:17 UTC 2012 
Responsible-Changed-Why:  
Take. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=169927 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: amd64/169927: commit references a PR
Date: Wed, 18 Jul 2012 15:36:51 +0000 (UTC)

 Author: kib
 Date: Wed Jul 18 15:36:03 2012
 New Revision: 238597
 URL: http://svn.freebsd.org/changeset/base/238597
 
 Log:
   Add stmxcsr.
   
   Submitted by:	Ed Alley <wea llnl gov>
   PR:	  amd64/169927
   MFC after:	3 weeks
 
 Modified:
   head/sys/amd64/amd64/fpu.c
 
 Modified: head/sys/amd64/amd64/fpu.c
 ==============================================================================
 --- head/sys/amd64/amd64/fpu.c	Wed Jul 18 12:41:09 2012	(r238596)
 +++ head/sys/amd64/amd64/fpu.c	Wed Jul 18 15:36:03 2012	(r238597)
 @@ -73,6 +73,7 @@ __FBSDID("$FreeBSD$");
  #define	fxrstor(addr)		__asm __volatile("fxrstor %0" : : "m" (*(addr)))
  #define	fxsave(addr)		__asm __volatile("fxsave %0" : "=m" (*(addr)))
  #define	ldmxcsr(csr)		__asm __volatile("ldmxcsr %0" : : "m" (csr))
 +#define	stmxcsr(addr)		__asm __volatile("stmxcsr %0" : : "m" (*(addr)))
  
  static __inline void
  xrstor(char *addr, uint64_t mask)
 @@ -105,6 +106,7 @@ void	fnstsw(caddr_t addr);
  void	fxsave(caddr_t addr);
  void	fxrstor(caddr_t addr);
  void	ldmxcsr(u_int csr);
 +void	stmxcsr(u_int csr);
  void	xrstor(char *addr, uint64_t mask);
  void	xsave(char *addr, uint64_t mask);
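
 A note on the operand constraint: stmxcsr stores MXCSR into its memory
 operand, so in GCC-style inline assembly that operand is normally
 declared as an output ("=m"), whereas the macro above lists it as an
 input. The user-space sketch below shows the output form. The names
 read_mxcsr() and write_mxcsr() are hypothetical, and the non-x86 branch
 is only an assumed stand-in so that the sketch compiles elsewhere.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__) || (defined(__i386__) && defined(__SSE__))
/* Read MXCSR.  "=m" tells the compiler the instruction writes csr. */
static inline uint32_t
read_mxcsr(void)
{
	uint32_t csr;

	__asm__ __volatile__("stmxcsr %0" : "=m" (csr));
	return (csr);
}

/* Load MXCSR.  ldmxcsr only reads its operand, so "m" is an input. */
static inline void
write_mxcsr(uint32_t csr)
{
	__asm__ __volatile__("ldmxcsr %0" : : "m" (csr));
}
#else
/* Assumed fallback for non-x86 hosts: emulate the register in memory. */
static uint32_t fake_mxcsr = 0x1f80;

static inline uint32_t
read_mxcsr(void)
{
	return (fake_mxcsr);
}

static inline void
write_mxcsr(uint32_t csr)
{
	fake_mxcsr = csr;
}
#endif
```

 With an input-only constraint the compiler is free to assume that the
 variable is unchanged by the statement, which is why the output form is
 the safer description of what stmxcsr does.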
  
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
 

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: amd64/169927: commit references a PR
Date: Wed, 18 Jul 2012 15:43:58 +0000 (UTC)

 Author: kib
 Date: Wed Jul 18 15:43:47 2012
 New Revision: 238598
 URL: http://svn.freebsd.org/changeset/base/238598
 
 Log:
   On AMD64, provide siginfo.si_code for floating point errors when error
   occurs using the SSE math processor.  Update comments describing the
   handling of the exception status bits in coprocessors control words.
   
   Remove GET_FPU_CW and GET_FPU_SW macros which were used only once.
   Prefer to use curpcb to access pcb_save over the longer path of
   referencing pcb through the thread structure.
   
   Based on the submission by:	Ed Alley <wea llnl gov>
   PR:	  amd64/169927
   Reviewed by:	bde
   MFC after:	3 weeks
 
 Modified:
   head/sys/amd64/amd64/fpu.c
   head/sys/amd64/amd64/trap.c
   head/sys/amd64/include/fpu.h
 
 Modified: head/sys/amd64/amd64/fpu.c
 ==============================================================================
 --- head/sys/amd64/amd64/fpu.c	Wed Jul 18 15:36:03 2012	(r238597)
 +++ head/sys/amd64/amd64/fpu.c	Wed Jul 18 15:43:47 2012	(r238598)
 @@ -115,9 +115,6 @@ void	xsave(char *addr, uint64_t mask);
  #define	start_emulating()	load_cr0(rcr0() | CR0_TS)
  #define	stop_emulating()	clts()
  
 -#define GET_FPU_CW(thread) ((thread)->td_pcb->pcb_save->sv_env.en_cw)
 -#define GET_FPU_SW(thread) ((thread)->td_pcb->pcb_save->sv_env.en_sw)
 -
  CTASSERT(sizeof(struct savefpu) == 512);
  CTASSERT(sizeof(struct xstate_hdr) == 64);
  CTASSERT(sizeof(struct savefpu_ymm) == 832);
 @@ -516,11 +513,15 @@ static char fpetable[128] = {
  };
  
  /*
 - * Preserve the FP status word, clear FP exceptions, then generate a SIGFPE.
 + * Preserve the FP status word, clear FP exceptions for x87, then
 + * generate a SIGFPE.
 + *
 + * Clearing exceptions was necessary mainly to avoid IRQ13 bugs and is
 + * engraved in our i386 ABI.  We now depend on longjmp() restoring a
 + * usable state.  Restoring the state or examining it might fail if we
 + * didn't clear exceptions.
   *
 - * Clearing exceptions is necessary mainly to avoid IRQ13 bugs.  We now
 - * depend on longjmp() restoring a usable state.  Restoring the state
 - * or examining it might fail if we didn't clear exceptions.
 + * For SSE exceptions, the exceptions are not cleared.
   *
   * The error code chosen will be one of the FPE_... macros. It will be
   * sent as the second argument to old BSD-style signal handlers and as
 @@ -533,8 +534,9 @@ static char fpetable[128] = {
   * solution for signals other than SIGFPE.
   */
  int
 -fputrap()
 +fputrap_x87(void)
  {
 +	struct savefpu *pcb_save;
  	u_short control, status;
  
  	critical_enter();
 @@ -545,19 +547,33 @@ fputrap()
  	 * wherever they are.
  	 */
  	if (PCPU_GET(fpcurthread) != curthread) {
 -		control = GET_FPU_CW(curthread);
 -		status = GET_FPU_SW(curthread);
 +		pcb_save = PCPU_GET(curpcb)->pcb_save;
 +		control = pcb_save->sv_env.en_cw;
 +		status = pcb_save->sv_env.en_sw;
  	} else {
  		fnstcw(&control);
  		fnstsw(&status);
 +		fnclex();
  	}
  
 -	if (PCPU_GET(fpcurthread) == curthread)
 -		fnclex();
  	critical_exit();
  	return (fpetable[status & ((~control & 0x3f) | 0x40)]);
  }
  
 +int
 +fputrap_sse(void)
 +{
 +	u_int mxcsr;
 +
 +	critical_enter();
 +	if (PCPU_GET(fpcurthread) != curthread)
 +		mxcsr = PCPU_GET(curpcb)->pcb_save->sv_env.en_mxcsr;
 +	else
 +		stmxcsr(&mxcsr);
 +	critical_exit();
 +	return (fpetable[(mxcsr & (~mxcsr >> 7)) & 0x3f]);
 +}
 +
  /*
   * Implement device not available (DNA) exception
   *
 
 Modified: head/sys/amd64/amd64/trap.c
 ==============================================================================
 --- head/sys/amd64/amd64/trap.c	Wed Jul 18 15:36:03 2012	(r238597)
 +++ head/sys/amd64/amd64/trap.c	Wed Jul 18 15:43:47 2012	(r238598)
 @@ -328,7 +328,7 @@ trap(struct trapframe *frame)
  			break;
  
  		case T_ARITHTRAP:	/* arithmetic trap */
 -			ucode = fputrap();
 +			ucode = fputrap_x87();
  			if (ucode == -1)
  				goto userout;
  			i = SIGFPE;
 @@ -442,7 +442,9 @@ trap(struct trapframe *frame)
  			break;
  
  		case T_XMMFLT:		/* SIMD floating-point exception */
 -			ucode = 0; /* XXX */
 +			ucode = fputrap_sse();
 +			if (ucode == -1)
 +				goto userout;
  			i = SIGFPE;
  			break;
  		}
 
 Modified: head/sys/amd64/include/fpu.h
 ==============================================================================
 --- head/sys/amd64/include/fpu.h	Wed Jul 18 15:36:03 2012	(r238597)
 +++ head/sys/amd64/include/fpu.h	Wed Jul 18 15:43:47 2012	(r238598)
 @@ -62,7 +62,8 @@ int	fpusetregs(struct thread *td, struct
  	    char *xfpustate, size_t xfpustate_size);
  int	fpusetxstate(struct thread *td, char *xfpustate,
  	    size_t xfpustate_size);
 -int	fputrap(void);
 +int	fputrap_sse(void);
 +int	fputrap_x87(void);
  void	fpuuserinited(struct thread *td);
  struct fpu_kern_ctx *fpu_kern_alloc_ctx(u_int flags);
  void	fpu_kern_free_ctx(struct fpu_kern_ctx *ctx);
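
 The table index used by fputrap_sse() relies on the MXCSR layout: the
 six exception flags sit in bits 0-5 and the matching mask bits in bits
 7-12. Shifting the complement right by 7 lines up each "unmasked" bit
 with its flag, so the AND keeps only exceptions that are both raised and
 unmasked. A stand-alone sketch of that expression, with the hypothetical
 name sse_unmasked_pending():

```c
#include <assert.h>
#include <stdint.h>

#define	MXCSR_FLAG_BITS		0x3fu	/* flags IE, DE, ZE, OE, UE, PE */
#define	MXCSR_MASK_SHIFT	7	/* masks IM..PM start at bit 7 */

/*
 * Return the MXCSR exception flags that are raised and not masked.
 * This mirrors the expression (mxcsr & (~mxcsr >> 7)) & 0x3f.
 */
static uint32_t
sse_unmasked_pending(uint32_t mxcsr)
{
	uint32_t unmasked;

	/* A clear mask bit means the exception is enabled. */
	unmasked = ~mxcsr >> MXCSR_MASK_SHIFT;
	return ((mxcsr & unmasked) & MXCSR_FLAG_BITS);
}
```

 For example, with mxcsr = 0x1da4 the ZE and PE flags are both raised but
 only ZM is clear, so the function returns 0x04 (ZE alone). With the
 power-on value 0x1f80 plus any flags, every exception is masked and the
 result is 0.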
 
State-Changed-From-To: open->patched 
State-Changed-By: linimon 
State-Changed-When: Mon Jul 23 01:07:46 UTC 2012 
State-Changed-Why:  
Note that a fix has been committed. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=169927 
State-Changed-From-To: patched->closed 
State-Changed-By: kib 
State-Changed-When: Mon Jan 28 16:48:17 UTC 2013 
State-Changed-Why:  
Merged to 9. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=169927 
>Unformatted:
