From arundel@h3c.de  Fri Aug 12 13:06:16 2005
Return-Path: <arundel@h3c.de>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 79B5116A41F
	for <FreeBSD-gnats-submit@freebsd.org>; Fri, 12 Aug 2005 13:06:16 +0000 (GMT)
	(envelope-from arundel@h3c.de)
Received: from enterprise4.noxa.de (enterprise.noxa.de [212.60.197.71])
	by mx1.FreeBSD.org (Postfix) with ESMTP id ADE4E43D45
	for <FreeBSD-gnats-submit@freebsd.org>; Fri, 12 Aug 2005 13:06:15 +0000 (GMT)
	(envelope-from arundel@h3c.de)
Received: (qmail 5346 invoked from network); 12 Aug 2005 15:06:13 +0200
Received: from p508fe702.dip.t-dialin.net (HELO localhost.skatecity) (80.143.231.2)
  by enterprise.noxa.de with AES256-SHA encrypted SMTP; 12 Aug 2005 15:06:13 +0200
Received: from localhost.skatecity (nobody@localhost.skatecity [127.0.0.1])
	by localhost.skatecity (8.13.4/8.13.4) with ESMTP id j7CD6Iom034033
	for <FreeBSD-gnats-submit@freebsd.org>; Fri, 12 Aug 2005 15:06:18 +0200 (CEST)
	(envelope-from arundel@localhost.skatecity)
Received: (from arundel@localhost)
	by localhost.skatecity (8.13.4/8.13.4/Submit) id j7CD6Ir9034032;
	Fri, 12 Aug 2005 15:06:18 +0200 (CEST)
	(envelope-from arundel)
Message-Id: <200508121306.j7CD6Ir9034032@localhost.skatecity>
Date: Fri, 12 Aug 2005 15:06:18 +0200 (CEST)
From: No Name <arundel@h3c.de>
Reply-To: No name <arundel@h3c.de>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: i386_set_ioperm(2) timing issue
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         84842
>Category:       i386
>Synopsis:       i386_set_ioperm(2) timing issue
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    jhb
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Aug 12 13:10:28 GMT 2005
>Closed-Date:    Mon Sep 26 19:41:05 GMT 2005
>Last-Modified:  Tue Jul 27 21:40:21 UTC 2010
>Originator:     Alexander Best
>Release:        FreeBSD 6.0-BETA1 i386
>Organization:
>Environment:
System: FreeBSD skatecity 6.0-BETA1 FreeBSD 6.0-BETA1 #0: Mon Jul 18 03:00:45 CEST 2005 root@skatecity:/usr/obj/usr/src/sys/ARUNDEL i386


	
>Description:
	I'm using the sysarch syscall to set I/O permissions (yes, I know -libc
	is the recommended way of doing this instead of using syscalls). The
	I/O permissions, however, don't take effect immediately; they need some
	time to get set. This causes an I/O port access that is issued shortly
	after the sysarch(2) syscall with the i386_set_ioperm(2) argument to
	fail due to missing I/O port permissions:

	Bus error: 10 (core dumped)

	The problem will not occur when there is a heavy system load. Due to
	the multitasking nature of Unix the code execution will be delayed and
	thus the I/O permissions get set before the actual port access. However
	when the execution time of the app is close to realtime there won't be
	enough time between the success of the sysarch(2) syscall and the port
	request.

	This problem seems to occur under RELENG_6 as well as under
	5.4-STABLE.

>How-To-Repeat:
	Compile and execute the attached C app. Optimisation settings shouldn't
	have any effect on this matter.

>Fix:
	Maybe usleep(3) or nanosleep(2) in /usr/src/sys/i386/i386/sys_machdep.c
	 could solve the problem?

--- io.c begins here ---
#define PORT    0x378
#include <unistd.h>
#include <machine/sysarch.h>

static inline void outb (unsigned short int port, unsigned char val) {
        __asm__ volatile ("outb %0,%1\n"::"a" (val), "d" (port) );
}

int
main (int argc, char **argv) {

    struct i386_ioperm_args *args;
    struct i386_ioperm_args arg;
    args = &arg;
 
    args->start = PORT;
    args->length = 1;
    args->enable = 1;
 
    if(sysarch(I386_SET_IOPERM,args)) {
        printf("Error during syscall\n");
        exit(1);
    }

    else {

        /* sleep(1); <- With this delay the I/O port access works /*
        outb(0x378,0xFF);
        exit(0);
    }
}

//eof
--- io.c ends here ---


>Release-Note:
>Audit-Trail:

From: alexander <arundel@h3c.de>
To: bug-followup@FreeBSD.org, arundel@h3c.de
Cc:  
Subject: Re: i386/84842: i386_set_ioperm(2) timing issue
Date: Fri, 12 Aug 2005 15:26:37 +0200

 Oops. The app won't compile. Line 27 has a syntax error:
 
  - /* sleep(1); <- With this delay the I/O port access works /*
  + /* sleep(1); <- With this delay the I/O port access works */
 
 Sorry.
State-Changed-From-To: open->analyzed 
State-Changed-By: bde 
State-Changed-When: Fri Aug 12 22:33:10 GMT 2005 
State-Changed-Why:  
The problem seems to be that the TSS is not loaded by the syscall.  The 
i/o permissions bitmap is in the TSS and I think the TSS must be 
reloaded for the new bitmap to be seen.  The TSS is reloaded on the next 
context switch but doesn't seem to be loaded anywhere else in normal 
execution (it is also loaded at boot time and for vm86 BIOS calls and 
returns). 

Try adding an ltr(gsel_tss) near the end of i386_set_ioperm(). 

http://www.freebsd.org/cgi/query-pr.cgi?pr=84842 

From: Bruce Evans <bde@zeta.org.au>
To: Bruce Evans <bde@FreeBSD.org>
Cc: arundel@h3c.de, freebsd-gnats-submit@FreeBSD.org
Subject: Re: i386/84842: i386_set_ioperm(2) timing issue
Date: Sat, 13 Aug 2005 09:44:33 +1000 (EST)

 On Fri, 12 Aug 2005, Bruce Evans wrote:
 
 > The problem seems to be that the TSS is not loaded by the syscall.  The
 > i/o permissions bitmap is in the TSS and I think the TSS must be
 > reloaded for the new bitmap to be seen.  The TSS is reloaded on the next
 > context switch but doesn't seem to be loaded anywhere else in normal
 > execution (it is also loaded at boot time and for vm86 BIOS calls and
 > returns).
 >
 > Try adding an ltr(gsel_tss) near the end of i386_set_ioperm().
 
 Further reading and testing showed that the bug is a relatively new one
 and fixing it cleanly is not so easy.  The CPU apparently examines the
 permissions bitmap on every access (that's one reason i/o accesses are
 so slow :-), so the TSS [register] doesn't need to be reloaded after
 every change.  However, for the first access the bitmap is usually
 empty since the loading of the new TSS that holds the bitmap has been
 broken by an optimization.  From an old version of i386/sys_machdep.c:
 
 % int
 % i386_extend_pcb(struct thread *td)
 % {
 % ...
 % 	/* switch to the new TSS after syscall completes */
 % 	td->td_flags |= TDF_NEEDRESCHED;
 
 This was broken by optimizing null context switches.  Now the switch to
 the new TSS doesn't normally actually occur after the syscall completes,
 since the scheduler normally reschedules the same thread and mi_switch()
 now avoids calling cpu_switch() in this case.
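
 The effect described above can be illustrated with a toy user-space
 model.  This is a sketch, not kernel code: every toy_* name is
 hypothetical, and the only assumption carried over from the discussion
 is that the TSS is loaded in cpu_switch() and that mi_switch() skips
 cpu_switch() when the scheduler picks the same thread again.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a thread; tss_id names the TSS it wants. */
struct toy_thread {
    int tss_id;
};

/* Hypothetical stand-in for per-CPU state. */
struct toy_cpu {
    struct toy_thread *curthread;
    int loaded_tss_id;  /* TSS the CPU is actually using */
    int need_resched;   /* models TDF_NEEDRESCHED */
};

/* Models cpu_switch(): the only place this toy model loads a TSS. */
static void
toy_cpu_switch(struct toy_cpu *cpu, struct toy_thread *next)
{
    cpu->curthread = next;
    cpu->loaded_tss_id = next->tss_id;
}

/* Models mi_switch() with the null-switch optimization. */
static void
toy_mi_switch(struct toy_cpu *cpu, struct toy_thread *chosen)
{
    assert(cpu != NULL && chosen != NULL);
    cpu->need_resched = 0;
    if (chosen == cpu->curthread)
        return;         /* same thread: no switch, no TSS load */
    toy_cpu_switch(cpu, chosen);
}

/* Models i386_extend_pcb(): new private TSS, then ask for a reschedule. */
static void
toy_extend_pcb(struct toy_cpu *cpu, int new_tss_id)
{
    assert(cpu != NULL && cpu->curthread != NULL);
    cpu->curthread->tss_id = new_tss_id;
    cpu->need_resched = 1;
}
```

 In this model, calling toy_extend_pcb() and then toy_mi_switch() with
 the same thread leaves loaded_tss_id unchanged.  Switching to another
 thread and back updates it, which corresponds to the sleep(1) in the
 test program making the port access work.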
 
 % ...
 % }
 % 
 % static int
 % i386_set_ioperm(td, args)
 % 	struct thread *td;
 % 	char *args;
 % {
 % ..
 % 	if (td->td_pcb->pcb_ext == 0)
 % 		if ((error = i386_extend_pcb(td)) != 0)
 % 			return (error);
 % 	iomap = (char *)td->td_pcb->pcb_ext->ext_iomap;
 
 This extends the pcb on the first call to i386_set_ioperm() for a thread,
 so if the TSS were reloaded after the call as intended then the first
 call would be less broken than subsequent calls.
 
 % 
 % 	if (ua.start + ua.length > IOPAGES * PAGE_SIZE * NBBY)
 % 		return (EINVAL);
 % 
 % 	for (i = ua.start; i < ua.start + ua.length; i++) {
 % 		if (ua.enable) 
 % 			iomap[i >> 3] &= ~(1 << (i & 7));
 % 		else
 % 			iomap[i >> 3] |= (1 << (i & 7));
 % 	}
 
 Testing shows that an ltr() is not needed here.  It is sufficient to
 sleep (in the application) after the first call so that the new TSS
 gets loaded.  For subsequent calls, the TSS descriptor doesn't change,
 and the TSS apparently doesn't need to be reloaded to get changes to
 the contents of the TSS seen.
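
 The bit arithmetic of the quoted loop can be exercised in user space
 with a sketch like the following.  The toy_* names are hypothetical.
 Two things are assumptions of this sketch rather than facts taken from
 the kernel source: the map covers all 65536 x86 I/O ports and starts
 out with every bit set (all ports denied), and the range check is
 written to avoid unsigned overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed for this sketch: one bit for each of the 65536 x86 I/O ports. */
#define TOY_IOMAP_BITS  65536u
#define TOY_IOMAP_BYTES (TOY_IOMAP_BITS / 8u)

/* Deny every port: a set bit means "no access". */
static void
toy_iomap_init(uint8_t *iomap)
{
    assert(iomap != NULL);
    for (size_t i = 0; i < TOY_IOMAP_BYTES; i++)
        iomap[i] = 0xff;
}

/*
 * Same per-port bit arithmetic as the quoted loop: clear the bit to
 * enable a port, set it to disable.  Returns 0 on success, -1 if the
 * range does not fit in the map.
 */
static int
toy_set_ioperm(uint8_t *iomap, unsigned start, unsigned length, int enable)
{
    assert(iomap != NULL);
    if (start > TOY_IOMAP_BITS || length > TOY_IOMAP_BITS - start)
        return -1;
    for (unsigned i = start; i < start + length; i++) {
        if (enable)
            iomap[i >> 3] &= (uint8_t)~(1u << (i & 7));
        else
            iomap[i >> 3] |= (uint8_t)(1u << (i & 7));
    }
    return 0;
}

/* Nonzero if the bit for this port is clear, i.e. access is permitted. */
static int
toy_port_allowed(const uint8_t *iomap, unsigned port)
{
    assert(iomap != NULL && port < TOY_IOMAP_BITS);
    return (iomap[port >> 3] & (1u << (port & 7))) == 0;
}
```

 Enabling port 0x378 with length 1 clears exactly one bit, so the
 neighbouring ports 0x377 and 0x379 stay denied.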
 
 % 	return (error);
 % }
 
 For a quick fix, a tsleep(..., 1) in i386_extend_pcb() should work as well
 as a sleep in the application (unless/until someone modifies short sleeps
 to not switch).
 
 Bruce

From: John Baldwin <jhb@FreeBSD.org>
To: bug-followup@FreeBSD.org,
 arundel@h3c.de
Cc: bde@FreeBSD.org
Subject: Re: i386/84842: i386_set_ioperm(2) timing issue
Date: Tue, 16 Aug 2005 14:12:09 -0400

 What about replacing the setting of TDF_NEEDRESCHED in i386_extend_pcb() 
 with a call to ltr()?  Actually, it takes more work than an ltr() as you have 
 to update the TSS descriptor in the GDT for the current CPU before you do the 
 ltr().  Maybe something like this:
 
 Index: i386/sys_machdep.c
 ===================================================================
 RCS file: /usr/cvs/src/sys/i386/i386/sys_machdep.c,v
 retrieving revision 1.102
 diff -u -r1.102 sys_machdep.c
 --- i386/sys_machdep.c	23 Jun 2005 21:56:45 -0000	1.102
 +++ i386/sys_machdep.c	16 Aug 2005 18:08:49 -0000
 @@ -267,9 +267,11 @@
  	KASSERT(td->td_pcb->pcb_ext == 0, ("already have a TSS!"));
  	mtx_lock_spin(&sched_lock);
  	td->td_pcb->pcb_ext = ext;
 -	
 -	/* switch to the new TSS after syscall completes */
 -	td->td_flags |= TDF_NEEDRESCHED;
 +
 +	/* Switch to the new TSS. */
 +	private_tss |= PCPU_GET(cpumask);
 +	*PCPU_GET(tss_gdt) = ext->ext_tssd;
 +	ltr(GSEL(GPROC0_SEL, SEL_KPL));
  	mtx_unlock_spin(&sched_lock);
  
  	return 0;
 Index: include/pcb_ext.h
 ===================================================================
 RCS file: /usr/cvs/src/sys/i386/include/pcb_ext.h,v
 retrieving revision 1.9
 diff -u -r1.9 pcb_ext.h
 --- include/pcb_ext.h	20 Mar 2002 05:48:58 -0000	1.9
 +++ include/pcb_ext.h	16 Aug 2005 18:10:11 -0000
 @@ -44,6 +44,7 @@
  };
  
  #ifdef _KERNEL
 +int private_tss;
  
  int i386_extend_pcb(struct thread *);
  
 
 -- 
 John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
 "Power Users Use the Power to Serve"  =  http://www.FreeBSD.org

From: Bruce Evans <bde@zeta.org.au>
To: John Baldwin <jhb@freebsd.org>
Cc: freebsd-gnats-submit@freebsd.org, arundel@h3c.de
Subject: Re: i386/84842: i386_set_ioperm(2) timing issue
Date: Wed, 17 Aug 2005 14:19:23 +1000 (EST)

 On Tue, 16 Aug 2005, John Baldwin wrote:
 
 > What about replacing the setting of TDF_NEEDRESCHED in i386_extend_pcb()
 > with a call to ltr()?  Actually, it takes more work than an ltr() as you have
 
 That would be best if it works.  I haven't checked if more magic in
 cpu_switch() is needed.
 
 > to update the TSS descriptor in the GDT for the current CPU before you do the
 > ltr().  Maybe something like this:
 
 There is certainly more magic...
 
 > Index: i386/sys_machdep.c
 > ===================================================================
 > RCS file: /usr/cvs/src/sys/i386/i386/sys_machdep.c,v
 > retrieving revision 1.102
 > diff -u -r1.102 sys_machdep.c
 > --- i386/sys_machdep.c	23 Jun 2005 21:56:45 -0000	1.102
 > +++ i386/sys_machdep.c	16 Aug 2005 18:08:49 -0000
 > @@ -267,9 +267,11 @@
 > 	KASSERT(td->td_pcb->pcb_ext == 0, ("already have a TSS!"));
 > 	mtx_lock_spin(&sched_lock);
 > 	td->td_pcb->pcb_ext = ext;
 > -
 > -	/* switch to the new TSS after syscall completes */
 > -	td->td_flags |= TDF_NEEDRESCHED;
 > +
 > +	/* Switch to the new TSS. */
 > +	private_tss |= PCPU_GET(cpumask);
 > +	*PCPU_GET(tss_gdt) = ext->ext_tssd;
 > +	ltr(GSEL(GPROC0_SEL, SEL_KPL));
 > 	mtx_unlock_spin(&sched_lock);
 >
 > 	return 0;
 
 This seems OK except the comment in it should be moved earlier so that
 the block of code for the switch includes both the lock and the unlock.
 I think the setting of pcb_ext doesn't belong in this block.
 
 The first KASSERT before this code seems to have been broken by KSE.
 We assert that td->td_proc != curproc, but pcb_ext is per-thread so
 we need td != curthread else we clobber another thread's tss.  If
 the setting of pcb_ext actually needs sched_lock, then the second
 KASSERT before this code is broken too since it reads an unlocked
 pcb_ext.
 
 The ltr() call is missing the style bug that all other callers of ltr()
 in *.c have.  The other callers laboriously copy the constant arg to
 a local variable.  This seems to have always been unnecessary.  The
 ltr instruction doesn't work with immediate operands, but assembler
 code has always accessed the arg as a non-immediate.  ltr() used to
 be implemented in support.s; then the access was to a local variable
 on the stack, so there were 2 logical copies of the constant in memory,
 first for the local variable and second for the copy of this on the
 stack, and probably 1 copy in memory in practice (gcc should push the
 constant directly onto the stack).  Now ltr() is implemented in cpufunc.h
 and it is missing support for memory accesses, so the constant is
 probably loaded directly into a register and accessed from there
 whether or not it is to a local variable.
 
 I use the following fixes for some wrong and incomplete constraints:
 
 %%%
 Index: cpufunc.h
 ===================================================================
 RCS file: /home/ncvs/src/sys/i386/include/cpufunc.h,v
 retrieving revision 1.142
 diff -u -2 -r1.142 cpufunc.h
 --- cpufunc.h	7 Apr 2004 20:46:05 -0000	1.142
 +++ cpufunc.h	8 Apr 2004 13:22:43 -0000
 @@ -476,5 +481,5 @@
   lidt(struct region_descriptor *addr)
   {
 -	__asm __volatile("lidt (%0)" : : "r" (addr));
 +	__asm __volatile("lidt %0" : : "m" (*(char *)(void *)addr)); /* XXX */
   }
 
 @@ -482,5 +487,5 @@
   lldt(u_short sel)
   {
 -	__asm __volatile("lldt %0" : : "r" (sel));
 +	__asm __volatile("lldt %0" : : "rm" (sel));
   }
 
 @@ -488,5 +493,5 @@
   ltr(u_short sel)
   {
 -	__asm __volatile("ltr %0" : : "r" (sel));
 +	__asm __volatile("ltr %0" : : "rm" (sel));
   }
 
 %%%
 
 lidt() is wrong and the others are incomplete.  The lidt instruction
 only takes memory operands but we trick both the compiler and the
 assembler to use a pointer to the memory with the pointer constrained
 to a register.  The operand should be simply (*addr), but
 "struct region_descriptor" is normally incompletely declared here so
 I use a bogus cast to avoid dereferencing the whole struct, although
 this defeats most of the point of the fix :(.  I think the __volatile
 declaration ensures no problems in practice.  Here we only (?) need
 asm to not be moved to before the initialization of the full *addr.
 We use __volatile for all asms in cpufunc.h although it is documented
 to be unnecessary for asms with no output operands like the above.
 peter@ wonders if depending on this behaviour in similar "*addr"
 asms is what causes mysterious warnings (broken panics) in amd64
 npx^Wfpu context switching.
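
 The difference between the "r" and "rm" constraints can be tried in
 user space with a sketch like the one below.  ltr itself is privileged,
 so this uses movw as a harmless stand-in that also accepts a register
 or memory source operand.  The toy_* helpers are hypothetical, and on
 compilers or CPUs without GCC-style x86 inline asm they fall back to
 plain C.

```c
/*
 * Hypothetical helper using the "r" constraint: the compiler must
 * place sel in a register before the instruction runs.
 */
static inline unsigned short
toy_copy16_r(unsigned short sel)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned short out;

    __asm__ __volatile__("movw %1, %0" : "=r" (out) : "r" (sel));
    return out;
#else
    return sel;     /* fallback: no x86 inline asm available */
#endif
}

/*
 * Hypothetical helper using the "rm" constraint: the compiler may use
 * sel from a register or directly from memory, whichever it prefers.
 */
static inline unsigned short
toy_copy16_rm(unsigned short sel)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned short out;

    __asm__ __volatile__("movw %1, %0" : "=r" (out) : "rm" (sel));
    return out;
#else
    return sel;     /* fallback: no x86 inline asm available */
#endif
}
```

 Both helpers return the same value.  The constraint only changes which
 operand forms the compiler is allowed to pick, which can be checked by
 comparing the generated assembly (for example with cc -S).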
 
 Bruce
State-Changed-From-To: analyzed->patched 
State-Changed-By: jhb 
State-Changed-When: Thu Sep 15 17:30:24 GMT 2005 
State-Changed-Why:  
Fix committed to HEAD and will be MFC'd to 6.0 hopefully before the release. 


Responsible-Changed-From-To: freebsd-i386->jhb 
Responsible-Changed-By: jhb 
Responsible-Changed-When: Thu Sep 15 17:30:24 GMT 2005 
Responsible-Changed-Why:  
Fix committed to HEAD and will be MFC'd to 6.0 hopefully before the release. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=84842 
State-Changed-From-To: patched->closed 
State-Changed-By: jhb 
State-Changed-When: Mon Sep 26 19:40:50 GMT 2005 
State-Changed-Why:  
Fix merged to 5.x and 6.x. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=84842 
>Unformatted:
