From keith@dino.mithy.org  Mon Oct 23 12:05:24 2000
Return-Path: <keith@dino.mithy.org>
Received: from murphys-outbound.servers.plus.net (unknown [212.159.14.225])
	by hub.freebsd.org (Postfix) with SMTP id A4D1E37B479
	for <FreeBSD-gnats-submit@freebsd.org>; Mon, 23 Oct 2000 12:05:23 -0700 (PDT)
Received: (qmail 20234 invoked from network); 23 Oct 2000 19:05:07 -0000
Received: from unknown (HELO dino.mithy.org) (212.159.30.31)
  by murphys with SMTP; 23 Oct 2000 19:05:07 -0000
Received: from celery.mithy.org (celery [10.0.0.3])
	by dino.mithy.org (8.11.1/8.11.1) with ESMTP id e9NJ51h01003
	for <FreeBSD-gnats-submit@freebsd.org>; Mon, 23 Oct 2000 20:05:02 +0100 (BST)
	(envelope-from keith@dino.mithy.org)
Received: (from keith@localhost)
	by celery.mithy.org (8.11.1/8.11.1) id e9NJ52B01139;
	Mon, 23 Oct 2000 20:05:02 +0100 (BST)
	(envelope-from keith)
Message-Id: <200010231905.e9NJ52B01139@celery.mithy.org>
Date: Mon, 23 Oct 2000 20:05:02 +0100 (BST)
From: Keith Jones <keith@mithy.org>
Sender: keith@dino.mithy.org
Reply-To: keith@mithy.org
To: FreeBSD-gnats-submit@freebsd.org
Subject: [RARE] cross-compiled static bins in /usr/obj/usr/src/i386 can cause Signal 4 during make installworld
X-Send-Pr-Version: 3.2

>Number:         22256
>Category:       bin
>Synopsis:       [RARE] cross-compiled static bins in /usr/obj/usr/src/i386 can cause Signal 4 during make installworld
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    marcel
>State:          closed
>Quarter:        
>Keywords:       
>Date-Required:  
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Oct 23 12:10:01 PDT 2000
>Closed-Date:    Tue Nov 21 00:10:11 PST 2000
>Last-Modified:  Tue Nov 21 00:46:11 PST 2000
>Originator:     Keith Jones
>Release:        FreeBSD 4.1.1-STABLE i386
>Organization:
no
>Environment:

	MACHINE #1

	uname output:
	FreeBSD celery.mithy.org 4.1.1-STABLE
	FreeBSD 4.1.1-STABLE #0: Sat Oct 21 19:24:18 BST 2000
	root@celery.mithy.org:/usr/obj/usr/src/sys/CELERY  i386

	cpu/mem (dmesg output):
	CPU: Pentium II/Pentium II Xeon/Celeron (400.91-MHz 686-class CPU)
	Origin = "GenuineIntel"  Id = 0x665  Stepping = 5
	Features=0x183f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,
		MCA,CMOV,PAT,PSE36,MMX,FXSR>
	real memory  = 268435456 (262144K bytes)
	avail memory = 257970176 (251924K bytes)

	MACHINE #2

	uname output:
	FreeBSD dino.mithy.org 4.1.1-STABLE
	FreeBSD 4.1.1-STABLE #0: Mon Oct 23 03:32:15 BST 2000
	root@celery.mithy.org:/usr/obj/usr/src/sys/DINO.syscons  i386

	cpu/mem (dmesg output):
	CPU: i386DX (386-class CPU)
	real memory  = 8388608 (8192K bytes)
	avail memory = 6225920 (6080K bytes)

>Description:

	I'm not really sure which category this belongs in; 'bin' seems
	to be the likeliest bet. Apologies if this is in error.

	To remake world on #2, it is first built on #1 as compilation time
	is far too long and available memory probably isn't up to the job.

	'make world' was previously run on machine #1 with CFLAGS options
	'-march=i686 -mcpu=i686', for obvious performance reasons.

	Before performing 'make buildworld' with machine #2 as the intended
	target system, these options were changed to '-march=i386 -mcpu=i386'.

	The installation is performed in the following stages (assume
	single-user mode for machine #2):

	1.  a 'make buildworld buildkernel KERNEL=DINO' is run on machine #1.
	2.  /usr/src and /usr/obj are NFS-mounted onto machine #2.
	3.  'make installkernel KERNEL=DINO' is run on machine #2.
	4.  'make installworld' is run on machine #2.
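
The four stages above can be sketched as two shell functions, one per machine. This is an illustration only and not part of the original report: the function names and the SRCDIR/OBJDIR variables are hypothetical, the host name celery is taken from the environment section, and it assumes machine #1 exports /usr/src and /usr/obj over NFS.

```shell
# Hypothetical helpers; only the make targets and KERNEL=DINO come from the report.
SRCDIR=${SRCDIR:-/usr/src}
OBJDIR=${OBJDIR:-/usr/obj}

# Stage 1, run on machine #1 (the fast build host).
build_on_fast_host() {
    (cd "$SRCDIR" && make buildworld buildkernel KERNEL=DINO)
}

# Stages 2-4, run on machine #2 in single-user mode.
# $1 is the build host's name; it must export /usr/src and /usr/obj (assumption).
install_on_target() {
    build_host=${1:-celery}
    mount -t nfs "$build_host:/usr/src" "$SRCDIR" &&
    mount -t nfs "$build_host:/usr/obj" "$OBJDIR" &&
    (cd "$SRCDIR" && make installkernel KERNEL=DINO && make installworld)
}
```

Nothing runs until one of the functions is called, for example `install_on_target celery` on machine #2.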

	In the case above (which is admittedly rare), this results in a
	Signal 4 (Illegal Instruction) during 'make installworld' when the
	'strip' command is called (notably by 'install' with the '-s' option)
	as 'make installworld' will attempt to use the broken binaries in the
	/usr/obj/usr/src/i386 tree.
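
One way to confirm that a failing tool really died from an illegal instruction is to look at its exit status. A minimal sketch, assuming a Bourne-style shell that reports death by signal N as status 128+N (so Signal 4 shows up as 132); the function name is hypothetical and the check says nothing about which instruction was at fault.

```shell
# Succeeds only if the given command was killed by SIGILL (signal 4).
# Assumption: the shell maps "killed by signal N" to exit status 128+N,
# which holds for sh, bash and dash but not for every shell.
killed_by_sigill() {
    status=0
    "$@" >/dev/null 2>&1 || status=$?
    [ "$status" -eq 132 ]
}

# Possible use on machine #2 (the scratch file name is made up):
#   killed_by_sigill /usr/obj/usr/src/i386/usr/bin/strip /tmp/scratch-binary \
#       && echo "strip hit an illegal instruction"
```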

	I suspect that what is happening during build time is this: a number
	of static binaries in /usr/obj/usr/src/i386 are being linked with the
	(possibly older, or even incompatible) libraries resident in /usr/lib
	at the time the build took place, rather than the newly-compiled
	libraries in /usr/obj. I suspect therefore that this is a bug in the
	'make buildworld' process.

	This may affect other binaries used during the install process (though
	I must confess I didn't notice any problems with any of the others).
	It is not expected to affect any binaries on the target system once
	installation is complete, as the new binaries will be (dynamically
	or statically) linked with the correct libraries during the
	installation process (they are not statically linked during the
	build process).

>How-To-Repeat:

	See above. (You'll need two machines, I suspect.)

>Fix:

	The workaround is to copy the existing (static) 'strip' command
	from Machine #2's /usr/bin to /usr/obj/usr/src/i386/usr/bin prior
	to performing 'make installworld'.
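
The workaround can be sketched as a small shell function, to be run on machine #2 before 'make installworld'. The function name is hypothetical, the default paths are the ones named in this report, and keeping a '.orig' copy of the cross-built binary is an addition that the report does not mention. It assumes /usr/obj is mounted read-write.

```shell
# Swap the cross-built strip for the target machine's own copy.
# $1: strip binary known to run on the target (default /usr/bin/strip)
# $2: directory holding the cross-built tools
use_target_strip() {
    target_strip=${1:-/usr/bin/strip}
    obj_bin=${2:-/usr/obj/usr/src/i386/usr/bin}
    # Keep the cross-built binary so the change can be undone (an addition).
    cp -p "$obj_bin/strip" "$obj_bin/strip.orig" &&
    cp -p "$target_strip" "$obj_bin/strip"
}
```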

	A permanent fix may be to include 'strip' in the list of files to be
	copied to '/tmp/install.NNN' (where NNN is the PID) during the install
	phase, and remove it from the list of static binaries built in
	/usr/obj/usr/src/i386 during the build phase.


>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: freebsd-bugs->marcel 
Responsible-Changed-By: johan 
Responsible-Changed-When: Wed Oct 25 11:29:58 PDT 2000 
Responsible-Changed-Why:  
Marcel, can you have a quick look at this and maybe add 'strip' 
to the copied programs in installworld. 

http://www.freebsd.org/cgi/query-pr.cgi?pr=22256 

From: Marcel Moolenaar <marcel@cup.hp.com>
To: johan@FreeBSD.org
Cc: freebsd-bugs@FreeBSD.org
Subject: Re: bin/22256: [RARE] cross-compiled static bins in 
 /usr/obj/usr/src/i386 can cause Signal 4 during make installworld
Date: Wed, 25 Oct 2000 15:30:11 -0400

 johan@FreeBSD.org wrote:
 > 
 > Synopsis: [RARE] cross-compiled static bins in /usr/obj/usr/src/i386 can cause Signal 4 during make installworld
 > 
 > Responsible-Changed-From-To: freebsd-bugs->marcel
 > Responsible-Changed-By: johan
 > Responsible-Changed-When: Wed Oct 25 11:29:58 PDT 2000
 > Responsible-Changed-Why:
 > Marcel, can you have a quick look at this and maybe add 'strip'
 > to the copied programs in installworld.
 > 
 > http://www.freebsd.org/cgi/query-pr.cgi?pr=22256
 
 There's no bug in the build process that I know of. The tools in
 /usr/obj that are specifically compiled to run on the build machine need
 to be linked against the libraries in /usr/lib. In this case the build
 machine is incompatible with the install machine, simply because the
 binaries and libraries were optimized for the build machine and are
 therefore unusable on the install machine. This is just as bad as building on
 Alpha and installing on IA-32.
 
 To make the build process a bit more resistant, we can force bootstrap,
 build and cross tools to be dynamically linked instead of statically linked.
 
 We should not add strip to installworld. Think about what would happen
 if we did a source upgrade on FreeBSD 2.2.5 and basically install ELF over
 aout. We would be using an aout strip on ELF files...
 
 -- 
 Marcel Moolenaar
   mail: marcel@cup.hp.com / marcel@FreeBSD.org
   tel:  (408) 447-4222


From: Keith Jones <keith@mithy.org>
To: Marcel Moolenaar <marcel@cup.hp.com>
Cc: freebsd-bugs@FreeBSD.ORG, johan@FreeBSD.ORG
Subject: Re: bin/22256: [RARE] cross-compiled static bins in /usr/obj/usr/src/i386 can cause Signal 4 during make installworld
Date: Wed, 25 Oct 2000 22:33:24 +0100 (23:33 CEST)


On Wed, Oct 25, 2000 at 03:30:11PM -0400, Marcel Moolenaar wrote:
> johan@FreeBSD.org wrote:
> > 
> > Synopsis: [RARE] cross-compiled static bins in /usr/obj/usr/src/i386 can cause Signal 4 during make installworld
> > 
> > Responsible-Changed-From-To: freebsd-bugs->marcel
> > Responsible-Changed-By: johan
> > Responsible-Changed-When: Wed Oct 25 11:29:58 PDT 2000
> > Responsible-Changed-Why:
> > Marcel, can you have a quick look at this and maybe add 'strip'
> > to the copied programs in installworld.
> > 
> > http://www.freebsd.org/cgi/query-pr.cgi?pr=22256
> 
> There's no bug in the build process that I know of. The tools in
> /usr/obj that are specifically compiled to run on the build machine need
> to be linked against the libraries in /usr/lib. In this case the build
> machine is incompatible with the install machine, simply because the
> binaries and libraries were optimized for the build machine and by that
> unusable on the install machine. This is just as bad as building on
> Alpha and installing on IA-32.

Actually I don't see what's wrong with that, if your target machine is
sufficiently slow that you need to do so. If you're able to supply the
relevant architecture options in /etc/make.conf, the build/install process
(ideally) ought to be able to cope with this. Kernel cross-compilation from
one architecture to another is (AFAICT) simply a case of setting the kernel
config file up correctly. Likewise most of the ports and 'make' options.
It ought to be the same for 'make buildworld'.

Therefore, I would still maintain that it is a bug, albeit a low-priority
one.

> To make the build process a bit more resistant, we can force bootstrap,
> build and cross tools to be dynamicly linked instead of staticly linked.

That's possible, though it would be "nicer" if the binaries could be (either
statically or dynamically) linked against the libraries that are built in
/usr/obj during 'make buildworld' by supplying the appropriate library path
flags to 'ld'.

> We should not add strip to installworld. Think about what would happen
> if we did a source upgrade on FreeBSD 2.2.5 and basicly install ELF over
> aout. We would be using an aout strip on ELF files...

That I'll grant; it was only one possible option.

> -- 
> Marcel Moolenaar
>   mail: marcel@cup.hp.com / marcel@FreeBSD.org
>   tel:  (408) 447-4222
> 

Kind regards

Keith


From: Keith Jones <keith@mithy.org>
To: Marcel Moolenaar <marcel@cup.hp.com>
Cc: johan@FreeBSD.ORG, freebsd-bugs@FreeBSD.ORG
Subject: Re: bin/22256: [RARE] cross-compiled static bins in /usr/obj/usr/src/i386 can cause Signal 4 during make installworld
Date: Wed, 25 Oct 2000 22:51:33 +0100 (23:51 CEST)

Hi again,


Apologies for the double post, I just had a further thought on this, based
on the fact that 'strip' is required by both the build and install processes.

Would it be possible to add an extra stage to the buildworld process whereby
'strip', and any likewise affected binaries that are currently probably being
copied into /tmp/install.XXX during the install process itself, are recompiled
_specifically for the target system_, but installed in an entirely different
directory (e.g. /usr/obj/tmp.install)?

If this path is then included in the install process _only_, voila! no more
Signal 4 errors. _And_ you might even be able to cross-compile between Alpha
and IA-32.


Kind regards

Keith


From: Bruce Evans <bde@zeta.org.au>
To: Marcel Moolenaar <marcel@cup.hp.com>
Cc: johan@FreeBSD.ORG, freebsd-bugs@FreeBSD.ORG
Subject: Re: bin/22256: [RARE] cross-compiled static bins in  /usr/obj/usr/src/i386
 can cause Signal 4 during make installworld
Date: Thu, 26 Oct 2000 17:32:47 +1100 (EST) (08:32 CEST)

On Wed, 25 Oct 2000, Marcel Moolenaar wrote:

> There's no bug in the build process that I know of. The tools in
> /usr/obj that are specifically compiled to run on the build machine need
> to be linked against the libraries in /usr/lib. In this case the build
> machine is incompatible with the install machine, simply because the
> binaries and libraries were optimized for the build machine and by that
> unusable on the install machine. This is just as bad as building on
> Alpha and installing on IA-32.
> 
> To make the build process a bit more resistant, we can force bootstrap,
                                  ^^^^ less
> build and cross tools to be dynamicly linked instead of staticly linked.
> 
> We should not add strip to installworld. Think about what would happen
> if we did a source upgrade on FreeBSD 2.2.5 and basicly install ELF over
> aout. We would be using an aout strip on ELF files...

Think about what would happen if:
(1) the build machine doesn't have any shared libraries.
(2) the install machine doesn't have any shared libraries.  I think it will
    have them by the time strip is run, but they will be the new ones for
    the install machine, and the cross-strip is linked to old ones for the
    build machine.

I think adding the new strip to installworld would work, especially if
it is statically linked.  Similarly for all (?) the other binaries copied
by installworld.  We already depend on the new /bin/sh working as soon as
it is installed, and if /bin/sh works then other non-binary utilities are
likely to work.  It is just considered safer to use the existing utilities.

Bruce


From: Marcel Moolenaar <marcel@cup.hp.com>
To: Bruce Evans <bde@zeta.org.au>
Cc: johan@FreeBSD.ORG, freebsd-bugs@FreeBSD.ORG
Subject: Re: bin/22256: [RARE] cross-compiled static bins in
 /usr/obj/usr/src/i386 can cause Signal 4 during make installworld
Date: Thu, 26 Oct 2000 13:48:03 -0400 (19:48 CEST)

Bruce Evans wrote:
> 
> > To make the build process a bit more resistant, we can force bootstrap,
>                                   ^^^^ less
                                         ^^^^ more
> > build and cross tools to be dynamicly linked instead of staticly linked.
                                ^^^^^^^^^ staticly          ^^^^^^^^ dynamicly

I noticed I switched two crucial words :-/

Building dynamically does make it less resistant.

> I think adding the new strip to installworld would work, especially if
> it is statically linked.  Similarly for all (?) the other binaries copied
> by installworld.  We already depend on the new /bin/sh working as soon as
> it is installed, and if /bin/sh works then other non-binary utilities are
> likely to work.  It is just considered safer to use the existing utilities.

Yes, it is safer to use the existing utilities. That is the reason why
our bootstrap phase is kept to a minimum. There are exceptions. In this
case I think strip is one of them. We can't depend on strip being able
to work on ELF. That would break upgrading from pre-3.0 aout systems. It
would also break our upgrading if for some unexpected reason we don't
use ELF anymore or there's an incompatibility in ELF. Since we already
built strip as part of the cross-tools, the strip in /usr/obj is capable
of running on the build-machine and therefore the safest version to use
(ie it is known to work with the object format of the target-machine).

Another reason why we should use strip from /usr/obj is that we don't
disallow cross-installations that way. For example, we can run "make
installworld MACHINE_ARCH=alpha" on an i386, and with DESTDIR set to some
mountpoint. Normally this probably isn't the preferred way to install
systems, but it might be advantageous for installing onto removable media
when there's a single, fast build machine. Making this impossible by
design is more wrong than it being impossible by circumstances :-)

You're right about /bin/sh though. We should not install the new /bin/sh
until after we've installed everything. Doing this will probably be ugly
due to all the special casing. One way to reduce the pressure on /bin/sh
is by having make use the one saved by the install target. Backward
compatibility is an issue IIRC, because changing the shell in the
makefile was broken at some time (still is? was it ever?). Anyway, this
is still an open issue.



-- 
Marcel Moolenaar

 
 
State-Changed-From-To: open->closed 
State-Changed-By: marcel 
State-Changed-When: Tue Nov 21 00:10:11 PST 2000 
State-Changed-Why:  
Mixing libraries and binaries on different processor models 
within the same architecture is only supported if the 
libraries and binaries are not optimized for the specific 
processors. Future enhancements that rebuild the tools necessary
for installation should allow this to a certain extent. No
plans currently exist to add such enhancements. 

This PR is closed to avoid confusion. The underlying idea is 
good, but the suggested fix is not. Alternatives exist to 
build on machine A and install on machine B, without running 
anything on machine B (i.e. set DESTDIR to an NFS-mounted FS and
run the install on machine A). 
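
The DESTDIR alternative can be sketched as follows. This is a minimal illustration, not a tested procedure: the function name and the /mnt/dino mount point are hypothetical, and it assumes machine B's root file system is already NFS-mounted on machine A with write access for root.

```shell
# Run on machine A (the build host), so no binary has to execute on machine B.
# $1: where machine B's root file system is mounted (made-up default)
# $2: source tree (default /usr/src)
install_via_destdir() {
    destdir=${1:-/mnt/dino}
    srcdir=${2:-/usr/src}
    if [ ! -d "$destdir" ]; then
        echo "DESTDIR $destdir is not a directory" >&2
        return 1
    fi
    (cd "$srcdir" && make installworld DESTDIR="$destdir")
}
```

Because the install tools run on the machine they were built for, the cross-built 'strip' never executes on the 386.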

If you think this PR is closed in error, let me know. 


http://www.freebsd.org/cgi/query-pr.cgi?pr=22256 
>Unformatted:
