Newsgroups: comp.arch
Path: utzoo!utgpu!watserv1!watdragon!rose!ccplumb
From: ccplumb@rose.uwaterloo.ca (Colin Plumb)
Subject: Re: bizarre instructions
Message-ID: <1991Feb23.030101.3164@watdragon.waterloo.edu>
Sender: daemon@watdragon.waterloo.edu (Owner of Many System Processes)
Organization: University of Waterloo
References: <9102210042.AA12291@ucbvax.Berkeley.EDU> <6503@mentor.cc.purdue.edu> <45405@nigel.ee.udel.edu>
Date: Sat, 23 Feb 1991 03:01:01 GMT
Lines: 95

new@ee.udel.edu (Darren New) wrote:
>How about this?  A language in which it is possible to write functions
>in assembler and have them inlined automatically, and to have the
>compiler smart enough to do dataflow analysis on the resultant code and
>such.  Possibly some syntax close to what PCC uses could be used inside
>such functions.  I would imagine that GCC is close to capable of doing
>this if it isn't already. The minor problem of efficient multi-variable
>returns might need to be worked out.

Here's a (sort of) working example using gcc 1.39 on a uVAX.  "g" is
the "general" operand class that covers all the VAX addressing modes.

#include <stdio.h>

int main()
{
	int i, j, k, l;
	i = 12;
	for (k = 0; k < 10; k++) {
		asm("foo %1,%0": "=g" (j) : "g" (i));
		asm("bar %1,%0": "=g" (l) : "g" (i));
	}
	printf("%d\n",j);
	return 0;
}

Note that foo is loop-invariant (its input i never changes inside the
loop), while bar is dead (its output l is never used).
gcc -O -fstrength-reduce produces:

#NO_APP
gcc_compiled.:
.text
LC0:
	.ascii "%d\12\0"
	.align 1
.globl _main
_main:
	.word 0x0
#APP
	foo $12,r1
#NO_APP
	movl $9,r0
L4:
	decl r0
	jgeq L4
	pushl r1
	pushab LC0
	calls $2,_printf
	clrl r0
	ret
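
As a rough source-level picture of what the optimiser did to the asm
statements above, here is a plain C sketch.  foo_op and bar_op are
hypothetical stand-ins for the made-up "foo" and "bar" instructions (their
bodies are arbitrary), and as_written / as_optimised are names invented for
illustration.  It models the transformation only; it is not gcc's output.

```c
/* Hypothetical stand-ins for the made-up "foo" and "bar" instructions.
   Any pure function of the input would do; these bodies are arbitrary. */
static int foo_op(int x) { return x * 2 + 1; }
static int bar_op(int x) { return x - 7; }

/* The loop as written: foo and bar are both evaluated on every pass. */
int as_written(void)
{
	int i = 12, j = 0, k, l = 0;

	for (k = 0; k < 10; k++) {
		j = foo_op(i);	/* same input every pass: loop-invariant */
		l = bar_op(i);	/* result never used afterwards: dead */
	}
	(void)l;		/* only silences "set but not used" warnings */
	return j;
}

/* Roughly the shape of the generated code: foo hoisted out of the loop,
   bar deleted, and the now-empty count-down loop left behind. */
int as_optimised(void)
{
	int j = foo_op(12);	/* hoisted; 12 is the constant value of i */
	int r0 = 9;

	do {
		/* empty body: the loop gcc did not delete */
	} while (--r0 >= 0);
	return j;
}
```

Both functions return the same value, which is the property that makes the
hoisting and the deletion legal in the first place.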

I'm slightly annoyed by gcc's failure to delete the empty loop.
Testing reveals it fails to delete even trivially empty loops.  At
least it's not a common case.  Interestingly, if we make bar depend on
the loop index k, it's still dead code, and still deleted, but it
inhibits the conversion of the loop to count-down form.  I guess it's
the order in which the optimisations are applied: the dead code is
deleted only after the loop has already been generated in count-up form:

#include <stdio.h>

int main()
{
	int i, j, k, l;
	i = 12;
	for (k = 0; k < 10; k++) {
		asm("foo %1,%0": "=g" (j) : "g" (i));
		asm("bar %1,%0": "=g" (l) : "g" (k));
	}
	printf("%d\n",j);
	return 0;
}

#NO_APP
gcc_compiled.:
.text
LC0:
	.ascii "%d\12\0"
	.align 1
.globl _main
_main:
	.word 0x0
	clrl r0
#APP
	foo $12,r1
#NO_APP
L4:
	incl r0
	cmpl r0,$9
	jleq L4
	pushl r1
	pushab LC0
	calls $2,_printf
	clrl r0
	ret
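
To check that the two loop shapes really do the same number of passes, here
is a small C model of each.  The function names are invented for
illustration and the "passes" counter stands in for the (empty) loop body;
this is a sketch of the control flow in the listings, not anything gcc
produces.

```c
/* Models the count-down form:  movl $9,r0 / L4: decl r0 / jgeq L4 */
int count_down_passes(void)
{
	int r0 = 9, passes = 0;

	do {
		passes++;		/* where the loop body would sit */
	} while (--r0 >= 0);		/* decl r0; jgeq L4 */
	return passes;
}

/* Models the count-up form:  clrl r0 / L4: incl r0 / cmpl r0,$9 / jleq L4 */
int count_up_passes(void)
{
	int r0 = 0, passes = 0;

	do {
		passes++;		/* where the loop body would sit */
		r0++;			/* incl r0 */
	} while (r0 <= 9);		/* cmpl r0,$9; jleq L4 */
	return passes;
}
```

Each model makes ten passes, matching the ten iterations of the source
loop.  The count-down form gets its test for free from the condition codes
set by decl, while the count-up form needs the extra cmpl.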

Oh, well, 2.0 is coming, and there has to be something to fix...
-- 
	-Colin
