Path: umaxc.weeg.uiowa.edu!ns-mx!hobbes.physics.uiowa.edu!zaphod.mps.ohio-state.edu!usc!sdd.hp.com!mips!darwin.sura.net!Sirius.dfn.de!chx400!decus!mediatex!BEIZ!Shadow
From: Shadow@BEIZ.mediatex.ch
Newsgroups: comp.sys.apple2
Subject: Re: FAST DIVISION ALGORITHM
Message-ID: <2a2cd896.8c9bcf.6@BEIZ.mediatex.ch>
Date: 4 Jun 92 00:19:14 GMT
Organization: MEDIAtex AG, CH-8807 Freienbach/SZ
Lines: 52

This routine is almost the same as yours
, Power, only that is uses the full 16 b
it of the 65c816, is therefore a bit fas
ter on a IIGS.

You could even use one fixed value for D
iv1 (8 bit fraction), e.g.:

Div1    :  Div2   =   Result
44.625  :  5      =   8.925
$2CA0   :  $0005  =   $08EC

       mx  %00
--------------------------
_Divide                                
--------------------------
       stx Div1
       sty Div2
       stz Result
       ldx #16
       lda #0
       clc
]Loop  rol Result
       dex
       bmi :End
       asl Div1
       rol
       cmp Div2
       bcc ]Loop
       sbc Div2
       sec                             
       bra ]Loop
:End   sta Remain
       rts

Result ds  2
Remain ds  2
Div1   ds  2
Div2   ds  2


----------------------------------------
Andre Horstmann          Bright Software
6300 Zug, Switzerland    ===============
GEnie A.HORSTMANN
----------------------------------------


|B                                    B|
|E           CH: VTX *3600#           E|
|I           D:  BTX *28881#          I|
|Z                                    Z|
Path: umaxc.weeg.uiowa.edu!ns-mx!uunet!ukma!wupost!sdd.hp.com!elroy.jpl.nasa.gov!nntp-server.caltech.edu!toddpw
From: toddpw@cco.caltech.edu (Todd P. Whitesel)
Newsgroups: comp.sys.apple2
Subject: Re: FAST DIVISION ALGORITHM
Message-ID: <1992Jun3.190050.18092@cco.caltech.edu>
Date: 3 Jun 92 19:00:50 GMT
References: <2a2cd896.8c9bcf.6@BEIZ.mediatex.ch>
Sender: news@cco.caltech.edu
Organization: California Institute of Technology, Pasadena
Lines: 88
Nntp-Posting-Host: bartman

Shadow@BEIZ.mediatex.ch writes:

>       cmp Div2
>       bcc ]Loop
>       sbc Div2
>       sec                             

A small comment here: the SEC is totally unnecessary. CMP's set the flags
according to the result of the equivalent SBC instruction, so after the
SBC the carry flag will still be set. The whole point of the CMP is to ensure
that the carry flag will in fact be set after the subtraction; if it isn't
then it avoids the subtraction altogether.

Your loops could also use some rearrangement. Here's an algorithm I got from
an odd book called (some kind of weird joke?) "Universal Assembly Language":

	stx	divisor
	sty	dividend
	ldx	#16
	lda	#0
lp	asl	dividend
	rol
	cmp	divisor
	bcc	nextbit
	sbc	divisor
	inc	dividend
nextbit	dex
	bne	lp
;"dividend" now contains the result of the division.
	sta	remainder

It is rather sneaky, and yes, it is correct. What it does is essentially the
same computation, but it shifts the result into the dividend location instead
of a seperate result location. However, I like the carry flag technique used
by Shadow's code, so let's add it in:

	stx	divisor
	sty	dividend
	ldx	#16
	lda	#0
lp	rol	dividend
	rol
	cmp	divisor
	bcc	nextbit
	sbc	divisor
nextbit	dex
	bne	lp
	rol	dividend
;"dividend" now contains the result of the division.
	sta	remainder

Note that now when a 1 bit is entered into the result, we save 7 cycles. We
do perform an extra ROL at the end however. The relative performance is
determined by what kind of answers we expect: result of 0, new code is 7
cycles slower; result is power of two, new code is the same speed; otherwise
the new code is faster (and this is most of the time so we judge this a win).

Note that the first ROL to dividend shifts in a garbage bit, which is shifted
out by the final one. That is what we are spending in order to make part of
the main loop faster, and in this case we expect to save time overall so it's
a good thing. What's interesting is that the effect is that of PRESERVING the
carry flag of the caller. This is sometimes a big concern for assembly routines
because the carry flag is a great place to pass booleans.

Lastly, here's a slightly rearranged version for clarity:

	stx	divisor
	sty	dividend
	ldx	#16
	lda	#0
	rol	dividend
lp	rol
	cmp	divisor
	bcc	nextbit
	sbc	divisor
nextbit	rol	dividend
	dex
	bne	lp
;"dividend" now contains the result of the division.
	sta	remainder

We are now communicating the dividend bit instead of the result bit in the
carry flag when we branch back to lp. Note that we could also change the
initial ROL to an ASL, making the garbage bit a 0 and thereby setting the
carry flag to 0 at the end of the routine.

Todd Whitesel
toddpw @ tybalt.caltech.edu
Newsgroups: comp.sys.apple2.programmer
Path: news.weeg.uiowa.edu!news.uiowa.edu!hobbes.physics.uiowa.edu!newsrelay.iastate.edu!vixen.cso.uiuc.edu!howland.reston.ans.net!cs.utexas.edu!uunet!nevada.edu!jimi!news.cs.unlv.edu.!sknkwrks
From: sknkwrks@matt.cs.unlv.edu (Scott Alfter)
Subject: Re: 16-bit divide in 6502
In-Reply-To: shack@pro-ict.cts.com's message of Thu, 27 Jan 94 14:25:58 CDT
Message-ID: <SKNKWRKS.94Jan28200555@matt.cs.unlv.edu>
Sender: news@unlv.edu (News User)
Organization: Skunk Works Software Co.
References: <ar56046@pro-ict.cts.com>
Date: 29 Jan 1994 03:05:55 GMT
Lines: 70

In article <ar56046@pro-ict.cts.com> shack@pro-ict.cts.com (Randy Shackelford) writes:
>jmk3@crux3.cit.cornell.edu (Jay Krell) writes:
>>Anyone got a 16-bit divide/remainder routine in 6502?
>>Specifically, I'd need to divide by something between
>>91 and 94, the most optimizable being 92, 4 * 23.
>
>Think there's one in Eyes' and Lichty's Programming the 65816 I'll give it a
>look - yep pp. 273-4. Should I post it?

Actually, I think he's looking for 8-bit (6502) code to divide 16-bit
numbers.  Here's something I cobbled together a few years ago that
does the job.  With a few modifications, you could also do 32-bit
division.  The routine assumes that numbers are unsigned.  The four
parameters would probably be best placed in page 0 (you'll get
smaller, faster code), but you can use any address you want.  The last
5 lines create space for the parameters at the end of the program and
can be deleted if you set aside space for the numbers somewhere else.

The pseudo-ops used here are those used by the MicroSPARC/MindCraft
Assembler; those of you who use Merlin, ORCA/M, or something else will
want to change them to whatever your assembler uses.  (Only one
pseudo-op is used: DFS xx, which sets aside xx bytes of memory for
data storage.)

*16-Bit Integer Division Routine
*---------------------------------------------
*Parameter list:
*  DIVIDEND (in, gets scrambled),
*  DIVISOR (in),
*  QUOTIENT (out),
*  REMAINDER (out): Self-explanatory
*Length: $4F (79) bytes
DIVIDE	LDA	#0	;Zero the remainder
	STA	REMAINDER
	STA	REMAINDER+1
	LDY	#16	;We're doing 16-bit division here
]LOOP	ASL	DIVIDEND ;Push a value off the top of the dividend into
	ROL	DIVIDEND+1 ;the remainder
	ROL	REMAINDER
	ROL	REMAINDER+1
	SEC		;Subtract the divisor from the remainder
	LDA	REMAINDER
	SBC	DIVISOR
	STA	]TEMP
	LDA	REMAINDER+1
	SBC	DIVISOR+1
	STA	]TEMP+1
	BMI	]ZERO	;Is the result negative?
]ONE	LDA	]TEMP	;No, so the result of the subtraction becomes
	STA	REMAINDER ;the remainder and 1 gets pushed into the
	LDA	]TEMP+1	;quotient
	STA	REMAINDER+1
	SEC
	BCS	]PUSH
]ZERO	CLC		;Yes, so just push 0 into the quotient
]PUSH	ROL	QUOTIENT ;Push it in
	ROL	QUOTIENT+1
	DEY		;Done yet?
	BNE	[LOOP	;Y-reg is zero when finished
	RTS
]TEMP	DFS	2	;Subtraction initially goes here
DIVIDEND DFS	2	;Next four are self-explanatory
DIVISOR	DFS	2
QUOTIENT DFS	2
REMAINDER DFS	2

  _/_   Scott Alfter (sknkwrks@cs.unlv.edu)       Ask me about SoftDAC--digital
 / v \  Skunk Works BBS offline for maintenance!  audio for your Apple IIe/IIc!
(IIGS(  (702) 894-9619 300-14400 V.32bis 1:209/263 Apple II, IBM, Trek, & more!
 \_^_/  ---===#### Why be politically correct when you can be RIGHT? ####===---