Path: umaxc.weeg.uiowa.edu!ns-mx!hobbes.physics.uiowa.edu!zaphod.mps.ohio-state.edu!usc!sdd.hp.com!mips!darwin.sura.net!Sirius.dfn.de!chx400!decus!mediatex!BEIZ!Shadow From: Shadow@BEIZ.mediatex.ch Newsgroups: comp.sys.apple2 Subject: Re: FAST DIVISION ALGORITHM Message-ID: <2a2cd896.8c9bcf.6@BEIZ.mediatex.ch> Date: 4 Jun 92 00:19:14 GMT Organization: MEDIAtex AG, CH-8807 Freienbach/SZ Lines: 52 This routine is almost the same as yours , Power, only that is uses the full 16 b it of the 65c816, is therefore a bit fas ter on a IIGS. You could even use one fixed value for D iv1 (8 bit fraction), e.g.: Div1 : Div2 = Result 44.625 : 5 = 8.925 $2CA0 : $0005 = $08EC mx %00 -------------------------- _Divide -------------------------- stx Div1 sty Div2 stz Result ldx #16 lda #0 clc ]Loop rol Result dex bmi :End asl Div1 rol cmp Div2 bcc ]Loop sbc Div2 sec bra ]Loop :End sta Remain rts Result ds 2 Remain ds 2 Div1 ds 2 Div2 ds 2 ---------------------------------------- Andre Horstmann Bright Software 6300 Zug, Switzerland =============== GEnie A.HORSTMANN ---------------------------------------- |B B| |E CH: VTX *3600# E| |I D: BTX *28881# I| |Z Z| Path: umaxc.weeg.uiowa.edu!ns-mx!uunet!ukma!wupost!sdd.hp.com!elroy.jpl.nasa.gov!nntp-server.caltech.edu!toddpw From: toddpw@cco.caltech.edu (Todd P. Whitesel) Newsgroups: comp.sys.apple2 Subject: Re: FAST DIVISION ALGORITHM Message-ID: <1992Jun3.190050.18092@cco.caltech.edu> Date: 3 Jun 92 19:00:50 GMT References: <2a2cd896.8c9bcf.6@BEIZ.mediatex.ch> Sender: news@cco.caltech.edu Organization: California Institute of Technology, Pasadena Lines: 88 Nntp-Posting-Host: bartman Shadow@BEIZ.mediatex.ch writes: > cmp Div2 > bcc ]Loop > sbc Div2 > sec A small comment here: the SEC is totally unnecessary. CMP's set the flags according to the result of the equivalent SBC instruction, so after the SBC the carry flag will still be set. The whole point of the CMP is to ensure that the carry flag will in fact be set after the subtraction; if it isn't then it avoids the subtraction altogether. Your loops could also use some rearrangement. Here's an algorithm I got from an odd book called (some kind of weird joke?) "Universal Assembly Language": stx divisor sty dividend ldx #16 lda #0 lp asl dividend rol cmp divisor bcc nextbit sbc divisor inc dividend nextbit dex bne lp ;"dividend" now contains the result of the division. sta remainder It is rather sneaky, and yes, it is correct. What it does is essentially the same computation, but it shifts the result into the dividend location instead of a seperate result location. However, I like the carry flag technique used by Shadow's code, so let's add it in: stx divisor sty dividend ldx #16 lda #0 lp rol dividend rol cmp divisor bcc nextbit sbc divisor nextbit dex bne lp rol dividend ;"dividend" now contains the result of the division. sta remainder Note that now when a 1 bit is entered into the result, we save 7 cycles. We do perform an extra ROL at the end however. The relative performance is determined by what kind of answers we expect: result of 0, new code is 7 cycles slower; result is power of two, new code is the same speed; otherwise the new code is faster (and this is most of the time so we judge this a win). Note that the first ROL to dividend shifts in a garbage bit, which is shifted out by the final one. That is what we are spending in order to make part of the main loop faster, and in this case we expect to save time overall so it's a good thing. What's interesting is that the effect is that of PRESERVING the carry flag of the caller. This is sometimes a big concern for assembly routines because the carry flag is a great place to pass booleans. Lastly, here's a slightly rearranged version for clarity: stx divisor sty dividend ldx #16 lda #0 rol dividend lp rol cmp divisor bcc nextbit sbc divisor nextbit rol dividend dex bne lp ;"dividend" now contains the result of the division. sta remainder We are now communicating the dividend bit instead of the result bit in the carry flag when we branch back to lp. Note that we could also change the initial ROL to an ASL, making the garbage bit a 0 and thereby setting the carry flag to 0 at the end of the routine. Todd Whitesel toddpw @ tybalt.caltech.edu Newsgroups: comp.sys.apple2.programmer Path: news.weeg.uiowa.edu!news.uiowa.edu!hobbes.physics.uiowa.edu!newsrelay.iastate.edu!vixen.cso.uiuc.edu!howland.reston.ans.net!cs.utexas.edu!uunet!nevada.edu!jimi!news.cs.unlv.edu.!sknkwrks From: sknkwrks@matt.cs.unlv.edu (Scott Alfter) Subject: Re: 16-bit divide in 6502 In-Reply-To: shack@pro-ict.cts.com's message of Thu, 27 Jan 94 14:25:58 CDT Message-ID: Sender: news@unlv.edu (News User) Organization: Skunk Works Software Co. References: Date: 29 Jan 1994 03:05:55 GMT Lines: 70 In article shack@pro-ict.cts.com (Randy Shackelford) writes: >jmk3@crux3.cit.cornell.edu (Jay Krell) writes: >>Anyone got a 16-bit divide/remainder routine in 6502? >>Specifically, I'd need to divide by something between >>91 and 94, the most optimizable being 92, 4 * 23. > >Think there's one in Eyes' and Lichty's Programming the 65816 I'll give it a >look - yep pp. 273-4. Should I post it? Actually, I think he's looking for 8-bit (6502) code to divide 16-bit numbers. Here's something I cobbled together a few years ago that does the job. With a few modifications, you could also do 32-bit division. The routine assumes that numbers are unsigned. The four parameters would probably be best placed in page 0 (you'll get smaller, faster code), but you can use any address you want. The last 5 lines create space for the parameters at the end of the program and can be deleted if you set aside space for the numbers somewhere else. The pseudo-ops used here are those used by the MicroSPARC/MindCraft Assembler; those of you who use Merlin, ORCA/M, or something else will want to change them to whatever your assembler uses. (Only one pseudo-op is used: DFS xx, which sets aside xx bytes of memory for data storage.) *16-Bit Integer Division Routine *--------------------------------------------- *Parameter list: * DIVIDEND (in, gets scrambled), * DIVISOR (in), * QUOTIENT (out), * REMAINDER (out): Self-explanatory *Length: $4F (79) bytes DIVIDE LDA #0 ;Zero the remainder STA REMAINDER STA REMAINDER+1 LDY #16 ;We're doing 16-bit division here ]LOOP ASL DIVIDEND ;Push a value off the top of the dividend into ROL DIVIDEND+1 ;the remainder ROL REMAINDER ROL REMAINDER+1 SEC ;Subtract the divisor from the remainder LDA REMAINDER SBC DIVISOR STA ]TEMP LDA REMAINDER+1 SBC DIVISOR+1 STA ]TEMP+1 BMI ]ZERO ;Is the result negative? ]ONE LDA ]TEMP ;No, so the result of the subtraction becomes STA REMAINDER ;the remainder and 1 gets pushed into the LDA ]TEMP+1 ;quotient STA REMAINDER+1 SEC BCS ]PUSH ]ZERO CLC ;Yes, so just push 0 into the quotient ]PUSH ROL QUOTIENT ;Push it in ROL QUOTIENT+1 DEY ;Done yet? BNE [LOOP ;Y-reg is zero when finished RTS ]TEMP DFS 2 ;Subtraction initially goes here DIVIDEND DFS 2 ;Next four are self-explanatory DIVISOR DFS 2 QUOTIENT DFS 2 REMAINDER DFS 2 _/_ Scott Alfter (sknkwrks@cs.unlv.edu) Ask me about SoftDAC--digital / v \ Skunk Works BBS offline for maintenance! audio for your Apple IIe/IIc! (IIGS( (702) 894-9619 300-14400 V.32bis 1:209/263 Apple II, IBM, Trek, & more! \_^_/ ---===#### Why be politically correct when you can be RIGHT? ####===---