[HN Gopher] Reverse-engineering the multiplication algorithm in ...
___________________________________________________________________
Reverse-engineering the multiplication algorithm in the Intel 8086
processor
Author : CoBE10
Score : 117 points
Date : 2023-03-15 16:16 UTC (6 hours ago)
(HTM) web link (www.righto.com)
(TXT) w3m dump (www.righto.com)
| kens wrote:
| Author here. I'm without electricity but can discuss the 8086 if
| anyone has questions :-)
| gregschlom wrote:
| Ken, I've read and enjoyed many of your posts and I always
| wonder: how do you find so much time to do that? Is that your
| full time occupation?
|
| Thanks!
| kens wrote:
| I'm retired, so I have time for various vintage computer
| projects.
| detrites wrote:
| Not a question, just a comment that after having read several
| of your posts now, building up knowledge as I go, this one is
| the first I've understood from start to end. Most satisfying!
| peteri wrote:
| Interesting, any reason for not discussing AAD yet? (oh and
| great series BTW)
| kens wrote:
| AAD is lower on the list since it is both complicated and
| kind of obscure. Also, it depends on multiplication and
| division, so I want to cover them first.
| bonzini wrote:
| AAD is essentially a multiplication and a sum:
| 170 Q -> tmpc 171
| AH -> tmpb CALL CORX 172 AL -> tmpb
| ADD tmpc 173 ZERO -> AH JMP 179
| 179 SIGMA -> AL RNI F
|
| It doesn't depend on division despite the name, it performs
| ASCII to binary conversion.
| jzwinck wrote:
| As far as I know, x86 and x86-64 have never had microcode for
| ASCII (base 10 is what I'm interested in) to binary number
| conversion. Why do you think this is? It seems like a very
| common operation these days, and the reasons you gave for
| microcode multiply seem applicable.
| prirun wrote:
| Prime minicomputers had decimal arithmetic in microcode &
| hardware in the early 80's to support PL/I and COBOL
| "picture" data types and packed decimal. They also had
| sophisticated editing to pictures like ZZZ,ZZ9.99 which would
| display 1024 as 1,024.00
|
| These instructions were also implemented in PMA, the Prime
| Macro Assembler, in the OS so that if a machine didn't have
| these instructions, a UII fault occurred and the instruction
| was simulated in software. I implemented a lot of the easier
| instructions in C in the Prime emulator, but for the more
| complex things like decimal math, I let those fault so the
| Prime OS would handle them.
| kens wrote:
| The IBM System/360 (1964) had instructions to convert a
| character string to packed decimal, and to convert packed
| decimal to binary. This was motivated by the large amount of
| punch-card data that was handled back then, with all the
| numbers in decimal. But I guess Intel figured that string
| conversion wasn't common enough in modern systems to merit
| adding it to the instruction set.
| garganzol wrote:
| CISC operations like 'imul' and 'idiv' were perceived like curse
| words in the RISC world of the day, but they were actually a
| beacon of hope. When x86 architecture matured, those operations
| became super fast thanks to the hardware acceleration.
|
| So take it with a grain of salt when you hear someone claim that
| RISC is invariantly superior to CISC. Times and times again, the
| truth lies somewhere in the middle.
| leni536 wrote:
| Well, idiv being super fast is new to me, but I guess it's
| relative.
| cperciva wrote:
| Integer division is sufficiently famously slow that people
| (and compilers) go to great lengths to avoid it... which
| makes it rare enough that there's very little point
| implementing it.
|
| If you really need integer division, you can typically
| optimize it by doing a FP iteration and then fixing up the
| approximate result.
| Tuna-Fish wrote:
| The worst case integer divide times on newest Intel/AMD
| cpus are 15/18 for 64bit and 12/13 for 32-bit. Both do
| early out for easy values at around 10 cycles.
|
| Integer division is still on the slower side operations on
| the machines, but the advice to avoid division at all costs
| is outdated. In comparison, the FP divide instructions are
| in the ~13/14 cycles, it would be hard for you to shuffle
| data to that side and back and not be slower. (And no, they
| are not doing this internally -- both have separate integer
| divide units.)
| ajross wrote:
| For those confused as to why there's all this signedness handling
| for the 2's complement multiplication algorithm we were all
| taught in school is identical for both signed and unsigned
| numbers: x86 multiplication is _widening_. The 8086 would compute
| the 32 bit product of two 16 bit values[1] (one of which must be
| AX) and place the result in two output registers (DX and AX,
| always).
|
| If you interpret the arguments as unsigned, you can treat the
| input high bits as zeroes. But if it's signed then you need to
| conceptually sign-extend the high bit. So they're different
| operations with different results in the high word.
|
| [1] There was an 8 bit multiply too. It didn't have this property
| and looked more like a normal instruction, kens would have to
| tell us if it shared the same microcode.
| bonzini wrote:
| Also, we use sign-magnitude representation while x86 uses twos
| complement.
|
| > It didn't have this property and looked more like a normal
| instruction, kens would have to tell us if it shared the same
| microcode.
|
| If you're thinking of "IMUL reg,reg" it was added only in
| 80186. If instead you're thinking of AAD, then yes it reuses
| the CORX subroutine; it's only unsigned so it doesn't need the
| parts that deal with negation.
| andrehacker wrote:
| Ken, please tell us you have a book deal, this is just awesome
| stuff.
| kens wrote:
| I've been vaguely thinking about a book on the 8086 die. I'm
| sure I could sell at least 5 copies :-)
| amir wrote:
| If that's all it takes then I pre-order five copies.
| JohnFen wrote:
| For the good of humanity -- or at least our industry -- or at
| least my intellectual curiosity -- or at least so I can have
| another cool book on my shelf to increase my geek cred --
| please do this!
| spyremeown wrote:
| Count six. I'd buy it in a beat.
___________________________________________________________________
(page generated 2023-03-15 23:01 UTC)