[HN Gopher] Intel iAPX 432
___________________________________________________________________
Intel iAPX 432
Author : sebastianconcpt
Score : 39 points
Date : 2022-08-10 15:58 UTC (1 days ago)
(HTM) web link (en.wikipedia.org)
(TXT) w3m dump (en.wikipedia.org)
| nullc wrote:
| iAPX 432's security features would be welcome in the computing
| world we have today, I wonder to what extent its failures doomed
| similar functionality in Intel?
|
| At least there is CHERI now but we still hardly seem close to
| having hardware enforced capabilities-grade security in high
| performance server kit.
| kps wrote:
| The i960 MX (nee BiiN) had a similar tagged-memory capability
| system along with a fairly pleasant RISC instruction set.
| twoodfin wrote:
| I keep hoping @bcantrill & the Oxide crew will do a Twitter space
| on the i432, or perhaps failed architectures generally.
| bcantrill wrote:
| We would love to! Maybe we could convince Robert Colwell to
| join us, as his paper on the 432 is one of my favorite systems
| papers of all time![0]
|
| [0] http://dtrace.org/blogs/bmc/2008/07/18/revisiting-the-
| intel-...
| jaykru wrote:
| Rob was happy to chat with me about his 432 paper over
| LinkedIn (cold DM'd him) for a semester project I did on it a
| few months back. He might go for a podcast episode :) I'd
| love to listen to it!
| chasil wrote:
| It is amazing how many failures Intel has survived, and that
| their core competence really emerged from the Datapoint 2200.
| iforgotpassword wrote:
| If you're into some light edutainment-style videos, I enjoyed
| watching RetroBytes recently: https://youtu.be/4o4MXV-d-jQ
| mattst88 wrote:
| I remember reading an article about the iAPX 432 that went into
| extensive detail about the compounding effects of the design--I
| recall it describing how an operation with a small constant
| operand would be slow because the ISA didn't support immediates,
| and as a result you'd have to load it from memory, and there was
| not even a cache to help with that.
|
| Does anyone know this article? I've searched and haven't been
| able to find it, and it was definitely worth a read.
| twoodfin wrote:
| I think you want "Performance Effects of Architectural
| Complexity in the Intel 432"
|
| https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14...
| Lammy wrote:
| Could it be
| https://homes.cs.washington.edu/~levy/capabook/Chapter9.pdf ?
|
| Sorry for huge quote, but it's from a huge article:
|
| =======================================================
|
| From section 9.2, _Segments and Objects_ :
|
| > All objects are addressed through capabilities which, on the
| Intel 432, are called access descriptors (ADs). (The vendor's
| terminology is used in this chapter for compatibility with
| Intel literature. The notation "AD" is used throughout for
| "capability.")
|
| > At the lowest level, objects are composed of memory segments,
| and a memory segment is the most fundamental object (called a
| generic object on the Intel 432). Each Intel 432 segment has
| two parts: a data part for scalars and an access part for ADs,
| as shown in Figure 9-2. Objects requiring both data and access
| descriptors can be stored in a single segment. Segments are
| addressed through ADs, as the figure illustrates. The data part
| grows upward (in the positive direction) from the boundary
| between the two parts, while the access part grows downward (in
| the negative direction) from the dividing line. The hardware
| ensures that only data operations are performed on the data
| part and that AD operations are performed on the access part.
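|
| A minimal sketch of the two-part segment described above, for
| illustration only. The class and method names are hypothetical,
| and modelling the two parts with signed offsets (data at offsets
| >= 0, ADs at offsets < 0) is an assumption made here, not
| something the chapter or Intel's documentation specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessDescriptor:
    """Stand-in for a 432 AD: here it only names the segment it refers to."""
    target: "Segment"


class Segment:
    """Toy segment: data part at offsets >= 0, access part at offsets < 0."""

    def __init__(self):
        self._data = []    # grows upward from the boundary
        self._access = []  # grows downward from the boundary

    def push_data(self, value):
        # Only scalars may be stored in the data part.
        if not isinstance(value, int):
            raise TypeError("data part holds scalars only")
        self._data.append(value)
        return len(self._data) - 1  # offsets 0, 1, 2, ...

    def push_ad(self, ad):
        # Only access descriptors may be stored in the access part.
        if not isinstance(ad, AccessDescriptor):
            raise TypeError("access part holds ADs only")
        self._access.append(ad)
        return -len(self._access)  # offsets -1, -2, -3, ...

    def load_data(self, offset):
        # A data operation may not reach across the boundary into the ADs.
        if not 0 <= offset < len(self._data):
            raise IndexError("data operation outside the data part")
        return self._data[offset]

    def load_ad(self, offset):
        # An AD operation may not reach across the boundary into the data.
        if not -len(self._access) <= offset <= -1:
            raise IndexError("AD operation outside the access part")
        return self._access[-offset - 1]
```

| In this toy model the checks in load_data and load_ad play the
| role the quote assigns to the hardware: a data operation cannot
| read an AD, and an AD operation cannot read a scalar.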
|
| =======================================================
|
| From section 9.4.3, _Instruction Operand Addressing_ :
|
| > At any moment during a procedure's execution, ADs specified
| by instructions must be located in one of four environment
| objects. Environment object 0 is the context object itself.
| Instructions can specify any of the ADs within the context
| object's access part; for example, to refer to the domain or the
| constants data segment. The three remaining environments,
| environments 1 through 3, are defined dynamically by the
| procedure.
|
| > Instruction objects contain only a data part. Because Intel
| 432 instructions are bit-addressable and can start on arbitrary
| bit boundaries, instructions are addressed as bit offsets into
| instruction objects. The first instruction in each instruction
| object begins at bit displacement 64, following the header of
| four 16-bit predefined fields. The maximum size of an
| instruction segment is 64K bits, or 8K bytes, due to the bit
| addressing. Although there is generally one instruction object
| for each procedure in the domain, procedures larger than 8K
| bytes require additional instruction objects. The BRANCH
| INTERSEGMENT instruction can be used to transfer control to
| another instruction object within the same domain.
|
| > The four environment segments thus provide efficient
| addressing of ADs. An instruction can specify an immediate 4-
| or 8-bit access selector describing the location of an AD for
| an operand. Or, it can specify the location of a 16-bit
| access selector located in memory or on the stack. The short
| direct format efficiently addresses any of the first four ADs
| in any of the four environments. This includes the ADs for the
| global constants, context message (calling parameters), and
| current domain within the current context. All of the
| processor-defined ADs within the context object's access part
| can be addressed using an 8-bit access selector.
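|
| The bit-addressing figures quoted in this section can be checked
| with a little arithmetic. This is only a sketch: the 16-bit
| offset width is inferred here from the quoted 64K-bit limit, the
| helper name locate is hypothetical, and numbering bits within a
| byte from 0 is an assumption rather than the 432's documented
| bit order.

```python
HEADER_FIELDS = 4   # four predefined header fields (from the quote)
FIELD_BITS = 16     # each header field is 16 bits (from the quote)
OFFSET_BITS = 16    # assumption: inferred from the 64K-bit segment limit

FIRST_INSTRUCTION_BIT = HEADER_FIELDS * FIELD_BITS  # bit displacement 64
MAX_SEGMENT_BITS = 1 << OFFSET_BITS                 # 64K bits
MAX_SEGMENT_BYTES = MAX_SEGMENT_BITS // 8           # 8K bytes


def locate(bit_offset):
    """Return (byte_index, bit_within_byte) for an instruction bit offset."""
    if not FIRST_INSTRUCTION_BIT <= bit_offset < MAX_SEGMENT_BITS:
        raise ValueError("offset is in the header or past the segment limit")
    return divmod(bit_offset, 8)
```

| Because instructions need not start on byte boundaries, an
| offset such as 77 lands partway through a byte, which is why
| the limit is stated in bits first and bytes second.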
|
| =======================================================
|
| Unrelated, but I love how they went for the "As Above, So
| Below" approach for growing the data-vs-access-parts of
| instruction object memory ^
| linksnapzz wrote:
| I think I've read the same article, and also wish I had the
| reference. I do remember that there were few or no registers,
| and reads were from memory almost all the time.
|
| Also, does anyone know of an actual system that shipped with a
| 432? Like, manufacturer and model #?
| kps wrote:
| > the ISA didn't support immediates
|
| I don't know the article, but have a related story. In the '90s
| I worked for a custom compiler shop, and a company you've heard
| of (not Intel) came to us with a system they wanted tools for.
| They had gone all-in on RISC -- operations were all register-
| to-register, and the only memory addressing was register
| indirect (i.e. through an address in a register). We had to
| point out that it would be rather difficult to get an address
| into a register in the first place.
| andrewf wrote:
| Could you do it with shifts and increments? Constant loads
| would look just like multiplies, a glorious RISC apotheosis..
| kps wrote:
| Yes, you could get 0 by subtracting (or xoring) a register
| with itself, then -1 by complementing, then 1 by negating,
| then adding to itself to get any single bit. Then
| synthesize any constant by adding those. The code would be
| impractically slow and large, though.
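|
| A minimal sketch of that recipe, simulated in Python. The 32-bit
| register width, the function name synthesize, and the operation
| count are illustrative assumptions; the thread does not name the
| actual machine or its word size.

```python
MASK = 0xFFFFFFFF  # assumption: 32-bit registers


def synthesize(target, garbage=0xDEADBEEF):
    """Build `target` using only register-to-register operations.

    Two simulated registers start with arbitrary contents.
    Returns (value, number_of_operations).
    """
    target &= MASK
    acc = garbage & MASK          # accumulator, unknown contents
    bit = (garbage >> 3) & MASK   # scratch register, unknown contents
    ops = 0

    acc = acc ^ acc; ops += 1     # xor with itself -> 0
    bit = bit ^ bit; ops += 1     # xor with itself -> 0
    bit = ~bit & MASK; ops += 1   # complement 0 -> -1 (all ones)
    bit = -bit & MASK; ops += 1   # negate -1 -> 1

    remaining = target
    while remaining:
        if remaining & 1:
            acc = (acc + bit) & MASK; ops += 1  # add this single bit
        remaining >>= 1
        if remaining:
            bit = (bit + bit) & MASK; ops += 1  # double to the next bit
    return acc, ops
```

| In this model the constant 10 takes 9 operations, and an all-
| ones 32-bit constant takes 67 (4 to set up, 32 adds, 31
| doublings), compared with a single load-immediate on most ISAs.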
| kabdib wrote:
| I was taking a VLSI design course in 1981, and the professor
| teaching it proudly showed off some 432 chips (embedded in
| plastic) that he'd been given. He waxed lyrical about them, how
| the big boys were doing silicon in Silly Valley. (We, with our
| colored pencils, were learning how to do NAND gates and full
| adders in NMOS, on graph paper).
|
| Later, I read Organick's book on the 432. It was kind of a mess,
| no idea how they expected the thing to perform.
|
| This was also back when Ada was the up-and-coming language, which
| the 432 was going to run really well (if you believed the
| marketing). Ada was pretty intimidating, as it was complicated
| for the time and generics seemed to scare everyone. (Little did
| we know that C++ was going to be a thing in a decade or so, and
| it made Ada seem simple in comparison).
| chasil wrote:
| Ada evolved into the procedural scripting syntax of many SQL
| databases.
|
| "SQL/PSM is derived, seemingly directly, from Oracle's PL/SQL.
| Oracle developed PL/SQL and released it in 1991, basing the
| language on the US Department of Defense's Ada programming
| language... IBM's SQL PL (used in DB2) and Mimer SQL's PSM were
| the first two products officially implementing SQL/PSM. It is
| commonly thought that these two languages, and perhaps also
| MySQL/MariaDB's procedural language, are closest to the SQL/PSM
| standard. However, a PostgreSQL addon implements SQL/PSM
| (alongside its other procedural languages like the PL/SQL-
| derived plpgsql), although it is not part of the core product."
|
| https://en.wikipedia.org/wiki/SQL/PSM
| PAPPPmAc wrote:
| The first of Intel's many expensive lessons about the problems
| with extremely complicated ISAs dependent on even more
| sophisticated compilers making good static decisions for
| performance. Then they did it again with the i860. Then they did
| it again with Itanium.
| sytse wrote:
| speps wrote:
| Why would you do that without giving credit?
| sytse wrote:
| Good idea, I added
| https://twitter.com/sytses/status/1557803849041072128
| bobloblaw724449 wrote:
| It's fine, it's Sid (he's a good guy).
| generalizations wrote:
| Except when he palms off other people's ideas as his own.
| bobloblaw724449 wrote:
| It's only an HN comment and I don't see why it honestly
| matters. At the end of the day, more people will see his
| tweet and learn about these failed architectures than
| some random comment on some random HN post. Significantly
| more people read twitter than HN.
|
| The way you're reacting to this is like it's 2007 and he
| stole the blueprints to the iPhone.
| bri3d wrote:
| iAPX 432 was sort of a different failure from i860 and Itanium,
| no? My understanding is that the issue with iAPX 432 was that
| the architecture provided object-oriented instructions, but
| they turned out to be slow in practice, and the compiler didn't
| know how slow they were, so it abused them in situations where
| it should have used scalar ops instead, and that, in tandem,
| the ABI relied too heavily on pass-by-value. Basically, that
| the iAPX was explained to compiler authors as an object-
| oriented CPU, when it should have been treated as a CPU with
| object-oriented extensions.
|
| Whereas i860 and Itanium were just trying to shoehorn VLIW into
| general-purpose computing, which is generally incredibly
| challenging. VLIW is great for places like DSP, where you have
| a defined real-time stream of data and limited context
| switching. In this case, you can use the spare die space you
| didn't spend on dispatch, prediction, and retirement on more
| MACs or ALUs or vectors, and the compiler can accurately
| predict the latency of a given operation because the source is
| defined. Fundamentally, compiler scheduling is intractable in a
| multiuser or task switching environment, because you have _no
| idea_ what will be in cache ahead of runtime and always end up
| with the i860/Itanium problem, where you stall your entire
| execution pipeline every time you miss cache unexpectedly.
| bombcar wrote:
| Have we (finally) realized the dream? By basically putting the
| "smart" part of the compiler in the chip itself, or do we still
| run relatively simple ISAs?
| PAPPPmAc wrote:
| I argue about this a lot. Some reasonably substantiated
| opinions:
|
| 1. Highly sophisticated large-scale static analysis keeps
| getting beaten by relatively stupid tricks built into
| overgrown instruction decoders, working on relatively narrow
| windows of instructions.
|
| 2. The primary reason for (1) is that performance is now
| almost completely dominated by memory behavior, and making
| good static predictions about the dynamic behavior of fancy
| memory systems in the face of multitasking, DRAM refresh
| cycles, multiple independent devices competing for the memory
| bus, layers of caches, timing variations, etc. is essentially
| impossible.
|
| 3. You can give up on a bunch of your dynamic tricks and
| build much simpler more predictable systems that can be
| statically optimized effectively. You could probably find a
| good local maximum in that style. The dynamic tricks are,
| however, unreasonably effective for performance, and have the
| advantage that they let you have good performance with the
| same binaries on multiple different implementations of an
| ISA. That's not insurmountable (e.g. the AOT compilation for
| ART objects on Android), but the ecosystem isn't fully set up
| to support that kind of thing.
| AnimalMuppet wrote:
| By putting it on the chip, it can be dynamic rather than
| static. The microcode can know a lot more of what's going on
| than the compiler can.
___________________________________________________________________
(page generated 2022-08-11 23:01 UTC)