[HN Gopher] Porting OpenVMS to the Itanium Processor Family (200...
       ___________________________________________________________________
        
       Porting OpenVMS to the Itanium Processor Family (2003) [pdf]
        
       Author : naves
       Score  : 29 points
       Date   : 2024-09-29 16:35 UTC (6 hours ago)
        
 (HTM) web link (de.openvms.org)
 (TXT) w3m dump (de.openvms.org)
        
       | twoodfin wrote:
       | The Apache #'s pretty much give the game away: An Itanium clocked
       | 50% higher was losing to a 2yo Alpha by about 20% on throughput
       | at peak.
       | 
       | VLIW made sense when Intel wanted to win the FP-heavy workstation
       | market. But while it was in development, integer-heavy web
       | workloads became dominant and that was basically the ballgame.
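
The arithmetic behind the comment above can be sketched as follows. This is only an illustration: it assumes "clocked 50% higher" means 1.5x the Alpha's clock and "losing by about 20%" means 0.8x the Alpha's peak throughput. The function name is hypothetical and the exact benchmark figures are not reproduced here.

```python
# Per-clock throughput implied by the figures quoted in the comment.
# Assumptions (not from the paper itself): clock ratio 1.5, throughput ratio 0.8.

def per_clock_ratio(clock_ratio: float, throughput_ratio: float) -> float:
    """Throughput per clock cycle, relative to the baseline CPU."""
    return throughput_ratio / clock_ratio


itanium_vs_alpha = per_clock_ratio(clock_ratio=1.5, throughput_ratio=0.8)
print(f"{itanium_vs_alpha:.2f}")  # 0.53
```

Under those assumptions the Itanium delivered roughly half the Apache throughput per clock cycle of the older Alpha.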
        
         | johndoe0815 wrote:
         | The world would be much nicer if we still had new Alpha CPUs.
         | It was intended to be a CPU architecture that lasts 25 years
         | and Digital intended the architecture to support a 1000x
         | increase in performance during that time.
         | 
         | Now we have RISC-V reinventing the wheel. Not the worst
         | outcome, but we could have had it so much better...
        
           | ahoka wrote:
           | It couldn't even handle unaligned access in its original
           | form. Surely an architecture to last for 25 years.
        
             | fredoralive wrote:
             | Not handling unaligned access gracefully is a classic RISC
             | "feature", as part of the general simplification of a
             | processor to its basics. I'm not sure if it's really an
             | Alpha specific thing. Plus they added some instructions to
             | ease the pain in 1996.
             | 
             | The main issue people tend to bring up with Alpha is the
             | very loose memory model, of the "things happen, but
             | different processors may not really agree on the order they
             | happened in" type of thing (plus, isn't it rude to want to
             | know what other cores have in their cache?). Which would be
             | a pain in our modern multicore world.
             | 
             | Of course we don't know how things would've evolved over
             | time. ARM (at least on big cores[1]) shifted towards the
             | forgiving model for unaligned access, so it's possible that
             | Alpha would've similarly moved to a more forgiving
             | environment for low level programmers.
             | 
             | [1] On embedded stuff, you're going to the Hardfault
             | handler.
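
What "not handling unaligned access" costs software can be sketched with a toy memory model. This is a minimal illustration, not Alpha code: `load_aligned` stands in for a load that refuses misaligned addresses, and `load_unaligned` builds the value from the two aligned words that straddle it, similar in spirit to a load-two-and-extract sequence. Both function names and the 8-byte little-endian word are assumptions made for the example.

```python
WORD = 8  # bytes per aligned word in this toy model (assumption)


def load_aligned(mem: bytes, addr: int) -> int:
    """Load one little-endian word; misaligned addresses are rejected."""
    if addr % WORD:
        raise ValueError(f"unaligned access at {addr:#x}")
    return int.from_bytes(mem[addr:addr + WORD], "little")


def load_unaligned(mem: bytes, addr: int) -> int:
    """Assemble a word at any address from aligned loads only."""
    base = addr & ~(WORD - 1)      # aligned word containing the first byte
    shift = (addr - base) * 8      # bit offset of the value inside that word
    low = load_aligned(mem, base)
    if shift == 0:
        return low                 # already aligned, one load is enough
    high = load_aligned(mem, base + WORD)
    mask = (1 << (WORD * 8)) - 1
    # Take the upper bytes of the low word and the lower bytes of the high word.
    return ((low >> shift) | (high << (WORD * 8 - shift))) & mask
```

The point of the sketch is the cost: one unaligned load turns into two aligned loads plus shifting and masking, which is what a compiler or trap handler has to emit when the hardware will not do it.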
        
               | formerly_proven wrote:
               | Alpha had a super loosey-goosey memory model because iirc
               | the cache size they wanted couldn't be built with the
               | performance they needed on the process they had, so they
               | made it from two wholly independent cache banks, both
               | serving the same core through a shared queue.
        
               | dfox wrote:
               | BWX does not help with unaligned accesses, but it fixes the
               | fact that the original Alpha did not even have instructions
               | for memory accesses smaller than a word. That becomes an
               | issue when you start building systems on PC-like hardware
               | (another related "feature" is that EV5 does not have an
               | equivalent to MTRRs, but dealing with the weirdness of VGA
               | framebuffer accesses is part of the architecture
               | specification by means of a hardcoded uncached memory
               | region).
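
Without byte-sized store instructions, writing a single byte becomes a read-modify-write of the word that contains it. The sketch below illustrates that idea on a `bytearray`. The function name and the 8-byte little-endian container word are assumptions for the example, not a description of any specific Alpha instruction sequence.

```python
WORD_BYTES = 8  # size of the smallest store in this toy model (assumption)


def store_byte_rmw(mem: bytearray, addr: int, value: int) -> None:
    """Emulate a byte store using only whole-word loads and stores."""
    base = addr & ~(WORD_BYTES - 1)   # aligned word containing the byte
    shift = (addr - base) * 8         # bit position of the byte in that word
    word = int.from_bytes(mem[base:base + WORD_BYTES], "little")  # load word
    word = (word & ~(0xFF << shift)) | ((value & 0xFF) << shift)  # replace byte
    mem[base:base + WORD_BYTES] = word.to_bytes(WORD_BYTES, "little")  # store word


ram = bytearray(range(16))
store_byte_rmw(ram, 10, 0xAB)
print(ram[8:16].hex())  # 0809ab0b0c0d0e0f
```

Note that the neighbouring seven bytes are rewritten too. On plain RAM that is harmless, but it is a plausible reason the approach is awkward for device registers or for bytes shared with another writer.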
        
               | fredoralive wrote:
               | TBH, I'm not an expert on Alpha, and wow, as an embedded
               | programmer by trade, that's a really wacky way of handling
               | memory access. I guess it made more sense in the
               | minicomputer world where you control the whole stack, but
               | as a more general purpose architecture it's, well, not the
               | greatest, is it?
        
               | dfox wrote:
               | There are a lot of good things to be said about Alpha and
               | it is probably the most RISC of all 90's RISC ISAs, but
               | the actual hardware is full of weirdness and all the real
               | CPUs were deeply pipelined OoO designs (think Intel
               | NetBurst) that prioritized high clock rates and huge
               | straight-line throughput above all else (which is also
               | why they ran really hot and could not really be scaled
               | down for embedded use). Taking good ideas from that while
               | discarding the "speed-demon" design is part of why AMD
               | became relevant again in the 00's, with amd64 cementing
               | that position (but well, the AMD K7 is very much an
               | Alpha-related design, to the extent that the chipsets are
               | interchangeable between K7 and EV6. The interesting part
               | of that is that these CPUs do not have an FSB in the "bus"
               | sense, but there is a point-to-point link between CPU and
               | chipset).
        
               | fredoralive wrote:
               | AMD using an Alpha bus for early Athlons feels like a
               | weird lost opportunity. Cheap x86-aimed motherboards that
               | can also run Alpha chips with Windows 2000 + FX!32 for
               | compatibility might've had a chance to shine, albeit a
               | slight chance. Sadly by then Compaq had already boarded
               | the Itanic...
        
           | formerly_proven wrote:
           | DEC designed StrongARM pretty much immediately after Alpha
           | shipped because Alpha chips ran hot as frick and DEC
           | engineers didn't see a path to low-power Alpha.
        
             | twoodfin wrote:
             | Do you know a good paper on the development of StrongARM?
        
           | aardvark179 wrote:
           | Much as some aspects of the Alpha were great, its weak memory
           | model would have resulted in even more concurrency issues
           | than we have now, and way more explicit fences.
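
Why a weak memory model forces explicit fences can be illustrated with the classic message-passing test: a writer stores `data` and then `flag`, a reader loads `flag` and then `data`. The sketch below is a deliberately simplified toy model, not the Alpha specification: it assumes that, without a fence, the two operations in a thread may become visible in either order, and it enumerates every interleaving. All names are hypothetical.

```python
from itertools import combinations


def outcomes(writer_fence: bool, reader_fence: bool) -> set:
    """Return every reachable (flag_seen, data_seen) pair in the toy model."""
    writer = [("store", "data"), ("store", "flag")]
    reader = [("load", "flag"), ("load", "data")]

    def orders(ops, fenced):
        # A fence pins program order; without it either order may be observed.
        return [ops] if fenced else [ops, ops[::-1]]

    results = set()
    for w in orders(writer, writer_fence):
        for r in orders(reader, reader_fence):
            # Pick which 2 of the 4 time slots belong to the writer.
            for writer_slots in combinations(range(4), 2):
                mem = {"data": 0, "flag": 0}
                seen = {}
                w_ops, r_ops = iter(w), iter(r)
                for slot in range(4):
                    kind, var = next(w_ops) if slot in writer_slots else next(r_ops)
                    if kind == "store":
                        mem[var] = 1
                    else:
                        seen[var] = mem[var]
                results.add((seen["flag"], seen["data"]))
    return results


# (1, 0) means "flag was set but the data was still stale".
print((1, 0) in outcomes(writer_fence=True, reader_fence=True))    # False
print((1, 0) in outcomes(writer_fence=False, reader_fence=False))  # True
```

In this model the stale-data outcome disappears only when both sides are fenced. Dropping either fence brings it back, which is the "way more explicit fences" burden the comment describes.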
        
         | formerly_proven wrote:
         | Itanium was primarily developed by Intel, Itanium 2 primarily
         | by the HP team that also was responsible for the competitive
         | PA-RISC chips. (Or so they say). In any case, Itanium 2 still
         | outperformed much later AMD Opterons and Intel Xeons running at
         | twice the clock in numerical workloads. That's pretty
         | impressive.
        
           | twoodfin wrote:
           | That's my point: If the demand for high-end compute at the
           | turn of the millennium had looked the same as the demand for
           | high-end compute in 1992, Itanium probably would have
           | conquered the world.
           | 
           | But Tim Berners-Lee had a NeXT and some good ideas...
        
         | jcranmer wrote:
         | The Itanium architecture is a weird weird architecture. It's
         | not weird in the sense of Alpha's weirdness (e.g., the super
         | weak memory model), which can be fairly easily compensated for,
         | but it's weird in several ways that make me just stare at the
         | manual and go "how are you supposed to write a compiler for
         | this?" It's something that requires a sufficiently smart
         | compiler to get good performance, while at the same time so
         | designed as to make writing that sufficiently smart compiler
         | effectively impossible.
         | 
         | It wouldn't surprise me if Itanium actually had pretty
         | compelling SPECint numbers. But a lot of those compelling
         | numbers would have come from massive overtuning of the compiler
         | to the benchmark specifically. Something that's going to be
         | especially painful for I/O-heavy workloads is that the
         | gargantuan register files make any context switch painfully
         | slow.
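
The size of the register state behind that remark can be estimated with simple arithmetic. The sketch below counts only the main architected register files (Alpha: 32 integer and 32 floating-point 64-bit registers; Itanium: 128 general, 128 floating-point, 8 branch and 64 one-bit predicate registers). It assumes each 82-bit Itanium floating-point register occupies 16 bytes when saved and ignores NaT bits and application registers, so treat the result as a rough upper bound rather than a measured context-switch cost. The names are hypothetical.

```python
# Each entry is (register count, bytes per saved register).
ALPHA = {
    "integer": (32, 8),
    "floating point": (32, 8),
}
ITANIUM = {
    "general": (128, 8),
    "floating point": (128, 16),  # assumption: 82-bit register saved as 16 bytes
    "branch": (8, 8),
    "predicate": (1, 8),          # 64 one-bit predicates packed into one word
}


def state_bytes(register_files: dict) -> int:
    """Total bytes needed to save every register in the given files."""
    return sum(count * size for count, size in register_files.values())


print(state_bytes(ALPHA), state_bytes(ITANIUM))  # 512 3144
```

A kernel may avoid saving all of this on every switch, so the roughly sixfold difference is a ceiling on the extra work, not a prediction of it.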
        
           | aleph_minus_one wrote:
           | There exists a co-evolution between compilers, programming
           | languages and CPUs (or more generally ASICs). I consider it
           | very plausible that one could develop a programming language
           | that makes it sufficiently easy for a programmer to write
           | performant code for an Itanium, but such a programming
           | language would look different from C or C++.
        
         | aleph_minus_one wrote:
         | > The Apache #'s pretty much give the game away: An Itanium
         | > clocked 50% higher was losing to a 2yo Alpha by about 20% on
         | > throughput at peak.
         | 
         | This is not just a benchmark of the CPUs, but also of the
         | compilers involved. It is well-known that it was very hard to
         | write a compiler that generates programs that could harness the
         | optimization potential of Itanium's instruction set.
        
       | pdw wrote:
       | Amusingly similar to the much more recent slide decks about the
       | x86 port, e.g.
       | https://vmssoftware.com/docs/State_of_Port_20171006.pdf
        
       | sillywalk wrote:
       | On a similar note, porting Linux to Itanium -- A System
       | Implementor's Tale [PDF]
       | 
       | https://www.usenix.org/legacy/event/usenix05/tech/general/gr...
       | 
       | and, NonStop on Itanium [PDF]:
       | 
       | https://www.researchgate.net/profile/David_Bernick/publicati...
        
       ___________________________________________________________________
       (page generated 2024-09-29 23:01 UTC)