[HN Gopher] MIPS Becomes RISC-V
       ___________________________________________________________________
        
       MIPS Becomes RISC-V
        
       Author : zimmerfrei
       Score  : 262 points
       Date   : 2021-03-08 17:53 UTC (5 hours ago)
        
 (HTM) web link (www.eejournal.com)
 (TXT) w3m dump (www.eejournal.com)
        
       | buescher wrote:
       | "Development of the MIPS processor architecture has now stopped"
       | 
       | Is anyone still developing SPARC?
        
         | pwdisswordfish0 wrote:
         | Gaisler and the ESA are launching SPARC into space.
         | 
         | https://www.gaisler.com/index.php/products/components/gr740
        
         | znpy wrote:
         | No one worth mentioning AFAIK.
        
         | tyingq wrote:
         | Fujitsu for general purpose servers, and Atmel (maybe others
         | too?) for rad-hardened Sparc.
        
           | Symmetry wrote:
           | Given that Fujitsu has jumped ship to ARM for their
           | supercomputers, I wonder if they'll be doing so with their
           | other hardware as well?
        
           | zokier wrote:
           | > Atmel
           | 
           | Microchip these days; Atmel got acquired a few years back.
           | 
           | The packaging of those looks cool:
           | https://www.microchip.com/wwwproducts/en/AT697F
        
             | Gracana wrote:
             | That's how it's packaged prior to trimming and lead
             | forming, which looks like this:
             | 
             | https://www.youtube.com/watch?v=4CfEN5R13w4
        
               | [deleted]
        
             | tyingq wrote:
             | That is an interesting look. There's an 8 bit ATMEGA like
             | that too:
             | https://www.microchip.com/wwwproducts/en/ATmegaS128
             | 
             | I wonder what that form factor with the chip suspended like
             | that actually does for it.
        
       | gautamcgoel wrote:
       | This is huge. It looks like the only architectures widely-
       | deployed in ten years will be x86, ARM, Power, and RISC-V (maybe
       | also SPARC64 in Japan, although that's rare in the US).
        
         | justicezyx wrote:
         | This is indeed huge.
         | 
         | This should open a path for modernizing the automotive chip
         | industry; according to [1], 80% of ADAS chips are based on
         | MIPS.
         | 
         | [1] https://www.globenewswire.com/news-
         | release/2019/02/28/174460...
        
         | steviedotboston wrote:
         | so basically the same architectures that have been widely
         | deployed for the past 20+ years?
        
         | justin66 wrote:
         | Why do you assume Power will last another 10 years?
        
         | tyingq wrote:
         | The _" not ARM based"_ MCU market might be interesting to watch
         | as well. Not just for RISC-V, but also as ARM MCUs continue to
         | drop in power needs and unit cost.
        
         | sneak wrote:
         | What is the SPARC64 use case in Japan? Supercomputing?
        
         | rst wrote:
         | Depends where you look. Hard to see IBM's z/Architecture dying
         | out in that timeframe (the latest branding for S/360-derived
         | mainframes), for example, and the embedded space is likely to
         | remain an odd bestiary for quite some time.
        
           | monocasa wrote:
           | Embedded is going to become way less of a bestiary, at least
           | in the five digit gate count RISC space.
           | 
           | ARC, Xtensa, V850, arguably MIPS, etc. all worked in the
           | "we're cheaper to license than ARM, and will let you modify
           | the core more than ARM will" space. I'm not sure how they
           | maintain that value add when compared to lower end RISC-V
           | cores. I expect half of them to fold, and half of them to
            | turn into consulting shops for RISC-V and fabless design in
           | general.
        
           | projektfu wrote:
           | Isn't z/Architecture just emulated on top of POWER? That's
           | been my impression for a while.
        
             | monocasa wrote:
             | They share RTL, but it's not just a POWER core with a
             | z/Arch frontend.
        
             | dfox wrote:
             | You probably mean "TIMI", which is the user-visible ISA of
             | IBM's "midrange" systems (i.e. AS/400 or System/i) and was
             | from the start meant as a virtual machine ISA that is then
             | mostly AOT transpiled into whatever hardware ISA OS/400 or
             | i5/OS runs on. z/Architecture (S/360, ESA/390, what have
             | you...) is distinct from that and distinct from PowerPC.
             | Modern POWER and z/Architecture CPUs and machines are
             | somewhat similar when you look at the execution units and
             | overall system design, but the ISA is completely different
             | and even the performance profile of the otherwise similar
             | CPUs is different (z/Architecture is "uber-CISC", with
             | instructions like "calculate SHA-256 of this megabyte of
             | memory").
        
               | chx wrote:
               | I learned Z80 assembly in 1987 and x86 assembly somewhere
               | around '91-'92 (can't exactly remember), but it was in
               | '94 that I met IBM Assembler (yes, they called it
               | Assembler language, which is also confusing) and I was
               | like "what is this sorcery where assembly has an
               | instruction to insert into a tree".
        
               | monocasa wrote:
               | No, they're probably talking about how modern z/Arch and
               | POWER cores share a lot of HDL source these days.
        
         | dragontamer wrote:
         | The big innovation in architectures is in the SIMD world.
         | 
         | AVX512 (x86 512-bit), SVE (ARM 512-bit), NVidia PTX / SASS
         | (32x32-bit), AMD RDNA (32x32-bit), AMD CDNA (64x32-bit).
         | 
         | 64-bit cores (aka: classic CPUs) are looking like a solved
         | problem, and are becoming a commodity. SIMD compute however,
         | remains an open question. NVidia probably leads today, but
         | there seems to be plenty of room for smaller players.
         | 
         | Heck, one major company (AMD) is pushing 32x32-bit in one
         | market (6xxx series) and 64x32-bit in another market (MI100 /
         | Supercomputers).
        
           | tyingq wrote:
           | Fujitsu seems like a leader for SIMD via their A64FX. I
           | wonder if they will ever venture outside of the
           | supercomputing niche.
        
             | dragontamer wrote:
             | SVE is looking like a general-purpose ARM instruction set
             | in the future.
             | 
             | I believe the Neoverse V-cores (high-performance) will have
             | access to the SVE instructions for example. So the SVE-SIMD
             | is not necessarily locked to Fujitsu (though Fujitsu's
             | particular implementation is probably crazy good. HBM2 +
             | 512-bit wide and #1 supercomputer in the world and all...)
        
           | londons_explore wrote:
           | SIMD today is only really helpful with a few usecases. If you
           | want to encode some video, decode some jpegs, or do a physics
           | simulation quicker, it's really going to help. It won't boot
           | Linux any quicker tho.
           | 
           | I suspect for consumer uses, SIMD is already used for nearly
           | all the use cases it can be.
        
             | m00x wrote:
             | This comment seems a bit shortsighted. GPUs and TPUs are
             | SIMD and ML models are increasingly being used in consumer
             | hardware. Video cards are selling out so fast that there's
             | months worth of backorders.
             | 
             | SIMD processors are being put in self driving cars, robots
             | with vision, doorbell cameras, drones, etc. We're only at
             | the beginning of SIMD use-cases.
        
             | dragontamer wrote:
             | Are you sure about that?
             | 
             | The original SIMD papers in the 1980s show how to compile
             | a regex into a highly parallel state machine and then
             | "reduce" it (aka the Scan / Prefix operation:
             | https://en.wikipedia.org/wiki/Prefix_sum).
             | 
             | A huge number of operations, such as XML whitespace
             | removal (aka SIMD Stream Compaction), regular expressions,
             | and more, were proven ~30 to 40 years ago to benefit from
             | SIMD compute. Yet such libraries still don't exist today.
             | 
             | SIMD compute is highly niche, and clearly today's
             | population is overly focused on deep-learning... without
             | even seeing the easy opportunities of XML parsing or simple
             | regex yet. Further: additional opportunities are being
             | discovered in O(n^2) operations: such as inner-join
             | operations on your typical database.
             | 
             | Citations.
             | 
             | * For Regular Expressions: Read the 1986 paper "DATA
             | PARALLEL ALGORITHMS". It's an easy read. Hillis / Steele are
             | great writers. They even have the "impossible Linked List"
             | parallelism figured out in there (granted: the nodes are
             | located in such a way that the SIMD-computer can work with
             | the nodes. But... if you had a memory-allocator that worked
             | with their linked-list format, you could very well
             | implement their pointer-jumping approach to SIMD-linked
             | list traversal)
             | 
             | * For whitespace folding / removal, see
             | http://www.cse.chalmers.se/~uffe/streamcompaction.pdf. They
             | don't cite it as XML-whitespace removal, but it seems
             | pretty obvious to me that it could be used for parallel
             | whitespace removal in O(lg(n)) steps.
             | 
             | * Database SIMD:
             | http://www.cs.columbia.edu/~kar/pubsk/simd.pdf . Various
             | operations have been proven to be better on SIMD, including
             | "mass binary search" (one binary search cannot be
             | parallelized. But if you have 5000-binary searches
             | operating in parallel, it's HIGHLY efficient to execute all
             | 5000 in a weird parallel manner, far faster than you might
             | originally imagine).
             | 
             | ----------
             | 
             | SIMD-cuckoo hashing, SIMD-skip lists, etc. etc. There's so
             | many data-structures that haven't really been fleshed out
             | on SIMD yet outside of research settings. They have been
             | proven easy to implement and simple / clean to understand.
             | They're just not widely known yet.
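             | 
             | To make the Scan / Prefix operation concrete, here is a
             | minimal scalar sketch of the Hillis/Steele inclusive
             | scan. The inner loop stands in for the SIMD lanes, so a
             | real SIMD/GPU version finishes in O(log n) steps (just
             | the pattern, not code from the paper):
             | 
             |     #include <cstddef>
             |     #include <utility>
             |     #include <vector>
             | 
             |     // Each outer iteration is one parallel "step".
             |     std::vector<int> scan(std::vector<int> a) {
             |       for (std::size_t s = 1; s < a.size(); s *= 2) {
             |         std::vector<int> next = a;  // double buffer
             |         for (std::size_t i = s; i < a.size(); ++i)
             |           next[i] = a[i] + a[i - s];
             |         a = std::move(next);
             |       }
             |       return a;  // a[i] == a[0] + ... + a[i]
             |     }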
        
               | skybrian wrote:
               | It seems like there are a wide variety of ways to
               | serialize and deserialize data, their performance
               | sometimes varies by orders of magnitude, and the slow
               | code persists because it doesn't matter enough to
               | optimize compared to other virtues like convenience and
               | maintainability.
               | 
               | The key seems to be figuring out how to get good (not the
               | best) performance when you mostly care about other
               | things?
               | 
               | Machine learning itself seems like an example of throwing
               | hardware at problems to try to improve the state of the
               | art, to the point where it becomes so expensive that they
               | have to think about performance more.
        
               | dragontamer wrote:
               | SIMD-compute is a totally different model of compute than
               | what most programmers are familiar with.
               | 
               | That's the biggest problem. If you write optimal SIMD-
               | code, no one else in your team will understand it. Since
               | we have so much compute these days (to the point where
               | O(n^2) scanf parsers are all over the place), it's become
               | increasingly obvious that few modern programmers care
               | about performance at all.
               | 
               | Nonetheless, the more and more I study SIMD-compute, the
               | more I realize that these expert programmers have figured
               | out a ton of good and fast solutions to a wide-variety of
               | problems... decades ago and then somehow forgotten until
               | recently.
               | 
               | Seriously: that Data Parallel Algorithms paper is just
               | WTF to me. Linked list traversal (scan-reduced sum from a
               | linked list in SIMD-parallel), Regular Expressions and
               | more.
               | 
               | --------
               | 
               | Then I look at the GPU-graphics guys, and they're doing
               | like BVH tree traversal in parallel so that their
               | raytracers work.
               | 
               | Its like "Yeah, Raytracing is clearly a parallel
               | operation cause GPUs can do it". So I look it up and wtf?
               | Its not easy. Someone really thought things through. Its
               | non-obvious how they managed to get a recursive /
               | sequential operation to operate in highly parallel SIMD
               | operations while avoiding branch divergence issues.
               | 
               | Really: think about it: Raytracing is effectively:
               | If(ray hit object) recursively bounce ray.
               | 
               | How the hell did they make that parallel? A combination
               | of stream-compaction and very intelligent data-
               | structures, as well as a set of new SIMD-assembly
               | instructions to cover some obscure cases.
               | 
               | There's some really intelligent stuff going on in the
               | SIMD-compute world, that clearly applies beyond just the
               | machine-learning crowd.
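               | 
               | To make the stream-compaction step concrete: after a
               | bounce, the dead rays get squeezed out so every SIMD
               | lane keeps working on a live ray. A minimal scalar
               | sketch of the idea (sequential here; a GPU runs the
               | scan and the scatter in parallel):
               | 
               |     #include <cstddef>
               |     #include <vector>
               | 
               |     struct Ray { bool alive; /* origin, dir... */ };
               | 
               |     // Exclusive scan over the alive flags gives each
               |     // live ray its packed slot; the scatter moves it
               |     // there, so no lane idles on the next bounce.
               |     std::vector<Ray> compact(
               |         const std::vector<Ray>& in) {
               |       std::vector<std::size_t> slot(in.size(), 0);
               |       for (std::size_t i = 1; i < in.size(); ++i)
               |         slot[i] = slot[i - 1] + in[i - 1].alive;
               |       std::size_t n = in.empty() ? 0
               |           : slot.back() + in.back().alive;
               |       std::vector<Ray> out(n);
               |       for (std::size_t i = 0; i < in.size(); ++i)
               |         if (in[i].alive) out[slot[i]] = in[i];
               |       return out;
               |     }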
        
               | my123 wrote:
               | https://github.com/simdjson/simdjson as an example that
               | fits in the "outside-of-a-conventional-SIMD-workload"
               | mold.
        
               | sujayakar wrote:
               | I'm very interested in this space! I've been hacking on
               | some open-source libraries around these ideas: rsdict
               | [1], a SIMD-accelerated rank/select bitmap data
               | structure, and arbolito [2], a SIMD-accelerated tiny
               | trie.
               | 
               | For rsdict, the main idea is to use `pshufb` to implement
               | querying a lookup table on a vector of integers and then
               | use `psadbw` to horizontally sum the vector.
               | 
               | The arbolito code is a lot less fleshed out, but the main
               | idea is to take a small trie and encode it into SIMD
               | vectors. Laying out the nodes into a linear order, we'd
               | have one vector that maintains a parent pointer (4 bits
               | for 16 node trees in 128-bit vectors) and another vector
               | with the incoming edge label.
               | 
               | Then, following the Teddy algorithm[3] (very similar to
               | the Hillis/Steele state transition ideas too!), we can
               | implement traversing the tree as a state machine, where
               | each node in the trie has a bitmask, and the state
               | transition is a parallel bitshift + shuffle of parent
               | state to children states + bitwise AND. We can even
               | reduce the circuit depth of this algorithm to `O(log
               | depth)` by using successive squaring of the transition,
               | like Hillis/Steele describe too.
               | 
               | I've put it on the backburner, but my main goal for
               | arbolito would be to find a way to stitch together these
               | "tiny tries" into a general purpose trie adaptively and
               | get query performance competitive with a hashmap for
               | integer keys. The ART paper[4] does similar stuff but
               | without the SIMD tricks.
               | 
               | [1] https://github.com/sujayakar/rsdict
               | 
               | [2] https://github.com/sujayakar/arbolito
               | 
               | [3] https://github.com/jneem/teddy#teddy-1
               | 
               | [4] https://db.in.tum.de/~leis/papers/ART.pdf
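               | 
               | For anyone who wants the pshufb + psadbw pattern in
               | isolation, here is a minimal sketch: the classic SSSE3
               | popcount of a 128-bit vector (not the actual rsdict
               | code, just the same two-instruction trick):
               | 
               |     #include <cstdint>
               |     #include <immintrin.h>  // SSSE3 intrinsics
               | 
               |     // pshufb queries a 16-entry table for all 16
               |     // bytes at once; psadbw then horizontally sums
               |     // the per-byte results.
               |     uint64_t popcount128(__m128i v) {
               |       const __m128i lut = _mm_setr_epi8(
               |           0,1,1,2, 1,2,2,3, 1,2,2,3, 2,3,3,4);
               |       const __m128i mask = _mm_set1_epi8(0x0f);
               |       __m128i lo = _mm_and_si128(v, mask);
               |       __m128i hi = _mm_srli_epi16(v, 4);
               |       hi = _mm_and_si128(hi, mask);
               |       __m128i cnt = _mm_add_epi8(
               |           _mm_shuffle_epi8(lut, lo),
               |           _mm_shuffle_epi8(lut, hi));
               |       __m128i sums =
               |           _mm_sad_epu8(cnt, _mm_setzero_si128());
               |       sums = _mm_add_epi64(
               |           sums, _mm_unpackhi_epi64(sums, sums));
               |       return (uint64_t)_mm_cvtsi128_si64(sums);
               |     }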
        
               | dragontamer wrote:
               | Cool stuff! I'll give it a lookover later.
               | 
               | A few years ago, I wrote AESRAND
               | (https://github.com/dragontamer/AESRand). I managed to
               | get some well-known programmers to look into it, and
               | their advice helped me write some pretty neat SIMD-
               | tricks. EX: I SIMD-implemented a 32-bit integer ->
               | floating point [0.0, 1.0] operator, to convert the
               | bitstream into floats, as well as an integer-based,
               | nearly bias-free, division/modulus-free conversion into
               | [0, WhateverInt] (such as D20 rolls), for 16-bit,
               | 32-bit, and 64-bit integers (with less bias the more
               | bits you supplied).
               | 
               | Unfortunately, I ran out of time and some work-related
               | stuff came up. So I never really finished the
               | experiments.
               | 
               | ----------
               | 
               | My current home project is bump-allocator + semi-space
               | garbage collection in SIMD for GPUs. As far as I can
               | tell, both bump-allocation and semi-space garbage
               | collection are easily SIMDified in an obvious manner. And
               | since cudamalloc is fully synchronous, I wanted a more
               | scalable, parallel solution to the GPU memory allocation
               | problem.
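               | 
               | The reason bump allocation SIMDifies so cleanly is
               | that it's just one atomic add on a cursor. A minimal
               | CPU-side sketch of that idea (an analogue, not the
               | actual GPU project code):
               | 
               |     #include <atomic>
               |     #include <cstddef>
               |     #include <cstdint>
               | 
               |     // One atomic add hands out a slice of the arena,
               |     // so many threads (or SIMD lanes) can allocate
               |     // at once. A semi-space collector later copies
               |     // the live objects to a fresh arena and resets
               |     // the cursor.
               |     struct BumpArena {
               |       std::uint8_t* base = nullptr;
               |       std::size_t capacity = 0;
               |       std::atomic<std::size_t> cursor{0};
               | 
               |       void* alloc(std::size_t bytes) {
               |         std::size_t off = cursor.fetch_add(bytes);
               |         if (off + bytes > capacity) return nullptr;
               |         return base + off;
               |       }
               |     };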
        
               | sujayakar wrote:
               | Very cool! Independent of the cool use of `aesenc` and
               | `aesdec`, the features for skipping ahead in the random
               | stream and forking a separate stream are awesome.
               | 
               | > My current home project is bump-allocator + semi-space
               | garbage collection in SIMD for GPUs. As far as I can
               | tell, both bump-allocation and semi-space garbage
               | collection are easily SIMDified in an obvious manner. And
               | since cudamalloc is fully synchronous, I wanted a more
               | scalable, parallel solution to the GPU memory allocation
               | problem.
               | 
               | This is a great idea. I wonder if we could speed up
               | LuaJIT even _more_ by SIMD accelerating the GC's mark
               | and/or sweep phases...
               | 
               | If you're interested in more work in this area, a former
               | coworker wrote a neat SPMD implementation of librsync
               | [1]. And, if you haven't seen it, the talk on SwissTable
               | [2] (Google's SIMD accelerated hash table) is excellent.
               | 
               | [1] https://github.com/dropbox/fast_rsync
               | 
               | [2] https://www.youtube.com/watch?v=ncHmEUmJZf4
        
               | dragontamer wrote:
               | > Very cool! Independent of the cool use of `aesenc` and
               | `aesdec`, the features for skipping ahead in the random
               | stream and forking a separate stream are awesome.
               | 
               | Ah yeah, those features... I forgot about them until you
               | mentioned them, lol.
               | 
               | I was thinking about 4x (512-bits per iteration) with
               | enc(enc), enc(dec), dec(enc), and dec(dec) as the four
               | 128-bit results (going from 256-bits per iteration to
               | 512-bits per iteration, with only 3-more instructions). I
               | don't think I ever tested that...
               | 
               | But honestly, the thing that really made me stop playing
               | with AESRAND was discovering multiply-bitreverse-multiply
               | random number generators (still unpublished... just
               | sitting in a directory in my home computer).
               | 
               | Bit-reverse is single-cycle on GPUs (NVidia and AMD), and
               | perfectly fixes the "multiplication only randomizes the
               | top bits" problem.
               | 
               | Bit-reverse is unimplemented on x86 for some reason, but
               | bswap64() is good enough. Since bswap64() and 64-bit
               | multiply are both implemented really fast on x86-64, a
               | multiply-bswap64-multiply generator is probably fastest
               | for typical x86 code.
               | 
               | ---------
               | 
               | The key is that multiplying by an odd number (bottom-bit
               | == 1) results in a fully invertible (aka: no information
               | loss) operation.
               | 
               | So multiply-bitreverse-multiply is a 1-to-1 bijection in
               | the 64-bit integer space: all 64-bit integers have a
               | singular, UNIQUE multiply-bitreverse-multiply analog.
               | (with multiply-bitreverse-multiply(0) == 0 being the one
               | edge case where things don't really work out).
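               | 
               | A minimal sketch of that multiply-bswap-multiply shape
               | on x86-64 (the two odd constants are arbitrary
               | placeholders, not the unpublished generator's):
               | 
               |     #include <cstdint>
               | 
               |     // Odd multipliers are bijective mod 2^64, and
               |     // bswap64 (a true bit-reverse on GPUs) moves the
               |     // well-mixed high bits back down before the
               |     // second multiply.
               |     uint64_t mix64(uint64_t x) {
               |       x *= 0x9E3779B97F4A7C15ULL;  // odd constant
               |       x = __builtin_bswap64(x);    // GCC/Clang
               |       x *= 0xC2B2AE3D27D4EB4FULL;  // odd constant
               |       return x;
               |     }
               | 
               |     // e.g. as a counter-based stream:
               |     //   out[i] = mix64(seed + i);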
        
           | gnufx wrote:
           | The V in SVE is for vector, the S isn't for SIMD, and it's
           | length-agnostic; I don't know how similar it is to the RISC-V
           | vector extension. Think CDC, Cray, NEC, not AMD/Intel. I
           | guess the recent innovation in that space is actual matrix
           | multiplication instructions in CPUs.
        
         | the8472 wrote:
         | + many proprietary GPU instruction sets
        
       | MaxBarraclough wrote:
       | Everyone's right to celebrate the success of RISC-V, but part of
       | me thinks it's a shame that there's relatively little
       | architectural diversity ( _edit_ I should have said _ISA
        | diversity_ ) in modern CPUs. MIPS, Alpha, and SuperH have all
       | but faded away. Power/PowerPC is still out there somewhere
       | though. Apparently they're still working on SPARC, too. [0]
       | 
       | At least we'll always have the PS2. ...until the last one breaks,
       | I guess.
       | 
       | [0] https://en.wikipedia.org/wiki/SPARC
        
         | Fordec wrote:
         | We need diversity for solving different problems, not for
         | diversity's sake.
         | 
         | What problem did MIPS solve in a unique way that others didn't?
         | Because it wasn't desktop, mobile, embedded, graphics or AI.
        
           | zdw wrote:
           | SPARC is well known to be different enough (big endian,
           | register windowing of the stack, alignment, etc.) that it
           | exposes a lot of bugs in code that would be missed in a
           | little-endian, x86 derived monoculture.
           | 
           | https://marc.info/?l=openbsd-bugs&m=152356589400654&w=2
        
             | tyingq wrote:
             | I had a neat experience a long time ago when I wrote a Perl
             | XS module in C, in my x86 monoculture mindset. When you
             | deploy something to their package manager (CPAN), it's
             | automatically tested on a lot of different platforms via a
             | loose network of people that volunteer their equipment to
             | test stuff...https://cpantesters.org.
             | 
             | So, I immediately saw it had issues on a variety of
             | different platforms, including an endianness problem.
             | Cpantesters.org lets you drill down and see what went wrong
             | in pretty good detail, so I was able to fix the problems
             | pretty quickly.
             | 
             | It used to have a ton of different platforms like HPUX/PA-
             | RISC, Sun/Sparc, IRIX/MIPS and so on, but the diversity is
             | down pretty far now. Still lots of OS's, but few different
             | CPUs.
        
             | simias wrote:
             | I always found SPARC's stack handling to be very elegant
             | and I write enough low-level code that these
             | architectural details do from time to time impact me, but
             | isn't it largely irrelevant for the industry at large?
             | 
             | After all MIPS's original insight was that machine code was
             | now overwhelmingly written by compilers and not handwritten
             | assembly, so they made an ISA for compilers. I think
             | history proved them absolutely right, actually these days
             | there are often a couple of layers between the code people
             | write and the instructions fed into the CPU.
             | 
             | I guess my point is that nowadays I'm sure that many
             | competent devs don't know what little-endian means and
             | probably wouldn't have any idea of what "register windowing
             | of the stack" is, and they're completely unaffected by
             | these minute low level details.
             | 
             | Making it a bit easier for OpenBSD to find subtle bugs is
             | certainly nice, but that seems like a rather weak argument
             | for the vast amount of work required to support a distinct
             | ISA in a kernel.
             | 
             | Honestly I'm not convinced by the argument for diversity
             | here, _as long as_ the ISA of choice is open source and not
             | patent encumbered or anything like that. Preventing an x86
             | or ARM monoculture is worth it because you don't want to
             | put all your eggs in Intel or Nvidia's basket, but if
             | anybody is free to do whatever with the ISA I don't really
             | see how that really prevents innovation. It's just a shared
             | framework people can work with.
             | 
             | Who knows, maybe somebody will make a fork of RISC-V with
             | register windows!
        
           | bitwize wrote:
           | MIPS was a nice simple ISA for CE students to learn and
           | implement. Both ARM and RISC-V have gotchas that make them
           | more complicated.
           | 
           | I suppose it will live on in that form, especially if the IP
           | is opened up.
        
             | schoen wrote:
             | I remember CS 61C at Berkeley used to use MIPS to teach
             | assembly language programming and a bit about computer
             | architecture, using the original MIPS version of Patterson
             | and Hennessy's _Computer Organization and Design_. Now that
             | book is available in both MIPS and RISC-V versions, with,
             | I've assumed, much more effort going into the RISC-V
             | version...
             | 
             | I do think the simplicity of MIPS was a big plus there,
             | including simplicity of simulating it
             | (http://spimsimulator.sourceforge.net/). I suppose a lot of
             | students may appreciate being taught on something that is
             | or is about to be very widely used, even if it's more
             | complicated in various ways -- and the fact that one of the
             | textbook authors was a main RISC-V designer makes me assume
             | that educational aspects are not at all neglected in the
             | RISC-V world.
        
               | saagarjha wrote:
               | Not entirely related, but I found MARS
               | (http://courses.missouristate.edu/KenVollmar/MARS/) to be
               | much nicer to use than SPIM.
               | 
               | More on topic, though, RISC-V seems to really be designed
               | in a way that makes it easy to teach. This is partially
               | why I have doubts that it can be made very performant,
               | but the focus of a prettier design over a more practical
               | one is probably going to help it be more accessible to
               | students.
        
           | yjftsjthsd-h wrote:
           | Throwing out diversity because you don't see any immediate
           | benefit is a great way to not have it when a different
           | problem _does_ show up.
           | 
           | I don't know about _unique_ way, but MIPS certainly was good
           | at embedded; there's plenty of networking gear using it.
        
             | capableweb wrote:
             | > MIPS certainly was good at embedded
             | 
             | As far as I understand, it's not that MIPS is the best at
             | embedded, it's just that it's cheaper to sell as the
             | license cost is non-existing and good support already
             | exists in kernels and so on.
        
               | MaxBarraclough wrote:
               | What are the criteria for 'best'?
               | 
               | If MIPS offered adequate performance and features, good
               | performance-per-watt, and a competitive licence fee, and
               | if none of its competitors could beat it, doesn't that
               | count as 'best'?
        
             | jnwatson wrote:
             | PowerPC was the dominant processor in telecom equipment as
             | far as I was aware. Perhaps the low end went MIPS.
        
           | hajile wrote:
           | MIPS and Berkeley RISC started an entire revolution. They
           | appear "not unique" only because other ISAs copied them so
           | thoroughly. I think it's safe to say that Alpha, ARM, POWER,
           | PA-RISC, etc wouldn't have been designed as they were without
           | MIPS.
           | 
           | Even today, comparing modern MIPS64 and ARM aarch64, I find
           | ARM's new ISA to be perhaps more similar to MIPS than to
           | ARMv7.
        
         | xiphias2 wrote:
         | What RISC-V achieves is architectural diversity over the boring
         | mov, add, mul instructions: the interesting part is in vector
         | and matrix manipulation, and while RISC-V is working on a great
         | solution, it allows for other accelerators to be added.
        
         | elihu wrote:
         | I wish the barriers to using new architectures were lower.
         | 
         | For instance, suppose binaries were typically distributed in a
         | platform-agnostic format, like LLVM intermediate representation
         | or something equivalent. When you run your program the first
         | time, it's compiled to native code for your architecture and
         | cached for later use.
         | 
         | I realize I've sort of just re-invented Javascript. But what if
         | we just did away with native binaries entirely, except as
         | ephemeral objects that get cached and then thrown away and
         | regenerated when needed? It seems like this would solve a lot
         | of problems. You could deprecate CPU instructions without
         | worrying about breaking backwards compatibility. If a
         | particular instruction has security or data integrity issues,
         | just patch the compiler not to emit that instruction. As new
         | side-channel speculation vulnerabilities are discovered, we can
         | add compiler workarounds whenever possible. If you're a CPU
         | architect and want to add a new instruction for a particular
         | weird use-case, you just have to add it to your design and
         | patch the compiler, and everyone can start using your new
         | instruction right away, even on old software. You'd be able to
         | trust that your old software would at least be compatible with
         | future instruction architectures. Processors would be able to
         | compete directly with each other without regard to vendor
         | lock-in.
        
           | saagarjha wrote:
           | This sounds a bit like Google's Portable Native Client.
        
           | MaxBarraclough wrote:
           | > I wish the barriers to using new architectures were lower.
           | 
           | > For instance, suppose binaries were typically distributed
           | in a platform-agnostic format, like LLVM intermediate
           | representation or something equivalent.
           | 
           | We're doing a pretty good job on portability these days
           | already. Well-written Unix applications in C/C++ will compile
           | happily for any old ISA and run just the same. Safe high-
           | level languages like JavaScript, Java, and Safe Rust are
           | pretty much ISA-independent by definition, it's 'just' a
           | matter of getting the compilers and runtimes ported across.
           | 
           | Adopting LLVM IR for portable distribution probably isn't
           | the way forward. I don't see that it adds much compared to
           | compiling from source, and it's not what it's intended for.
           | (LLVM may wish to change the representation in a subsequent
           | major version, for instance.)
           | 
           | For programs which are architecture-sensitive by nature, such
           | as certain parts of kernels, there are no shortcuts. Or,
           | rather, I'm confident the major OSs already use all the
           | practical shortcuts they can think up.
           | 
           | > When you run your program the first time, it's compiled to
           | native code for your architecture and cached for later use.
           | 
           | Source-based package management systems already give us
           | something a lot like this.
           | 
           | There are operating systems that take this approach, such as
           | Inferno. [0] I like this HN comment on Inferno [1]: kernels
           | are the wrong place for 'grand abstractions' of this sort.
           | 
           | > I realize I've sort of just re-invented Javascript
           | 
           | Don't be too harsh on yourself, JavaScript would be a
           | terrible choice as a universal IR!
           | 
           | > [[ to the bulk of your second paragraph ]]
           | 
           | In the Free and Open Source world, we're already free to
           | recompile the whole universe. The major distros do so as
           | compiler technology improves.
           | 
           | > Processors would be able to compete directly with each
           | other without regard to vendor-lock-in.
           | 
           | For most application-level code, we're already there. For
           | example, your Java code will most likely run just as happily
           | on one of Amazon's AArch64 instances as on an AMD64 machine.
           | In the unlikely case you encounter a bug, well, that's pretty
           | much always a risk, no matter which abstractions we use.
           | 
           | [0] https://en.wikipedia.org/wiki/Inferno_(operating_system)
           | 
           | [1] https://news.ycombinator.com/item?id=9807777
        
           | dataangel wrote:
           | > For instance, suppose binaries were typically distributed
           | in a platform-agnostic format, like LLVM intermediate
           | representation or something equivalent.
           | 
           | The Mill does something like this, but only for their own
           | chips. "Binaries" are bitcode that's not specialized to any
           | particular Mill CPU, and get run through the "specializer"
           | which knows the real belt width and other properties to make
           | a final CPU-specific version.
        
             | brucehoult wrote:
             | "does" is not exactly the right word here. "Proposes to
             | do"?
        
           | gnufx wrote:
           | Surely the barrier to using a new architecture is being able
           | to boot a kernel and run (say) the GNU toolchain, as
           | demonstrated with RISC-V. Then you just compile your code,
           | assuming it doesn't contain assembler, or something. Whether
           | or not you'll have the same sort of board support issues with
           | RISC-V as with Arm, I don't know.
        
           | retrac wrote:
           | That's how IBM implemented the AS/400 platform. Everything
           | compiled down to a processor-agnostic bytecode that was the
           | "binary" format. That IR was translated to native code for
           | the underlying processor architecture as the final step. And
           | objects contained both the IR and the native code. If you
           | moved a binary to another host CPU, it would be retranslated
           | and run automatically. The migration to POWER as the
           | underlying processor was almost entirely transparent to the
           | user and programming environment.
           | 
           | https://en.wikipedia.org/wiki/IBM_System_i#Instruction_set
        
             | skissane wrote:
             | > That's how IBM implemented the AS/400 platform.
             | Everything compiled down to a processor-agnostic bytecode
             | that was the "binary" format
             | 
             | Originally, AS/400 used its own bytecode called MI (or TIMI
             | or OMI). A descendant of the System/38's bytecode. That was
             | compiled to CISC IMPI machine code, and then after the RISC
             | transition to POWER instructions.
             | 
             | However, around the same time as the CISC-to-RISC
             | transition, IBM introduced a new virtual execution
             | environment - ILE (Integrated Language Environment). The
             | original virtual execution environment was called OPM
             | (Original Program Model). ILE came with a new bytecode,
             | W-code aka NMI. While IBM publicly documented the original
             | OPM bytecode, the new W-code bytecode is only available
             | under NDA. OPM programs have their OMI bytecode translated
             | internally to NMI which then in turn is translated to POWER
             | instructions.
             | 
             | The interesting thing about this, is while OMI was
             | originally invented for the System/38, W-code has a quite
             | different heritage. W-code is actually the intermediate
             | representation used by IBM's compilers (VisualAge, XL C,
             | etc). It is fundamentally the same as what IBM compilers
             | use on other platforms such as AIX or Linux, and already
             | existed on AIX before it was ever used on OS/400. There are
             | some OS/400-specific extensions, and it plays a much more
             | central architectural role in OS/400 than in AIX. But
             | W-code is conceptually equivalent to LLVM IR/bitcode. So
             | here we may see something in common with what Apple does
             | with asking for LLVM bitcode uploads for the App Store.
             | 
             | > And objects contained both the IR and the native code. If
             | you moved a binary to another host CPU, it would be
             | retranslated and run automatically
             | 
             | Not always true. The object contains two sections - the MI
             | bytecode and the native machine code. It is possible to
             | remove the MI bytecode section (that's called removing
             | "observability") leaving only the native machine code
             | section. If you do that, you lose the ability to migrate
             | the software to a new architecture, unless you recompile
             | from source. I think most people kept observability intact
             | for in-house software, but it was commonly removed in
             | software shipped by IBM and ISVs.
        
           | MaulingMonkey wrote:
           | > I realize I've sort of just re-invented Javascript.
           | 
           | Or one of several bytecodes that get JIT or AOT compiled.
           | 
           | WASM in particular has my interest these days, thanks to
           | native browser support and being relatively lean and more
           | friendly towards "native" code, whereas JVM and CLR are
           | fairly heavyweight, and their bytecodes assume you're going
           | to be using a garbage collector (something that e.g. wasmtime
           | manages to avoid.)
           | 
           | Non-web use cases of WASM in practice seem more focused on
           | isolation, sandboxing, and security rather than architecture
           | independence - stuff like "edge computing" - and I haven't
           | read about anyone using it for AOT compilation. But perhaps
           | it has some potential there too?
        
           | msla wrote:
           | > For instance, suppose binaries were typically distributed
           | in a platform-agnostic format, like LLVM intermediate
           | representation or something equivalent. When you run your
           | program the first time, it's compiled to native code for your
           | architecture and cached for later use.
           | 
           | IBM's OS/400 (originally for the AS/400 hardware, now branded
           | as System i) did precisely this: Compile COBOL or RPG to a
           | high-level bytecode, which gets compiled to machine code on
           | first run, and save the machine code to disk; thereafter, the
           | machine code is just run, until the bytecode on disk is
           | changed, whereupon it's replaced with newer machine code. IBM
           | was able to transition its customers to a new CPU
           | architecture just by having them move their bytecode (and,
           | possibly, source code) from one machine to another that way.
           | 
           | https://en.wikipedia.org/wiki/IBM_System_i
           | 
           | Other OSes could definitely do it.
        
             | skissane wrote:
             | > Other OSes could definitely do it.
             | 
             | See https://en.wikipedia.org/wiki/Architecture_Neutral_Dist
             | ribut...
             | 
             | Using ANDF you could produce portable binaries that would
             | run on any UNIX system, regardless of CPU architecture. It
             | was never commercially released though. I think while it is
             | cool technology the market demand was never really there.
             | For a software vendor, recompiling to support another UNIX
             | isn't that hard; the real hard bit is all the compatibility
             | testing to make sure the product actually works on the new
             | UNIX. ANDF solved the easy part but did nothing about the
             | hard bit. It possibly would even make things worse, because
             | then customers might have just tried running some app on
             | some other UNIX the vendor has never tested, and then
             | complain when it only half worked.
             | 
             | Standards are always going to have implementation bugs,
             | corner cases, ambiguities, undefined behaviour, feature
             | gaps which force you to rely on proprietary extensions,
             | etc. That's where the "hard bit" of portability comes from.
        
           | Ericson2314 wrote:
           | Come to Nix and Nixpkgs, where we can cross compile most
           | things in myriad ways. I think the barriers to new hardware
           | ISAs on the software side have never been lower.
           | 
           | Even if we get an ARM/RISC-V monoculture, at least we are
           | getting diverse co-processors again, which present the same
           | portability challenges/opportunities in a different guise.
        
         | spamizbad wrote:
         | I feel like MIPS and RISC-V are so closely related they're not
         | terribly diverse. Academic MIPS evolved into RISC-V.
        
           | Taniwha wrote:
           | Nope - they have different competing heritages - in this case
           | the headline is "Berkeley beats Stanford"
           | 
           | https://www.youtube.com/watch?v=09kPcg8Hehg
        
             | monocasa wrote:
             | Academic RISC was designed by Patterson and Hennessy.
             | Hennessy went off and was one of the founders of MIPS,
             | Patterson is one of the instrumental leaders in the RISC-V
             | space.
        
         | dralley wrote:
         | I'm curious if someone could explain the architectural
         | differences between ppc64le and aarch64? I've always heard they
         | are quite similar.
        
           | tyingq wrote:
           | Since you mentioned ppc64le, there's also aarch64eb (arm64 in
           | big endian mode). I saw that NetBSD supports it. It seems
           | like support for other operating systems is limited mostly
           | because of issues around booting...not the actual kernel or
           | userland itself.
        
       | synergy20 wrote:
       | It could have preempted RISC-V a long time ago by doing this. I
       | hope it's not too late.
       | 
       | MIPS is still used in routers and set-top boxes, but the steam is
       | running out quickly; nearly all new router/set-top-box designs
       | are now using ARM already. There is a last hope, though.
        
       | moonbug wrote:
       | If synthesisable MIPS cores from R3k to R20k enter the public
       | domain, that would be something special. But based on how
       | previous such "announcements" have turned out, it's not gonna
       | happen this time either.
        
       | musicale wrote:
       | There's a lot to like about MIPS. It's a perfectly usable RISC
       | architecture that:
       | 
       | - is easy to implement
       | 
       | - is supported by Debian, gcc, etc..
       | 
       | - is virtualizable
       | 
       | - scales from embedded systems (e.g. compressed MIPS16 ISA) up to
       | huge shared-memory multiprocessor systems with hundreds of CPUs
       | 
       | Like RISC-V, MIPS traces its lineage to the dawn of the RISC
       | revolution in the 1980s, though on the Hennessy/Stanford side
       | rather than the Patterson/Berkeley side.
       | 
       | And it was supposed to go open source:
       | https://www.mips.com/mipsopen/
       | 
       | but that effort sadly seems to be dead: http://mipsopen.com
       | 
       | That's really too bad. MIPS doesn't deserve to die.
       | 
       | Fortunately, as mentioned above, it will live on as long as there
       | are PS2 consoles or emulators around.
        
       | st_goliath wrote:
       | I just read the official statement[1] that's linked to in the
       | article.
       | 
       | Just so I get this straight: Wave Computing, the company that
       | bought the remains of MIPS, is now (after bankruptcy) spinning it
       | off as a separate company that is going to operate under the name
       | MIPS and holds the rights to the MIPS architecture, but is doing
       | RISC-V?
       | 
       | [1] https://www.prnewswire.com/news-releases/wave-computing-
       | and-...
        
         | bitwize wrote:
         | Wave Computing is changing its name to MIPS, like how Tandy
         | changed to RadioShack.
        
         | moistbar wrote:
         | That's what I got from it, but I have to say I don't understand
         | the decision to throw away such a storied architecture as MIPS.
         | I mean come on, the N64 runs on it!
        
           | Narishma wrote:
           | PS1, PS2 and PSP also used MIPS processors and were each much
           | more successful than the N64.
        
             | throwaway_6142 wrote:
             | Atmel
        
       | abrowne wrote:
       | If you can't beat 'em, join 'em?
        
         | muterad_murilax wrote:
         | Or, conversely: beat 'em and eat 'em!
        
       | mortenjorck wrote:
       | This is more or less analogous to Blackberry moving to Android,
       | isn't it? Storied, old-guard tech company loses most of its
       | market share, trades in its first-party stack for a rising open-
       | source alternative.
       | 
       | Is MIPS still a big enough name to make this much of a coup for
       | RISC-V? Or is this the last-ditch effort of a fallen star of the
       | semi market?
        
         | slezyr wrote:
         | Check your router's CPU. I own 4 routers and all of them use
          | MIPS. RISC-V is more like graphene: it has yet to leave the lab.
        
           | makomk wrote:
           | Home routers used to be one of the last holdouts of MIPS, but
           | all the modern ones have been switching to ARM. It's pretty
           | much on its last legs there aside from the really cheap, low-
           | end stuff.
        
         | UncleOxidant wrote:
         | > Is MIPS still a big enough name to make this much of a coup
         | for RISC-V?
         | 
         | No, not at this point. RISC-V already has plenty of momentum
         | without this. The only thing that might change that analysis is
         | if MIPS has some architecture patents that they could leverage
         | to get some kind of RISC-V performance advantage. But I doubt
         | they have anything like that now. And it's not like they have a
         | stable of CPU designers that are now going to be switched from
         | working on MIPS to working on RISC-V - they likely haven't had
         | any of those folks working there since the 90s.
        
         | scj wrote:
         | It likely forces an inflection point for existing MIPS users:
         | 
         | 1. Stick with MIPS for as long as possible.
         | 
         | 2. Follow MIPS into RISC-V.
         | 
         | 3. Or go to some other camp.
        
         | abrowne wrote:
         | They're even changing the name of the company to MIPS, just
         | like RIM - Blackberry
          | (https://news.ycombinator.com/item?id=26389870).
         | 
         | One big difference is most consumers, and I think even many
         | companies using the chips, don't really care what architecture
         | they are using, unlike with an end-user OS/"ecosystem". So if
         | they can use their abilities and experience from MIPS to make
         | RISC-V chips with good price to performance, they could do OK.
        
         | MisterTea wrote:
         | > Is MIPS still a big enough name to make this much of a coup
         | for RISC-V? Or is this the last ditch effort of a fallen star
         | of the semi market?
         | 
          | MIPS has seemingly been on life support since the late 90's. I
         | kind of think the SGI buyout and later spin-off doomed them as
         | they were focused on building high performance workstation
         | processors while Arm was busy focusing on low power and
         | embedded systems. Guess who was better prepared for the mobile
         | revolution of the 00's?
        
           | mikepurvis wrote:
           | It's interesting that that's where they've been focused,
           | because my only exposure to MIPS chips has been in low-cost
           | Mikrotik routerboards, eg:
           | 
           | https://mikrotik.com/product/RB450
        
             | tyingq wrote:
             | I imagine MIPS (via Atheros, Broadcom, etc) was broadly
             | deployed in things like home routers because it was royalty
             | free, power efficient, and already had Linux kernel
             | mainline support. Though probably losing share to ARM now.
        
               | theodric wrote:
               | Indeed, Asus moved from MIPS to ARM between the RT-AC66U
               | and RT-AC68U, and the various *pkg repos have since
               | dropped support. MIPS may as well be dead.
        
             | pajko wrote:
             | MIPS was being used in set-top-boxes and media players as
             | well, like https://www.imaginationtech.com/blog/dont-let-
             | the-cpu-contro...
        
       | tyingq wrote:
        | The MIPS name was originally an acronym for "Microprocessor
       | without Interlocked Pipeline Stages". The RISC-V docs that I've
       | skimmed seem to show quite a lot more hardware pipeline
       | interlocking than the last gen MIPS processors. So the name is a
       | bit funny now.
        
         | monocasa wrote:
         | It's a lot harder to have pipeline stages and no interlocks
         | when you don't have delay slots.
        
       | Andrex wrote:
       | My historical skepticism on the acceptance and proliferation of
       | RISC-V looks more antiquated by the day. No real dog in the
       | fight, but I would love to see this take off like ARM did.
        
       | throwaway81523 wrote:
       | Actual source: https://www.eejournal.com/article/wait-what-mips-
       | becomes-ris...
        
         | dang wrote:
         | Right. We changed the URL from https://tuxphones.com/mips-
         | joins-risc-v-open-hardware-standa..., which points to that.
         | Thanks!
        
       | [deleted]
        
       | zokier wrote:
       | The progression of headlines is funny:
       | 
       | 1) MIPS Strikes Back: 64-bit Warrior I6400 Arrives
       | https://news.ycombinator.com/item?id=8258092
       | 
       | We are still in the game
       | 
       | 2) Linux-running MIPS CPU available for free to universities -
       | full Verilog code https://news.ycombinator.com/item?id=9444567
       | 
       | Okay, we are not doing so great, maybe we can get young kids
       | hooked?
       | 
       | 3) MIPS Goes Open Source
       | https://news.ycombinator.com/item?id=18701145
       | 
        | Open Source is so hip and pop these days, let's do that!
       | 
       | 4) Can MIPS Leapfrog RISC-V?
       | https://news.ycombinator.com/item?id=19460470
       | 
       | Yeah, sure, that'll happen
       | 
       | 5) Is MIPS Dead? Lawsuit, Bankruptcy, Maintainers Leaving and
       | More https://news.ycombinator.com/item?id=22950848
       | 
       | Whoops
       | 
       | 6) Loose Lips Sink MIPS
       | https://news.ycombinator.com/item?id=24402107
       | 
       | And there is the answer to the question from previous headline
       | 
       | And now we are here.
        
         | rwmj wrote:
         | I have one of the purple MIPS SBCs from back when MIPS was
         | briefly owned by Imagination
         | (https://en.wikipedia.org/wiki/Imagination_Creator
         | https://elinux.org/MIPS_Creator_CI20). Slow as hell even back
         | in 2014. I wonder if one day it'll be a museum piece :-?
        
           | Koshkin wrote:
           | Microchip PIC32 seems to be plenty fast.
        
           | jhallenworld wrote:
           | I have a tube of IDT R3041s and R3051s. I remember using GCC
           | compiled for a DECStation 2000 to write code for them. (I
           | made a hand-held computer based on R3041).
           | 
           | I also have a tube of ARM 610s (VY86C060s I think) from the
           | same project, but R3041 was PLCC, whereas ARM was fine pitch
           | PQFP. PLCC was easier to deal with at the time...
        
           | trulyme wrote:
           | Is it something special? If no, then no.
        
           | ChuckMcM wrote:
           | Yes it will. You should preserve it and/or donate it once you
           | no longer want or need it.
        
           | Zenst wrote:
            | Not any time soon, as you can still buy them: https://uk.rs-
           | online.com/web/p/single-board-computers/125330...
        
         | dv_dt wrote:
         | I had to do a quick context switch to make sure by ISA they
         | meant Instruction Set Architecture not Industry Standard
         | Architecture (ISA Bus)...
        
       | nickysielicki wrote:
       | This article from 2015 ("The Death of Moore's Law Will Spur
       | Innovation: As transistors stop shrinking, open-source hardware
       | will have its day") is getting better and better with age.
       | 
       | https://spectrum.ieee.org/semiconductors/design/the-death-of...
        
       | einpoklum wrote:
       | Does this mean that design features from MIPS can now be adopted
       | / adapted in RISC-V? Or is it just a no-longer-used ISA and chip
       | designs that become freely usable?
        
       | lallysingh wrote:
       | I guess that they're going to ship RISC-V CPUs? Makes sense. Do
       | they still have design chops for making fast implementations?
        
       | xony wrote:
        | Article not loading; it might be blocked by my ISP here.
        
       | ChuckMcM wrote:
       | It was unclear if "v8" of the MIPS architecture is a re-branded
       | RISC-V or if v8 MIPS is a combined RISC-V + MIPS or what.
        
       | p1mrx wrote:
       | Mirror: https://archive.is/S1s80
        
       ___________________________________________________________________
       (page generated 2021-03-08 23:00 UTC)