[HN Gopher] Is RISC-V ready for HPC? Evaluating the 64-core Soph...
       ___________________________________________________________________
        
       Is RISC-V ready for HPC? Evaluating the 64-core Sophon SG2042
       RISC-V CPU
        
       Author : anewhnaccount2
       Score  : 90 points
       Date   : 2023-12-10 08:06 UTC (14 hours ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | not_your_vase wrote:
       | This is what an academic paper looks like nowadays? I thought it
       | would fit Tom's Hardware more... maybe I'm getting old.
        
         | j16sdiz wrote:
         | These are workshop papers. ACM workshops were always like that.
        
           | anewhnaccount2 wrote:
           | The other thing to look at is the affiliations. All these
           | people work for https://www.epcc.ed.ac.uk/ i.e. they are
           | working for an HPC facility attached to a university.
        
           | not_your_vase wrote:
           | I see, that's a bit of a relief. Thanks for the
           | clarification. Though I still think it should go to Tom's Hw
           | or ServeTheHome instead of arxiv :)
        
       | tromp wrote:
       | Closely related: discussion of the Milk-V Pioneer workstation
       | employing this chip [1].
       | 
       | [1] https://news.ycombinator.com/item?id=38553647
        
       | Razengan wrote:
       | I'm sure it's far from being a sophon.. :)
        
         | l3x4ur1n wrote:
         | Three body problem
        
       | rwmj wrote:
       | This particular chip is very slow, so definitely not. However
       | Sophgo are going to switch the cores to using SiFive P670 in the
       | next iteration (SG2380) and those cores have much faster single
       | thread performance.
        
         | justinclift wrote:
         | Is that still happening after SiFive let go a bunch of staff
         | recently?
         | 
         | https://news.ycombinator.com/item?id=37996295
        
           | davidlt wrote:
           | To my understanding SiFive continues to offer their selection
           | of core IP. Anyways, I would assume any existing contract
           | would have to be fulfilled for various legal reasons.
           | 
           | SG2042 itself is T-HEAD C920 design which is a mess, and
           | might not be even called a RISC-V compliant design. We are
           | kinda stuck with it existing and being used in various chips.
           | There are other design issues discovered IIRC (atomic might
           | not work properly [at least on the kernel side workarounds
           | required]; floating point failures in glibc testsuite because
           | FP not being compliant). SG2044 is scheduled for the next
           | year (2024). Not many details are known: 64 cores, 8 DDR
           | controllers, 3x memory bandwidth, vector v1.0 support, 2x PCIe
           | (unknown what that means, Gen3 -> Gen4? More lanes?). The
           | cores are unknown, but SG2380 is SiFive P670. T-HEAD has C908
           | that supports vectors v1.0 (and solves some other issues), but
           | that's a smaller core. Not a replacement for C910/C920.
        
             | justinclift wrote:
             | Thanks, that's good info. :)
        
           | photonbeam wrote:
           | This is dated after the layoffs, so presumably yes
           | 
           | https://www.sifive.com/press/sophgo-licenses-sifive-risc-v-p...
        
         | brucehoult wrote:
         | It's not _that_ slow.
         | 
         | It's about the same as the AWS Graviton 1 that went into their
         | data centres in 2019, except it's got 64 cores while the
         | Graviton has 16.
         | 
         | On tasks that can use all 64 cores (building software, web
         | serving, and yes HPC) it can have about the same total
         | performance as a current generation 16 core x86 that costs
         | about the same.
        
       | asystole wrote:
       | Does Betteridge's law apply here?
        
         | wongarsu wrote:
         | yes
        
       | sylware wrote:
       | What's very nice about 64-bit RISC-V: code in assembly once, run
       | it everywhere, almost literally... no absurdly and grotesquely
       | massive and complex compilers anywhere, no planned obsolescence,
       | feature creeps on computer language syntax nowhere to be found,
       | ultra stable in time, near 0 SDK.
        
         | magicalhippo wrote:
         | That's not really true though is it? A lot of the speed and
         | interesting bits for HPC would come in the form of ISA
         | extensions, and the paper even mentions challenges in this area
         | due to the chip only supporting version 0.7.1 of the RVV
         | vectorisation extension.
         | 
         | Regardless, I imagine in HPC you'll want to recompile anyway to
         | get the most bang for your buck, unless you're doing a short
         | run. Why throw away performance if you'll be running your code
         | for days or months?
        
           | sylware wrote:
           | Well, this is much more true than with classic computer
           | languages. That's more than enough when you look at the
           | disastrous stability of classic computer languages. Not even
           | reasonable with C, and I have strong suspicions about rust
           | syntax (it seems it became not much less complex and insane
           | than c++, wrong?). Most other real life computer language
           | syntaxes are hopeless if you are honest with yourself, and
           | that on cycles of even less than 10 years (blame ISO and
           | compiler extensions).
           | 
           | If one cannot reasonably write a naive but _real-life_
           | alternative compiler for a computer language syntax, which
           | would be _REALLY_ stable in time -> full frontal compiler
           | planned obsolescence and feature creep.
           | 
           | That's why risc-v has a very high potential: roughly
           | speaking, you exit the compilers, which is a very good thing.
           | 
           | That's why I wish RISC-V to succeed, we all know once some
           | code paths are assembly written, we get a very strong
           | independence from those absurdly and grotesquely massive and
           | complex compilers, and that with a worldwide, IP-lock-free
           | standard ISA: this is priceless.
           | 
           | I see risc-v compiler support as legacy support.
           | 
           | You can even get a nice middle ground with high level language
           | program interpreters written directly in 64-bit risc-v
           | assembly. Think about a python3 interpreter, a javascript
           | interpreter, lua, etc etc...
        
             | formerly_proven wrote:
             | Hey maybe you can get together with the grand unified pure
             | pipeline dataflow functions theory guy and make some kind
             | of HN self-help group.
        
               | sylware wrote:
               | That was more than a decade ago.
               | 
               | Nowadays you write pipeline-generic assembly, as a lot
               | happens at runtime in modern micro-archs. If some static
               | "optimizations" go in, it is mostly those that are likely
               | to be "true or benign" enough across most if not all
               | micro-archs, for instance cache line or code fetch window
               | alignment (and even that...), branching reduction, etc.
               | 
               | You can still have pipeline specific optimized assembly
               | code, usually just a matter of installing/branching to
               | the right assembly code path at runtime, hardly more.
               | 
               | We have to realize that with that, the entire insane (the
               | word is fair) cost and planned obsolescence of optimizing
               | compilers are literally... gone... and just for that, even
               | if some assembly code paths are a bit slower, oh god,
               | this is worth billions!
               | 
               | But there is a pitfall though: if the assembly code is
               | written using an ultra powerful macro preprocessor, "c++
               | grade" (you get the picture), this would be a complete
               | loss, as it is just displacing the core of the issue from
               | an optimizing compiler to an omega preprocessor.
        
             | magicalhippo wrote:
             | I'll admit I have a fever so perhaps that is why, but
             | you're not making much sense to me.
             | 
             | It's not like we haven't had ISAs spanning different core
             | architectures already. Take Intel's NetBurst (Pentium 4) vs
             | Core (Core 2 Duo, for example) architectures. Same ISA,
             | quite different optimization targets.
             | 
             | I don't see why RISC-V will be different in this regard.
             | 
             | Of course relying on higher-level languages with JIT
             | compilation is a thing, that's why Julia[1] exists. And with
             | RISC-V you do have those extensions that languages like
             | Julia could take advantage of. But it would have to be
             | implemented in Julia's compiler and libraries, there's no
             | free lunch.
             | 
             | [1]: https://julialang.org/
        
               | eigenspace wrote:
               | > But it would have to be implemented in Julia's compiler
               | and libraries, there's no free lunch.
               | 
               | Actually, for the most part Julia itself doesn't need to
               | be concerned at all with the ISA or hardware differences.
               | That's mostly LLVM's job. So yes, someone does need to
               | implement it, but many languages would get to benefit
               | from that work, not just Julia.
               | 
               | It may not be a free lunch, but you can pay for one lunch
               | and feed many mouths.
        
               | HybridCurve wrote:
               | I don't think it has anything to do with your fever- the
               | post looks like autogenerated garbage. Below they were
               | speaking about "planned obsolescence of compilers" which
               | makes no sense as many make efforts to support legacy
               | architectures.
        
         | bigbillheck wrote:
         | > no absurdly and grotesquely massive and complex compilers
         | anywhere
         | 
         | Absence of evidence is not evidence of absence, and anyway
         | there's not even an absence:
         | https://github.com/riscv-collab/riscv-gnu-toolchain
         | https://llvm.org/docs/RISCVUsage.html
         | 
         | > feature creeps on computer language syntax nowhere to be
         | found
         | 
         | At least one of us is very confused, and in case it's me, how
         | do language details matter to RISC-V?
        
         | znpy wrote:
         | you might do the same on x86 as long as you code using only the
         | instructions you'd find on a 286 or a 386 (if you feel fancy
         | you can assume you always have the x87 math coprocessor)
        
         | dezgeg wrote:
         | Yeah, no. I have a bad feeling that fragmentation will become a
         | mess in the RISC-V world. Just a few months ago Qualcomm
         | threatened to not support the existing C extension and proposed
         | their own 16-bit instruction set. Also another example of such
         | a mess from a sibling post by davidlt
         | (https://news.ycombinator.com/item?id=38591977):
         | 
         | > SG2042 itself is T-HEAD C920 design which is a mess, and
         | might not be even called a RISC-V compliant design. [...] There
         | are other design issues discovered IIRC (atomic might not work
         | properly [at least on the kernel side workarounds required];
         | floating point failures in glibc testsuite because FP not being
         | compliant)
         | 
         | Guess what in the above T-HEAD C920 situation would be the best
         | way to achieve "code once"? Add the workaround for the buggy
         | instructions in the compiler and then recompile, with no
         | changes.
        
           | photonbeam wrote:
           | I think once google picks their minimum riscv extension set
           | for android (soon?) then the rest of the vendors will aim at
           | that leaving everything else as obsolete
        
           | sylware wrote:
           | "Rome was not built in one day": Mistakes, debugging, fixing
           | _MUST_ happen.
           | 
           | And it is going to hurt. To get rid of the plague of
           | optimizing compilers, I am willing to pay that price.
           | 
           | And instead of having to fix assembly code generation in
           | those ultra-complex compilers, it is much more reasonable to
           | hook in an assembler, which is orders of magnitude saner than
           | dealing with any optimizing compiler.
        
       | tw1984 wrote:
       | you don't need a paper from a group of UK developers to
       | understand its performance & potential. The processor is made by
       | a Chinese startup, as China is fighting an all out semiconductor
       | war with the US, there is literally unlimited public & private
       | investments that could be poured into such a 64-core processor
       | if it is indeed remotely on par with the SOTA of x86. Any Chinese
       | company holding such crown jewel would spend billions of $ on PR
       | to get everyone in China to know its name to further milk more
       | investments.
       | 
       | As someone who lives in China and has been in the area for
       | decades, the only reason why I have not heard about it until
       | today is pretty obvious - it is a toy implementation no one cares
       | about. There is actually a dedicated term in Chinese describing
       | such junk - "Luo Hou Chan Neng", which means backward production
       | capacity.
       | 
       | They knew the expected performance, they knew it is going to be
       | laughed at by peers in the area, but they still did it for a good
       | reason - to just fool certain low IQ investors in China to get a
       | free ride on the whole RISC-V thing. Whoever is behind this
       | laughable release should really be ashamed - what is the next
       | move? glue together maybe 1024 of those 8051 "cores" and claim it
       | has built a supercomputer on a chip?
        
         | kragen wrote:
         | _you_ evidently _do_ need a paper from a group of uk developers
         | to understand that it is not junk; as the abstract explains,
         | it's 5-10 times as fast as other risc-v hardware, and even
         | performs better on some workloads than "the x86 high
         | performance CPUs under test", though it still lags behind them
         | by a factor of 4-8 on average
         | 
         | this is enormous progress, and much more convincing than
         | performance projections from simulations carried out by the
         | chip's own designers
        
         | brucehoult wrote:
         | Nothing laughable about it.
         | 
         | It's comparable on a core to core basis with the first AWS
         | Graviton server chip that was deployed in 2019, only four years
         | ago -- except this has 64 cores while the Graviton has 16.
         | 
         | It's also comparable to current x86 chips with 16 cores that
         | cost about the same but use a lot more electricity. Each core
         | is about 1/4 the speed of the x86, but you get four times more
         | of them, which is fine for many server or HPC workloads.
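         | 
         | A rough sketch of that arithmetic, assuming Amdahl's law and
         | illustrative numbers (per-core speed 0.25, 64 vs 16 cores; the
         | parallel fractions are made up, not taken from the paper):

```python
def throughput(cores: int, per_core_speed: float, parallel_fraction: float) -> float:
    """Amdahl's-law throughput relative to one reference core of speed 1.0.

    All inputs are illustrative assumptions, not measurements.
    """
    # The serial part runs on a single core at that core's speed.
    serial_time = (1.0 - parallel_fraction) / per_core_speed
    # The parallel part is spread evenly over all cores.
    parallel_time = parallel_fraction / (per_core_speed * cores)
    return 1.0 / (serial_time + parallel_time)


# Hypothetical chips: a 16-core x86 (speed 1.0 per core) and a 64-core
# RISC-V part whose cores are assumed to be a quarter as fast.
for p in (1.0, 0.99, 0.9):
    x86 = throughput(cores=16, per_core_speed=1.0, parallel_fraction=p)
    riscv = throughput(cores=64, per_core_speed=0.25, parallel_fraction=p)
    print(f"parallel fraction {p}: x86 {x86:.2f}, RISC-V {riscv:.2f}")
```

         | With a fully parallel workload the two come out even; any
         | serial portion favours the faster cores.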
        
       | tw1984 wrote:
       | Just realized that the chip is probably backed by Alibaba, it
       | uses the core built by Alibaba, that is the company behind most
       | of that fake stuff sold online. Its founder openly challenged
       | the financial system in China as he argued that his algorithm
       | targets and profits from the poorest more efficiently than
       | traditional banks!
       | 
       | Now everything can be explained. What can you expect from
       | Alibaba?
        
       | LoganDark wrote:
       | I think it's amazing that RISC-V is advancing at such a pace that
       | there are already RISC-V cores that are "only" ten times slower
       | than similarly-SotA x86 cores.
        
         | bee_rider wrote:
         | Considering that they are comparing against an Ice Lake Xeon
         | with AVX-512, it is pretty decent.
         | 
         | I mean the answer is basically no, it isn't ready. But x86 has
         | a pretty huge head start, RISC-V had to start somewhere.
        
       | ibobev wrote:
       | Sophon CPU? Is this a reference to "The Three-Body Problem"
       | novel?
        
         | izzydata wrote:
         | Those transistors better be down to the size of protons at
         | least.
        
           | ace2358 wrote:
           | If they're not that small, just fold up some dimensions until
           | it is.
        
           | thewakalix wrote:
           | I can guarantee they are at least as large as protons.
        
       | mesham wrote:
       | Hi all - I'm one of the authors of the paper so thought I would
       | post my thoughts. Firstly, thanks for the interest in the paper -
       | it's really nice to have this sort of discussion. As it's been
       | highlighted in the comments this is a workshop paper (the
       | workshop on RISC-V for HPC last month at SC) which allows us to
       | focus on some of the more practical aspects compared to, for
       | instance, a main-track conference or journal etc. Given the
       | availability of this 64-core RISC-V CPU we felt that it would be
       | interesting to, independent of the manufacturer, explore some of
       | the performance and try and answer the question around how
       | realistic a proposition this is for HPC workloads (I suppose
       | really trying to preempt questions from the HPC community around
       | whether this 64-core CPU moves us closer to RISC-V being a
       | realistic choice for HPC).
       | 
       | Obviously the numbers are in the paper so you can draw your own
       | conclusions, but we were pretty impressed by the results overall
       | - both in relation to the SG2042 itself and also more widely what
       | this means for RISC-V. The SG2042 isn't perfect (as has been
       | highlighted here, it only supports RVV 0.7.1 for instance), but
       | my feeling is that it's a significant step forward for the RISC-V
       | community.
       | 
       | For the SG2042 specifically, as it's been highlighted in these
       | comments, it is within the same order of magnitude (pretty much
       | anyway depending on which number(s) you look at) as well
       | established x86 high-performance CPUs (and we threw in an old
       | Sandy Bridge CPU too as a bit of a baseline). I think that's
       | pretty impressive, after all the SG2042 is a first-generation
       | RISC-V CPU from Sophon being compared against mature x86 CPUs. As
       | someone else has said they need to start somewhere and are now
       | building on this as illustrated by their roadmap. Furthermore,
       | something we didn't consider in the paper was price - this is a
       | tricky one as it can depend on where you are geographically with
       | exchange rate etc, but I think that the SG2042 is probably a fair
       | bit cheaper than some of the x86 CPUs we compared against too
       | (when they were new anyway).
       | 
       | What I think is pretty phenomenal here is the pace of change for
       | RISC-V more widely. At the start of 2023 the best commodity
       | available RISC-V hardware that we could get was the 4-core
       | VisionFive V2. As we show in the paper, each C920 core in the
       | SG2042 is quite a bit faster for the benchmarks than the U74 in
       | the V2, but also the SG2042 is providing 64 vs 4 cores. This is
       | within the space of 12 months, or so, and there seems to be a
       | whole load of new high performance RISC-V hardware planned for
       | general availability in 2024 (from a range of manufacturers
       | across the globe) including new CPUs and high-core count
       | accelerators (e.g. see the slides of the four vendor talks at the
       | workshop we presented this paper at
       | https://riscv.epcc.ed.ac.uk/community/sc23-workshop/ ). So I
       | think it's really interesting to see the trajectory here of
       | RISC-V to date and over the next few years to track whether this
       | pace continues (or even accelerates!)
       | 
       | My personal feeling is that unless it unlocks some very
       | significant new capabilities (which to be fair is possible),
       | using RISC-V CPUs instead of x86 CPUs in supercomputers will
       | probably be a tough sell in the short term. However, I think
       | there is a lot of potential on the accelerator side of things and
       | I suspect this is where we will start seeing RISC-V emerge for
       | HPC initially (and maybe by stealth where people are unaware that
       | their compute or in-network accelerators are leveraging RISC-V in
       | some way).
        
       | cardanome wrote:
       | I don't really get why people are so excited about RISC-V. An
       | open ISA doesn't really offer any additional freedoms to the
       | user. It does not mean we get truly open hardware.
       | 
       | This whole thing is just about Chinese companies not wanting to
       | pay for ARM licenses anymore. Which is good for them but I don't
       | get the excitement from tech people.
       | 
       | Even if a RISC-V CPU would someday offer the same performance for
       | the same price as an x86-64 counterpart, it would be a strictly
       | worse deal, as software support will be so much worse. The x86
       | monoculture was an amazing time to live in and helped us enjoy so
       | much backwards compatibility. The promised lower power
       | consumption might be nice but we will have to see how much the
       | ISA really matters for that.
        
         | afr0ck wrote:
         | From a tech-nerd point of view, here is the reason why I am so
         | excited about RISC-V. RISC-V means there is a possibility of
         | creating a completely open source implementation of an
         | industry-grade microprocessor, just like the Linux kernel did
         | for operating systems, democratising the business of building
         | advanced microprocessors. I can't work for Intel or Arm design
         | teams. Hardware is not my main job and the entry barrier to
         | those companies is so high (without mentioning age and
         | geographic barriers). However, I would love it if there were a
         | real, industrial, free and open implementation of a
         | microprocessor where you collaborate with smart people all over
         | the world to solve interesting problems and implement advanced
         | hardware designs and algorithms.
        
           | fwsgonzo wrote:
           | I feel the same way. I think RISC-V has changed my opinion on
           | the hardware aspect of things, and now for the first time I
           | want to try to create a simple CPU at home.
           | 
           | There is even a simpler RV32E spec, for those (like me) with
           | little hardware experience. Perhaps RV32E is reasonable to
           | start with?
           | 
           | https://five-embeddev.com/riscv-isa-manual/latest/rv32e.html
        
         | dcreater wrote:
         | It's so so important for gatekeepers not to exist for free
         | market capitalism to succeed. RISC-V is reinstating this state
         | of normalcy.
        
           | wwtrv wrote:
           | As much as free market capitalism can exist when China
           | completely dominates the RISC-V market.
        
       ___________________________________________________________________
       (page generated 2023-12-10 23:00 UTC)