[HN Gopher] Arm Announces Neoverse V1, N2 Platforms and CPUs, CM...
       ___________________________________________________________________
        
       Arm Announces Neoverse V1, N2 Platforms and CPUs, CMN-700 Mesh
        
       Author : timthorn
       Score  : 178 points
       Date   : 2021-04-27 13:21 UTC (9 hours ago)
        
 (HTM) web link (www.anandtech.com)
 (TXT) w3m dump (www.anandtech.com)
        
       | akmittal wrote:
       | Apple's M1 has made ARM mainstream for laptops. Let's see which
       | company does the same for the server space.
       | 
       | Hopefully ARM on cloud will result in cheaper prices.
        
         | lifty wrote:
         | Apple's M1 will make ARM mainstream on the server side.
        
           | leesalminen wrote:
           | Personally, I don't see many server admins choosing to pay
           | the Apple Tax to get M1 into their data center. I don't see
           | how the watt/performance ratio could pay off that kind of
           | tax.
        
             | lifty wrote:
             | I did not mean to imply that the actual M1 will be used in
             | data centers. Apple is quite popular among developers and
             | it's also a trendsetter, which will probably lead other
             | computer manufacturers to adopt ARM for personal computers.
             | So having more people use ARM on their personal computers
             | will lead to more ARM adoption in the data center.
        
               | leesalminen wrote:
               | Ah, ok, I understand. Thanks for clarifying!
        
               | wayneftw wrote:
               | > Apple is quite popular among developers...
               | 
               | The great majority of developers use Windows or Linux
               | according to every Stack Overflow survey from the past
               | ten years. Only ~25% use a Mac.
        
               | [deleted]
        
               | alireza94 wrote:
               | I believe interpreting statistics from those surveys in
               | this way isn't fair. There are so many developers around
               | the world but the pattern of value/money generation by
               | them is not uniform; in other words, a small percentage
               | of developers work for companies that pay the largest
               | share of server bills and penetration rate of macOS
               | devices among developers of top companies is probably
               | higher than average. (I'm not implying that developers
               | who work on non-macOS devices, make less value because
               | your device doesn't have - nearly - anything to do with
               | your impact. I'm just talking about a trend and possible
               | misinterpretation of data)
        
             | simondotau wrote:
             | The OP wasn't suggesting Apple M1 chips in the data centre,
             | but rather that Apple M1 chips in developer workstations
             | will disrupt the inertia of x64 dev -> x64 prod. It will be
             | easier for developers to choose ARM in production when
             | their local box is ARM.
        
           | Symmetry wrote:
           | Apple hasn't seemed interested historically. And the Nuvia
           | folks left Apple to found their company explicitly because
           | they thought an M1 style CPU core would do well in servers
           | but Apple wasn't interested in doing that.
        
             | mushufasa wrote:
             | it's not that apple will sell server chips. it's that
             | developers can locally work on arm which makes it easier to
             | deploy to servers. linus torvalds had a quote about this...
        
               | gregsadetsky wrote:
               | Linus' quote/post:
               | 
               | """ And the only way that changes is if you end up saying
               | "look, you can deploy more cheaply on an ARM box, _and
               | here's the development box you can do your work on_".
               | """
               | 
               | (emphasis in original)
               | 
               | https://www.realworldtech.com/forum/?threadid=183440&curpost...
               | 
               | Thanks, I did not know about this!
        
               | Symmetry wrote:
               | Ah, ok, that makes sense.
        
           | jamesfmilne wrote:
           | Apple have been hiring for Kubernetes and related roles. This
           | may well be for their own devops for Apple services.
           | 
           | However I'd be amazed if they don't release some kind of
           | managed service for running Swift code in the cloud. Caveat
           | emptor, though.
        
         | ageyfman wrote:
         | this is inevitable.
        
           | k__ wrote:
           | My operating systems teacher in 2001 was a total RISC fan and
           | always said it would eventually overtake CISC.
           | 
           | I guess he didn't expect it to take until well after his
           | retirement.
        
             | zamadatix wrote:
             | ARM today is probably more CISCy than what he considered
             | CISC in 2001.
        
               | fanf2 wrote:
               | The best analysis of RISC vs CISC is John Mashey's
               | classic Usenet comp.arch post,
               | https://www.yarchive.net/comp/risc_definition.html
               | 
               | There he analyses existing RISC and CISC architectures,
               | and counts various features of their instruction sets.
               | They clearly fall into distinct camps.
               | 
               | But!
               | 
               | Back then (mid 1990s) x86 was the least CISCy CISC, and
               | ARM was the least RISCy RISC.
               | 
               | However, Mashey's article was looking at arm32 which is
               | relatively weird; arm64 is more like a conventional RISC.
               | 
               | So if anything, arm is more RISC now than it was in 2001.
        
               | floatboth wrote:
               | AArch64 is load-store + fixed-instruction-length, which
               | is basically what "RISC" has come to mean in the modern
               | day. X86 in 2001 was already... not that :)
        
               | k__ wrote:
               | I always understood it as that too.
        
               | lofi_lory wrote:
               | Also, isn't x86 ISA just a translation layer today? I
               | thought on the metal, there is a RISC like architecture
               | these days anyway.
        
               | astrange wrote:
               | Not really, because the variable length instructions have
               | consequences - mostly good ones because they fit in
               | memory better.
               | 
               | Also, the complex memory operands can be executed
               | directly because you can add more ALUs inside the
               | load/store unit. ARM also has more types of memory
               | operands than a traditional RISC (which was just whatever
               | MIPS did.)
        
               | nonameiguess wrote:
               | In most cases, yes, but it doesn't get rid of the
               | complexity for compiler backends that can't directly
               | target the real instruction sets Intel uses and have to
               | target the compatibility shim layer instead.
        
               | monocasa wrote:
               | Eh, it has a lot of instructions, but that was only the
               | surface of RISC. It's a deeper design philosophy than
               | that.
        
         | ericbarrett wrote:
         | I nominate Amazon for this award. As mentioned in another
         | comment here, ~50% of newly allocated EC2 instances are ARM.
        
       | klelatti wrote:
       | > AWS Graviton2-based EC2 Instances make up 14% of the installed
       | base within AWS
       | 
       | > 49% of AWS EC2 instance additions in 2020 are based on
       | Graviton2
       | 
       | Surprised at this level of Graviton2 adoption in AWS at this
       | stage. Any clues as to who is using these instances?
       | 
       | Edit: Presumably Intel's shrinking Q1 2021 Data Center revenues
       | are partly as a result of this.
        
         | KuiN wrote:
         | They're the cheapest EC2 instance type, so they're very
         | attractive to small scale deployments like side projects,
         | personal sites etc. (basically anything that can run on one or
         | two small nodes) where budget is a major concern. The t4g.micro
         | is in the free tier as well, so that'll help.
         | 
         | I host a few very low traffic sites & I'm in the process of
         | switching from a basic DO Droplet to a pair of low-end
         | Gravitons. Will save me money and give better peak performance
         | for my workloads.
        
           | ac29 wrote:
           | > switching from a basic DO Droplet to a pair of low-end
           | Gravitons. Will save me money and give better peak
           | performance for my workloads.
           | 
           | I'm having trouble figuring this out - a t4g.micro is
           | $6/month, before any storage or data transfer costs. The
           | roughly equivalent DO offering is $5/month, inclusive of 25GB
           | SSD and 1TB transfer. Even with a reserve instance discount
           | and significantly less than 1TB outbound transfer, DO seems
           | likely to be cheaper.
        
             | floatboth wrote:
             | CPU power on $5 offerings from others is likely not as
             | great. Also AWS did a free tier for everyone, and the spot
             | market is fun...
        
           | klelatti wrote:
           | Maybe, but it would take a _lot_ of people moving small
           | deployments (where by definition the savings would be small
           | especially relative to the fixed costs of getting to work on
           | Arm) in a relatively short space of time to have this impact
           | - so I'm sceptical (and if it is then it must be very easy to
           | move to Arm - which I'm also sceptical of).
           | 
           | More likely some very big customers (peer comment mentions
           | Twitter) moving to Graviton2 for cost savings.
        
             | tyingq wrote:
             | Graviton might be the top and/or default choice in their
             | management console when you create an EC2 instance. That
             | would swing things pretty quickly for all the free tier
             | folks.
             | 
             | Edit: Nope, not yet, but close..you still have to change
             | the radio button: https://imgur.com/a/W0Sweyy
        
               | klelatti wrote:
               | I'm confused - the x86 box is ticked by default there.
        
               | tyingq wrote:
               | Yes...edited my edit. I'm pretty sure there was no radio
               | button for some time, you would have had to scroll into
               | other choices to get a Graviton instance.
        
         | vosper wrote:
         | The company I work for has migrated hundreds of heavily-
         | utilised Elasticsearch and Storm nodes to Graviton. No
         | performance issues, pure cost saving. We're working on the rest
         | of our systems now. We're going to save hundreds of thousands
         | of dollars over the next few years.
        
         | TheGuyWhoCodes wrote:
         | RDS supports Graviton2 as an instance type, so maybe people on
         | supported versions just migrated.
        
         | speedgoose wrote:
         | AWS itself is probably the main user I would guess. So you use
         | them indirectly through AWS' vendor lock-in APIs.
        
         | ksec wrote:
         | >Presumably Intel's shrinking Q1 2021 Data Center revenues are
         | partly as a result of this.
         | 
         | It was both AMD and ARM.
         | 
         | There are many workloads where G2 offers an immediate cost /
         | performance advantage. AWS charges per vCPU, which is one
         | _thread_ on Intel/AMD and one _core_ on ARM. So you get ~30%
         | performance improvement along with a ~30% lower cost for using
         | the ARM Graviton series. Most adopters have reported a total of
         | 50% reduction in cost. For those who have hundreds if not
         | thousands of EC2 instances running workloads that fit that
         | advantage, this is too much saving to pass up.
         | 
         | Many SaaS companies running on EC2 have mentioned their
         | success on Twitter and various other places.
         | 
         | Worth pointing out, this is with Amazon installing as many as
         | they get from TSMC.
         | 
         | A few months ago on HN I wrote [1] about how half of the Intel
         | DC market will be gone in a few years time.
         | 
         | Edit: Another point worth mentioning: this is as much of a
         | threat to medium and smaller clouds like Linode and DO,
         | which don't have access to ARM (yet). And even when they
         | get it, Amazon has the cost advantage of building its own
         | instead of buying from a company (Ampere).
         | 
         | [1] https://news.ycombinator.com/item?id=25808856
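
As a rough check on the figures in the comment above: if an ARM vCPU costs ~30% less and delivers ~30% more performance, the cost per unit of work falls by a bit under half, which is in the neighbourhood of the ~50% savings mentioned. A minimal Python sketch, assuming both ~30% figures apply to the same workload (the function name is made up for illustration):

```python
def cost_per_work_reduction(perf_gain, price_cut):
    """Fractional drop in cost per unit of work.

    perf_gain: relative performance improvement per vCPU (0.30 = +30%).
    price_cut: relative price reduction per vCPU (0.30 = -30%).
    """
    relative_price = 1.0 - price_cut   # new price / old price
    relative_perf = 1.0 + perf_gain    # new throughput / old throughput
    return 1.0 - relative_price / relative_perf


# The ~30% / ~30% figures are the ones quoted in the comment above.
reduction = cost_per_work_reduction(0.30, 0.30)
print(f"cost per unit of work drops by {reduction:.1%}")
```

With these inputs the result is about 46%, so a reported 50% saving would need slightly better than a 30% gain on one of the two axes.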
        
           | qzw wrote:
           | Linode and DO could always offer a physical x86 core instead
           | of a virtual SMT core. It would cut into margins somewhat,
           | but maybe Intel and AMD would be more willing to discount
           | when they have to play defense. I think one problem for the
           | x86 guys is that because the demand for chips far exceeds
           | supply, they're still doing "fine" or even "well" right now.
           | So the threat from ARM may still be perceived on mostly an
           | intellectual level instead of provoking the necessary
           | visceral survival response.
        
             | klelatti wrote:
             | I think it's unlikely that anyone at Intel thinks -20% Q1
             | datacenter revenue YOY is doing well.
             | 
             | The question is what options do they have to deal with
             | this?
        
         | eqvinox wrote:
         | "instance additions" also doesn't take instance
         | size/performance into account. If ARM-based instances are
         | overall smaller, that'd allow more of them, distorting the
         | numbers...
         | 
         | Percentage of compute power would be cool to know here.
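
To illustrate the distortion described above, here is a small Python sketch that compares a share measured by instance count with a share measured by vCPUs. All numbers are invented for illustration only, and vCPU count is just a crude stand-in for compute power:

```python
# Hypothetical instance additions: (architecture, vCPUs per instance, count).
# These figures are invented purely to illustrate the distortion.
ADDITIONS = [
    ("arm", 2, 490),
    ("x86", 16, 510),
]


def share_by_count(rows, arch):
    """Fraction of added instances that use the given architecture."""
    total = sum(count for _, _, count in rows)
    matching = sum(count for a, _, count in rows if a == arch)
    return matching / total


def share_by_vcpus(rows, arch):
    """Fraction of added vCPUs that belong to the given architecture."""
    total = sum(vcpus * count for _, vcpus, count in rows)
    matching = sum(vcpus * count for a, vcpus, count in rows if a == arch)
    return matching / total


arm_count_share = share_by_count(ADDITIONS, "arm")
arm_vcpu_share = share_by_vcpus(ADDITIONS, "arm")
print(f"by count: {arm_count_share:.0%}, by vCPUs: {arm_vcpu_share:.0%}")
```

In this made-up fleet, ARM is 49% of the additions by count but only about 11% by vCPUs, which is why the per-instance figure alone says little about compute share.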
        
         | [deleted]
        
         | sleepy_keita wrote:
         | I wouldn't be surprised if AWS is using Graviton2 pretty
         | heavily for internal processes as well, stuff like control
         | planes for the major services like S3, SQS, SNS, etc...
        
         | StreamBright wrote:
         | I know many users. Basically any non-x86 workload that is cost
         | sensitive can benefit from moving to arm instances. Database
         | instances are good candidates, big data workloads as well.
        
         | baybal2 wrote:
         | I believe Intel couldn't have imagined with what ease their
         | biggest customers can turn into their biggest competitors
         | overnight.
         | 
         | Even a decade ago that would've been unthinkable, but today,
         | making a cookiecutter SoC is relatively easy because nearly
         | everything can be taken off the shelf.
         | 
         | Production costs though.... sub-10nm mask set costs completely
         | rule out anything resembling a startup competing in this area.
         | 
         | I think 65nm was the last golden opportunity to jump on the
         | departing train. It was still possible to ship a cookie cutter
         | chip under $1m, now... no way.
         | 
         | Now, Semi industry is basically Airbus vs. Boeing.
        
           | sitkack wrote:
           | Startups can absolutely compete here. There is sufficient
           | capital to fund chip design (integration) and it is
           | relatively low risk. We are going to see a huge number of Arm
           | and RISC-V solutions on the market 14 months from now.
        
             | [deleted]
        
           | klelatti wrote:
           | Agreed .. except weren't Ampere (2017) and Nuvia (2019)
           | startups?
        
             | floatboth wrote:
             | Ampere kinda was an acquisition of Applied Micro, but its
             | internal X-Gene uarch was dropped into the trash very
             | quickly in favor of Arm Neoverse...
        
             | baybal2 wrote:
             | Ampere basically started as a re-labeled XGene from Applied
             | Micro which started back in 40nm days. And they came with
             | quite some cash to start with: their backer is Carlyle
             | Group, the biggest LBO shop in the world.
             | 
             | Nuvia basically never intended to really compete with Intel or
             | AMD head-on. Their $30m stash would've been just enough
             | for a single "leap of faith" tapeout on a generation old
             | node, and a year of life support after.
             | 
             | They were aiming for a quick sell from the start too.
        
               | klelatti wrote:
               | Depends on your definition of startup I guess. Certainly
               | seems to be enough capital available.
               | 
               | I definitely don't agree with the premise that it's Boeing
               | vs Airbus now (certainly less so than it was a few
               | years ago when x86 was the only game in town).
        
           | aliswe wrote:
           | Cookie cutter?
        
           | jleahy wrote:
           | Do you actually know how much a sub-10nm mask set costs?
           | There's a lot of speculation from people who don't have
           | access to those numbers. Those who do are bound by NDAs.
        
             | baybal2 wrote:
             | I do hear figures in single megabucks for relatively small
             | tapeouts.
             | 
             | Back in the 65nm, 40nm days, big tapeouts were already
             | costing high six figures in masks.
             | 
             | And... masks are not the most expensive items on the
             | signoff costs these days.
             | 
             | Specialist verification, outsourced synthesis, layout,
             | analog, physical, test, and other specialist services will
             | easily cost more than the maskset for <40nm.
             | 
             | I would not be surprised if tier 1 fabless already spend
             | $10m+ per design just on them.
        
         | phamilton wrote:
         | We haven't migrated yet but we expect to do some benchmarking
         | this quarter for Aurora.
         | 
         | For EC2 we run on spot and spot c5.metal are cheaper per vcpu
         | than c6g.metal, so we haven't prio'd benchmarking our compute
         | loads.
        
         | [deleted]
        
         | somethingwitty1 wrote:
         | There have been a bunch of higher-profile, "we moved to
         | Graviton2 and cut costs". Twitter, for example, migrated:
         | https://www.hpcwire.com/off-the-wire/twitter-selects-aws-and...
        
           | neuronexmachina wrote:
           | Also Netflix, although I don't think they've said what
           | portion of their instances they've migrated:
           | https://aws.amazon.com/ec2/graviton/customers/
        
         | carlosf wrote:
         | AWS's own offerings such as RDS and internal control plane
         | stuff are very likely using ARM behind the scenes.
         | 
         | I have evaluated going ARM, but I ended up deciding the savings
         | were not worth it.
         | 
         | Not only do you need to maintain 2 archs simultaneously for
         | some time, but porting some stuff to ARM (e.g. Python) can be a
         | pain in the ass.
         | 
         | Finally, my devs work in AMD64 and that would be another source
         | for "why does this work in dev but not prod".
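
One low-cost way to surface the dev/prod mismatch described above is to have a script or test compare the local machine's architecture with the deployment target. A minimal Python sketch, assuming a hypothetical `TARGET_ARCH` setting; the alias table is an illustrative assumption and is not exhaustive:

```python
import platform

# Common spellings of the two architectures discussed in this thread.
# This table is an illustrative assumption, not an exhaustive list.
_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}

# Hypothetical deployment target; in practice this would come from config.
TARGET_ARCH = "arm64"


def normalize_arch(machine):
    """Map an architecture string (e.g. from platform.machine()) to a canonical name."""
    name = machine.lower()
    return _ALIASES.get(name, name)


def arch_mismatch(target, local=None):
    """Return True when the local architecture differs from the target."""
    local_name = normalize_arch(local if local is not None else platform.machine())
    return local_name != normalize_arch(target)


if arch_mismatch(TARGET_ARCH):
    print(f"warning: developing on {platform.machine()}, deploying to {TARGET_ARCH}")
```

This only flags the mismatch; it does nothing about architecture-specific bugs or missing prebuilt packages, which still need testing on the target architecture.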
        
           | loudmax wrote:
           | > Finally, my devs work in AMD64 and that would be another
           | source for "why does this work in dev but not prod".
           | 
           | I can see a use case for building a CI/CD pipeline on
           | Raspberry Pi's.
        
       | mwcampbell wrote:
       | I hope that ARM servers with reasonable specs won't be exclusive
       | to AWS and the other hyperscalers. For example, it would be nice
       | if OVH would offer ARM-based dedicated servers.
        
       | simondotau wrote:
       | You can tell it's a modern CPU because its name matches the
       | [A-Z][0-9] pattern.
        
       | ksec wrote:
       | Tl;dr
       | 
       | V1 = Slightly tweaked ARM Cortex X1 with SVE ( Used on Snapdragon
       | 888 ) on 7nm aiming at ~4W per Core.
       | 
       | N2 = New Cortex with ARMv9, ~40% IPC improvement over N1 or 10%
       | lower than V1, SVE _2_, 5nm aiming at ~2W per Core. With Similar
       | die size to N1. ( I fully expect Amazon to go 128 Core with their
       | N2 Graviton )
       | 
       | So in case anyone is wondering, no, it is not Apple M1 level. Not
       | anywhere close.
       | 
       | CMN-700 = More Cores and support of Memory partitioning,
       | important for VMs.
        
         | paulpan wrote:
         | Nice TLDR!
         | 
         | It's apples and oranges comparison to Apple M1 chips (server
         | vs. consumer) but does hint at what's possible with the next
         | generation ARM Cortex "X2" cores, that could appear in next
         | year's flagship smartphones and laptops. A 30-40% IPC jump,
         | partly due to moving to 5nm fabrication process, is huge.
         | 
         | Given the right implementation, namely squeezing more big cores
         | than the current 1-3-4 configuration, it could close the gap
         | considerably with Apple.
        
           | plekter wrote:
           | Process node changes generally don't do anything for IPC -
           | those are generally rather due to microarchitecture
           | improvements, so I doubt the move to 5nm has anything to do
           | with the IPC gain..?
        
             | wmf wrote:
             | The node shrink lets you afford more transistors that
             | provide more IPC.
        
               | plekter wrote:
               | I agree with that - but if you take an unchanged core and
               | manufacture it at a different node, then you won't see a
               | change in IPC, which in my book makes it questionable to
               | attribute IPC gains to the process node.
        
         | andrewcchen wrote:
         | The cores don't serve the same purpose as the M1 cores. M1 is
         | optimized for single thread at the cost of die size (and a bit
         | of power). I don't have exact numbers, but say the apple M1
         | core takes 1.5x the die area of N2, then you'd get better
         | performance by putting in 1.5x the number of N2 cores.
        
       | MangoCoffee wrote:
       | this is good. if ARM takes off for consumers and business, i'm
       | hoping RISC-V will get some traction too. there are alternatives
       | to Intel's x86.
        
       | truth_seeker wrote:
       | Hah. I like it when I can enjoy my hammock instead of fine tuning
       | my code to weird limits for performance.
       | 
       | DDR5,PCIE 5.0, SVE speedup and 40% IPC improvement put a big
       | smile on my face.
        
       | eqvinox wrote:
       | Funnily coincident timing: current post #2 (GCC 11.1 released)
       | adds support for the CPUs mentioned here (currently post #4):
       | 
       | AArch64 & arm
       | 
       | A number of new CPUs are supported through arguments to the
       | -mcpu and -mtune options in both the arm and aarch64 backends
       | (GCC identifiers in parentheses):
       | 
       | - Arm Cortex-A78 (cortex-a78).
       | - Arm Cortex-A78AE (cortex-a78ae).
       | - Arm Cortex-A78C (cortex-a78c).
       | - Arm Cortex-X1 (cortex-x1).
       | - Arm Neoverse V1 (neoverse-v1).
       | - Arm Neoverse N2 (neoverse-n2).
       | 
       | Good to see work going into this at the proper times. (Not that
       | that was much of a problem for CPU cores in recent times. Still
       | not a matter of course though.)
        
         | floatboth wrote:
         | These tunings will only be used if you compile stuff yourself
         | with -march=native (or specifying one particular model). Most
         | software out there would be compiled with generic non-tuned
         | optimizations. The tuning is rarely a huge deal though.
        
           | sitkack wrote:
           | That is not entirely true. Binaries in the packaging systems
           | might not be compiled for the most recent atomic instructions
           | which can really affect performance.
           | 
           | https://blog.dbi-services.com/aws-postgresql-on-graviton2-aa...
           | 
           | https://github.com/microsoft/STL/issues/488
           | 
           | We are about 9-14 months away from the right pieces making
           | their way through the software ecosystems where this will be
           | almost a non-issue.
           | 
           | Exciting times for everyone!
        
           | eqvinox wrote:
           | True, but it's still relevant for 3 things:
           | 
           | - when you have a particularly CPU-intensive application,
           | you'd hopefully compile it to target your system
           | 
           | - the cloud providers can just do a custom Debian/Ubuntu/...
           | build for their zillions of identical systems
           | 
           | - the library loading mechanism on Linux is slowly getting
           | support for having multiple compile variants of a library
           | packaged into different subdirectories of /lib (e.g.
           | "/usr/lib64/tls/haswell/x86_64")
           | 
           | Also I was mostly trying to point out as a positive how well
           | the interaction is working there between ARM and the GCC
           | project. I wish it were like this for other types of silicon.
           | 
           | (CPU vendors all seem to be getting this right, and GPUs are
           | slowly getting there, but much other silicon is horrible...
           | e.g. wifi chips)
        
       | up6w6 wrote:
       | Does anyone know if these processors make mobile development
       | easier? I mean, it's the same architecture now, right?
       | 
       | The only thing I could find is
       | https://www.genymotion.com/blog/just-launched-arm-native-and...
        
         | jayd16 wrote:
         | Not really. Maybe the emulators are faster but the main
         | languages are managed anyway.
        
           | danudey wrote:
           | Plus for iOS, even the idevice simulator just builds for
           | whatever platform you're testing on.
        
       | potlee wrote:
       | I wonder why they didn't use AWS-wide numbers rather than just
       | EC2. I would have thought EC2 would lag in the transition while
       | AWS services would make the switch quickly
        
         | dogma1138 wrote:
         | Because EC2 represents a more realistic picture of market
         | adoption; it's more important to know whether you can run the
         | software of your choice on ARM than whether Amazon can develop
         | a service on an ARM stack.
        
       ___________________________________________________________________
       (page generated 2021-04-27 23:00 UTC)