[HN Gopher] Arm Announces Neoverse V1, N2 Platforms and CPUs, CM...
___________________________________________________________________
Arm Announces Neoverse V1, N2 Platforms and CPUs, CMN-700 Mesh
Author : timthorn
Score : 178 points
Date : 2021-04-27 13:21 UTC (9 hours ago)
(HTM) web link (www.anandtech.com)
(TXT) w3m dump (www.anandtech.com)
| akmittal wrote:
| Apple's M1 has made ARM mainstream for laptops. Let's see which
| company does the same for the server space.
|
| Hopefully ARM on cloud will result in cheaper prices.
| lifty wrote:
| Apple's M1 will make ARM mainstream on the server side.
| leesalminen wrote:
| Personally, I don't see many server admins choosing to pay
| the Apple Tax to get M1 into their data center. I don't see
| how the watt/performance ratio could pay off that kind of
| tax.
| lifty wrote:
| I did not mean to imply that the actual M1 will be used in
| data centers. Apple is quite popular among developers and
| it's also a trendsetter, which will probably lead other
| computer manufacturers to adopt ARM for personal computers.
| So having more people use ARM on their personal computers
| will lead to more ARM adoption in the data center.
| leesalminen wrote:
| Ah, ok, I understand. Thanks for clarifying!
| wayneftw wrote:
| > Apple is quite popular among developers...
|
| The great majority of developers use Windows or Linux
| according to every Stack Overflow survey from the past
| ten years. Only ~25% use a Mac.
| [deleted]
| alireza94 wrote:
| I believe interpreting statistics from those surveys in
| this way isn't fair. There are so many developers around
| the world but the pattern of value/money generation by
| them is not uniform; in other words, a small percentage
| of developers work for companies that pay the largest
| share of server bills and penetration rate of macOS
| devices among developers of top companies is probably
| higher than average. (I'm not implying that developers
| who work on non-macOS devices, make less value because
| your device doesn't have - nearly - anything to do with
| your impact. I'm just talking about a trend and possible
| misinterpretation of data)
| simondotau wrote:
| The OP wasn't suggesting Apple M1 chips in the data centre,
| but rather that Apple M1 chips in developer workstations
| will disrupt the inertia of x64 dev -> x64 prod. It will be
| easier for developers to choose ARM in production when
| their local box is ARM.
| Symmetry wrote:
| Apple hasn't seemed interested historically. And the Nuvia
| folks left Apple to found their company explicitly because
| they thought an M1 style CPU core would do well in servers
| but Apple wasn't interested in doing that.
| mushufasa wrote:
| it's not that apple will sell server chips. it's that
| developers can locally work on arm which makes it easier to
| deploy to servers. linus torvalds had a quote about this...
| gregsadetsky wrote:
| Linus' quote/post:
|
| """ And the only way that changes is if you end up saying
| "look, you can deploy more cheaply on an ARM box, _and
| here 's the development box you can do your work on_".
| """
|
| (emphasis in original)
|
| https://www.realworldtech.com/forum/?threadid=183440&curpost...
|
| Thanks, I did not know about this!
| Symmetry wrote:
| Ah, ok, that makes sense.
| jamesfmilne wrote:
| Apple have been hiring for Kubernetes and related roles. This
| may well be for their own devops for Apple services.
|
| However I'd be amazed if they don't release some kind of
| managed service for running Swift code in the cloud. Caveat
| emptor, though.
| ageyfman wrote:
| this is inevitable.
| k__ wrote:
| My operating systems teacher in 2001 was a total RISC fan and
| always said it would eventually overtake CISC.
|
| I guess he didn't expect it to take until well after his
| retirement.
| zamadatix wrote:
| ARM today is probably more CISCy than what he considered
| CISC in 2001.
| fanf2 wrote:
| The best analysis of RISC vs CISC is John Mashey's
| classic Usenet comp.arch post,
| https://www.yarchive.net/comp/risc_definition.html
|
| There he analyses existing RISC and CISC architectures,
| and counts various features of their instruction sets.
| They clearly fall into distinct camps.
|
| But!
|
| Back then (mid 1990s) x86 was the least CISCy CISC, and
| ARM was the least RISCy RISC.
|
| However, Mashey's article was looking at arm32 which is
| relatively weird; arm64 is more like a conventional RISC.
|
| So if anything, arm is more RISC now than it was in 2001.
| floatboth wrote:
| AArch64 is load-store + fixed-instruction-length, which
| is basically what "RISC" has come to mean in the modern
| day. X86 in 2001 was already... not that :)
| k__ wrote:
| I always understood it as that too.
| lofi_lory wrote:
| Also, isn't x86 ISA just a translation layer today? I
| thought on the metal, there is a RISC like architecture
| these days anyway.
| astrange wrote:
| Not really, because the variable length instructions have
| consequences - mostly good ones because they fit in
| memory better.
|
| Also, the complex memory operands can be executed
| directly because you can add more ALUs inside the
| load/store unit. ARM also has more types of memory
| operands than a traditional RISC (which was just whatever
| MIPS did.)
| nonameiguess wrote:
| In most cases, yes, but it doesn't get rid of the
| complexity for compiler backends that can't directly
| target the real instruction sets Intel uses and have to
| target the compatibility shim layer instead.
| monocasa wrote:
| Eh, it has a lot of instructions, but that was only the
| surface of RISC. It's a deeper design philosophy than
| that.
| ericbarrett wrote:
| I nominate Amazon for this award. As mentioned in another
| comment here, ~50% of newly allocated EC2 instances are ARM.
| klelatti wrote:
| > AWS Graviton2-based EC2 Instances make up 14% of the installed
| base within AWS
|
| > 49% of AWS EC2 instance additions in 2020 are based on
| Graviton2
|
| Surprised at this level of Graviton2 adoption in AWS at this
| stage. Any clues as to who is using these instances?
|
| Edit: Presumably Intel's shrinking Q1 2021 Data Center revenues
| are partly as a result of this.
| KuiN wrote:
| They're the cheapest EC2 instance type, so they're very
| attractive to small scale deployments like side projects,
| personal sites etc. (basically anything that can run on one or
| two small nodes) where budget is a major concern. The t4g.micro
| is in the free tier as well, so that'll help.
|
| I host a few very low traffic sites & I'm in the process of
| switching from a basic DO Droplet to a pair of low-end
| Gravitons. Will save me money and give better peak performance
| for my workloads.
| ac29 wrote:
| > switching from a basic DO Droplet to a pair of low-end
| Gravitons. Will save me money and give better peak
| performance for my workloads.
|
| I'm having trouble figuring this out - a t4g.micro is
| $6/month, before any storage or data transfer costs. The
| roughly equivalent DO offering is $5/month, inclusive of 25GB
| SSD and 1TB transfer. Even with a reserve instance discount
| and significantly less than 1TB outbound transfer, DO seems
| likely to be cheaper.
| floatboth wrote:
| CPU power on $5 offerings from others is likely not as
| great. Also AWS did a free tier for everyone, and the spot
| market is fun...
| klelatti wrote:
| Maybe, but it would take a _lot_ of people moving small
| deployments (where by definition the savings would be small
| especially relative to the fixed costs of getting to work on
| Arm) in a relatively short space of time to have this impact
| - so I'm sceptical (and if it is then it must be very easy to
| move to Arm - which I'm also sceptical of).
|
| More likely some very big customers (peer comment mentions
| Twitter) moving to Graviton2 for cost savings.
| tyingq wrote:
| Graviton might be the top and/or default choice in their
| management console when you create an EC2 instance. That
| would swing things pretty quickly for all the free tier
| folks.
|
| Edit: Nope, not yet, but close..you still have to change
| the radio button: https://imgur.com/a/W0Sweyy
| klelatti wrote:
| I'm confused - the x86 box is ticked by default there.
| tyingq wrote:
| Yes...edited my edit. I'm pretty sure there was no radio
| button for some time, you would have had to scroll into
| other choices to get a Graviton instance.
| vosper wrote:
| The company I work for has migrated hundreds of heavily-
| utilised Elasticsearch and Storm nodes to Graviton. No
| performance issues, pure cost saving. We're working on the rest
| of our systems now. We're going to save hundreds of thousands
| of dollars over the next few years.
| TheGuyWhoCodes wrote:
| RDS supports Graviton2 as an instance type; maybe people on
| supported versions just migrated.
| speedgoose wrote:
| AWS itself is probably the main user I would guess. So you use
| them indirectly through AWS' vendor lock APIs.
| ksec wrote:
| >Presumably Intel's shrinking Q1 2021 Data Center revenues are
| partly as a result of this.
|
| It was both AMD and ARM.
|
| There are many workloads where G2 offers an immediate cost /
| performance advantage. AWS charges per vCPU, which is one
| _thread_ on Intel/AMD and one _core_ on ARM. So you get a ~30%
| performance improvement along with a ~30% lower cost for using
| the ARM Graviton series. Most users have reported a total of 50%
| reduction in cost. For those running hundreds if not thousands
| of EC2 instances on workloads that fit that advantage, this is
| too much of a saving to pass up.
|
| There are many SaaS companies running on EC2 that have mentioned
| their success on Twitter and in various other places.
|
| Worth pointing out, this is with Amazon installing as many as
| they get from TSMC.
|
| A few months ago on HN I wrote [1] about how half of the Intel
| DC market will be gone in a few years time.
|
| Edit: Another point worth mentioning: this is just as much of a
| threat to medium and smaller clouds like Linode and DO, which
| don't have access to ARM (yet). And even when they get it,
| Amazon has the cost advantage of building its own instead of
| buying from a company (Ampere).
|
| [1] https://news.ycombinator.com/item?id=25808856
| qzw wrote:
| Linode and DO could always offer a physical x86 core instead
| of a virtual SMT core. It would cut into margins somewhat,
| but maybe Intel and AMD would be more willing to discount
| when they have to play defense. I think one problem for the
| x86 guys is that because the demand for chips far exceeds
| supply, they're still doing "fine" or even "well" right now.
| So the threat from ARM may still be perceived on mostly an
| intellectual level instead of provoking the necessary
| visceral survival response.
| klelatti wrote:
| I think it's unlikely that anyone at Intel thinks -20% Q1
| datacenter revenue YOY is doing well.
|
| The question is what options do they have to deal with
| this?
| eqvinox wrote:
| "instance additions" also doesn't take instance
| size/performance into account. If ARM-based instances are
| overall smaller, that'd allow more of them, distorting the
| numbers...
|
| Percentage of compute power would be cool to know here.
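|
| A small sketch of why the two percentages can diverge. The
| fleet below is entirely hypothetical (the instance sizes and
| counts are invented, not AWS data); it only shows that a 49%
| share of instance additions can be a much smaller share of
| added vCPUs if the ARM instances are smaller on average.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Addition:
    arch: str    # "arm" or "x86"
    vcpus: int   # vCPUs per instance
    count: int   # number of instances added


def shares(additions: List[Addition], arch: str) -> Tuple[float, float]:
    """Return (share of instances, share of vCPUs) for one architecture."""
    total_count = sum(a.count for a in additions)
    total_vcpus = sum(a.count * a.vcpus for a in additions)
    arch_count = sum(a.count for a in additions if a.arch == arch)
    arch_vcpus = sum(a.count * a.vcpus for a in additions if a.arch == arch)
    return arch_count / total_count, arch_vcpus / total_vcpus


# Hypothetical numbers: many small ARM instances, fewer but larger x86 ones.
additions = [
    Addition("arm", vcpus=2, count=49),
    Addition("x86", vcpus=8, count=51),
]
by_count, by_vcpu = shares(additions, "arm")
print(f"ARM share by instance count: {by_count:.0%}, by vCPUs: {by_vcpu:.0%}")
```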
| [deleted]
| sleepy_keita wrote:
| I wouldn't be surprised if AWS is using Graviton2 pretty
| heavily for internal processes as well, stuff like control
| planes for the major services like S3, SQS, SNS, etc...
| StreamBright wrote:
| I know many users. Basically any non-x86 workload that is cost
| sensitive can benefit from moving to arm instances. Database
| instances are good candidates, big data workloads as well.
| baybal2 wrote:
| I believe Intel couldn't have imagined with what ease their
| biggest customers can turn into their biggest competitors
| overnight.
|
| Even a decade ago that would've been unthinkable, but today,
| making a cookiecutter SoC is relatively easy because nearly
| everything can be taken off the shelf.
|
| Production costs though.... sub-10nm mask set costs completely
| rule out anything resembling a startup competing in this area.
|
| I think 65nm was the last golden opportunity to jump on the
| departing train. It was still possible to ship a cookie cutter
| chip under $1m, now... no way.
|
| Now, Semi industry is basically Airbus vs. Boeing.
| sitkack wrote:
| Startups can absolutely compete here. There is sufficient
| capital to fund chip design (integration) and it is
| relatively low risk. We are going to see a huge number of Arm
| and RISC-V solutions on the market 14 months from now.
| [deleted]
| klelatti wrote:
| Agreed .. except weren't Ampere (2017) and Nuvia (2019)
| startups?
| floatboth wrote:
| Ampere kinda was an acquisition of Applied Micro, but its
| internal X-Gene uarch was dropped into the trash very
| quickly in favor of Arm Neoverse...
| baybal2 wrote:
| Ampere basically started as a re-labeled XGene from Applied
| Micro which started back in 40nm days. And they came with
| quite some cash to start with: their backer is Carlyle
| Group, the biggest LBO shop in the world.
|
| Nuvia basically never intended to really compete with Intel or
| AMD head-on. Their $30m stash would've been just enough
| for a single "leap of faith" tapeout on a generation old
| node, and a year of life support after.
|
| They were aiming for a quick sell from the start too.
| klelatti wrote:
| Depends on your definition of startup I guess. Certainly
| seems to be enough capital available.
|
| I definitely don't agree with the premise that it's now
| Boeing vs Airbus (certainly less so than it was a few
| years ago when x86 was the only game in town).
| aliswe wrote:
| Cookie cutter?
| jleahy wrote:
| Do you actually know how much a sub-10nm mask set costs?
| There's a lot of speculation from people who don't have
| access to those numbers. Those who do are bound by NDAs.
| baybal2 wrote:
| I do hear figures in single megabucks for relatively small
| tapeouts.
|
| Back in the 65nm and 40nm days, big tapeouts were already
| costing high six figures in masks.
|
| And... masks are not the most expensive item among the
| signoff costs these days.
|
| Specialist verification, outsourced synthesis, layout,
| analog, physical, test, and other specialist services will
| easily cost more than the maskset for <40nm.
|
| I would not be surprised if tier 1 fabless already spend
| $10m+ per design just on them.
| phamilton wrote:
| We haven't migrated yet but we expect to do some benchmarking
| this quarter for Aurora.
|
| For EC2 we run on spot and spot c5.metal are cheaper per vcpu
| than c6g.metal, so we haven't prio'd benchmarking our compute
| loads.
| [deleted]
| somethingwitty1 wrote:
| There have been a bunch of higher-profile "we moved to
| Graviton2 and cut costs" announcements. Twitter, for example,
| migrated:
| https://www.hpcwire.com/off-the-wire/twitter-selects-aws-and...
| neuronexmachina wrote:
| Also Netflix, although I don't think they've said what
| portion of their instances they've migrated:
| https://aws.amazon.com/ec2/graviton/customers/
| carlosf wrote:
| AWS's own offerings such as RDS and internal control plane
| stuff are very likely using ARM behind the scenes.
|
| I have evaluated going ARM, but I ended up deciding the savings
| were not worth it.
|
| Not only do you need to maintain 2 archs simultaneously for
| some time, but porting some stuff to ARM (e.g. Python) can be a
| pain in the ass.
|
| Finally, my devs work in AMD64 and that would be another source
| for "why does this work in dev but not prod".
| loudmax wrote:
| > Finally, my devs work in AMD64 and that would be another
| source for "why does this work in dev but not prod".
|
| I can see a use case for building a CI/CD pipeline on
| Raspberry Pi's.
| mwcampbell wrote:
| I hope that ARM servers with reasonable specs won't be exclusive
| to AWS and the other hyperscalers. For example, it would be nice
| if OVH would offer ARM-based dedicated servers.
| simondotau wrote:
| You can tell it's a modern CPU because its name matches the
| [A-Z][0-9] pattern.
| ksec wrote:
| Tl;dr
|
| V1 = Slightly tweaked ARM Cortex X1 with SVE ( Used on Snapdragon
| 888 ) on 7nm aiming at ~4W per Core.
|
| N2 = New Cortex with ARMv9, ~40% IPC improvement over N1 or 10%
| lower than V1, SVE _2_ , 5nm aiming at ~2W per Core. With Similar
| die size to N1. ( I fully expect Amazon to go 128 Core with their
| N2 Graviton )
|
| So in case anyone is wondering, no, it is not Apple M1 level. Not
| anywhere close.
|
| CMN-700 = More Cores and support of Memory partitioning,
| important for VMs.
| paulpan wrote:
| Nice TLDR!
|
| It's an apples-and-oranges comparison to Apple M1 chips (server
| vs. consumer) but does hint at what's possible with the next
| generation ARM Cortex "X2" cores, that could appear in next
| year's flagship smartphones and laptops. A 30-40% IPC jump,
| partly due to moving to 5nm fabrication process, is huge.
|
| Given the right implementation, namely squeezing more big cores
| than the current 1-3-4 configuration, it could close the gap
| considerably with Apple.
| plekter wrote:
| Process node changes generally don't do anything for IPC -
| those are generally rather due to microarchitecture
| improvements, so I doubt the move to 5nm has anything to do
| with the IPC gain..?
| wmf wrote:
| The node shrink lets you afford more transistors that
| provide more IPC.
| plekter wrote:
| I agree with that - but if you take an unchanged core and
| manufacture it at a different node, then you won't see a
| change in IPC, which in my book makes it questionable to
| attribute IPC gains to the process node.
| andrewcchen wrote:
| The cores don't serve the same purpose as the M1 cores. M1 is
| optimized for single thread at the cost of die size (and a bit
| of power). I don't have exact numbers, but say the Apple M1
| core takes 1.5x the die area of N2, then you'd get better
| performance by putting in 1.5x the number of N2 cores.
| MangoCoffee wrote:
| this is good. if ARM takes off for consumers and business, i'm
| hoping RISC-V will get some traction too. there are alternatives
| to Intel's x86.
| truth_seeker wrote:
| Hah. I like it when I can enjoy my hammock instead of fine tuning
| my code to weird limits for performance.
|
| DDR5,PCIE 5.0, SVE speedup and 40% IPC improvement put a big
| smile on my face.
| eqvinox wrote:
| Funnily coincident timing: current post #2 (GCC 11.1 released)
| adds support for the CPUs mentioned here (currently post #4):
| AArch64 & arm: A number of new CPUs are supported through
| arguments to the -mcpu and -mtune options in both the arm and
| aarch64 backends (GCC identifiers in parentheses):
|
| - Arm Cortex-A78 (cortex-a78)
| - Arm Cortex-A78AE (cortex-a78ae)
| - Arm Cortex-A78C (cortex-a78c)
| - Arm Cortex-X1 (cortex-x1)
| - Arm Neoverse V1 (neoverse-v1)
| - Arm Neoverse N2 (neoverse-n2)
|
| Good to see work going into this at the proper times. (Not that
| that was much of a problem for CPU cores in recent times. Still
| not a matter of course though.)
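|
| A minimal sketch of how those identifiers would be passed to
| the compiler, assuming GCC 11.1 or later targeting AArch64.
| The helper name, the file names and the choice of -O2 are
| illustrative, and "gcc" stands in for whatever native or cross
| compiler is actually installed. The script only builds the
| command line; it does not run the compiler.

```python
import shlex
from typing import List

# -mcpu identifiers quoted from the GCC 11.1 release notes above.
GCC_11_NEW_ARM_CPUS = {
    "cortex-a78", "cortex-a78ae", "cortex-a78c",
    "cortex-x1", "neoverse-v1", "neoverse-n2",
}


def gcc_compile_command(source: str, cpu: str, output: str = "a.out") -> List[str]:
    """Build (but do not execute) a GCC command line tuned for one CPU."""
    if cpu not in GCC_11_NEW_ARM_CPUS:
        raise ValueError(f"not one of the CPUs listed in the release notes: {cpu}")
    return ["gcc", "-O2", f"-mcpu={cpu}", source, "-o", output]


print(shlex.join(gcc_compile_command("app.c", "neoverse-n2", "app")))
```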
| floatboth wrote:
| These tunings will only be used if you compile stuff yourself
| with -march=native (or specifying one particular model). Most
| software out there would be compiled with generic non-tuned
| optimizations. The tuning is rarely a huge deal though.
| sitkack wrote:
| That is not entirely true. Binaries in the packaging systems
| might not be compiled for the most recent atomic instructions
| which can really affect performance.
|
| https://blog.dbi-services.com/aws-postgresql-on-graviton2-aa...
|
| https://github.com/microsoft/STL/issues/488
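|
| A quick way to see whether a given machine has those
| instructions at all, sketched under two assumptions: that the
| "recent atomic instructions" are the Armv8.1 LSE atomics, and
| that the host is Linux on AArch64, where /proc/cpuinfo has a
| "Features" line and lists LSE as "atomics". The sample text
| below is illustrative, not captured from a real machine. This
| only tells you what the CPU offers, not what a packaged binary
| was compiled to use.

```python
from typing import Set


def cpu_features(cpuinfo_text: str) -> Set[str]:
    """Parse the first 'Features' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return set(value.split())
    return set()


def has_lse_atomics(cpuinfo_text: str) -> bool:
    """True if the kernel reports the 'atomics' (LSE) CPU feature."""
    return "atomics" in cpu_features(cpuinfo_text)


# Illustrative sample; on a real host use open("/proc/cpuinfo").read().
SAMPLE = (
    "processor\t: 0\n"
    "Features\t: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics\n"
)
print(has_lse_atomics(SAMPLE))
```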
|
| We are about 9-14 months away from the right pieces making
| their way through the software ecosystems where this will be
| almost a non-issue.
|
| Exciting times for everyone!
| eqvinox wrote:
| True, but it's still relevant for 3 things:
|
| - when you have a particularly CPU-intensive application,
| you'd hopefully compile it to target your system
|
| - the cloud providers can just do a custom Debian/Ubuntu/...
| build for their zillions of identical systems
|
| - the library loading mechanism on Linux is slowly getting
| support for having multiple compile variants of a library
| packaged into different subdirectories of /lib (e.g.
| "/usr/lib64/tls/haswell/x86_64")
|
| Also I was mostly trying to point out as a positive how well
| the interaction is working there between ARM and the GCC
| project. I wish it were like this for other types of silicon.
|
| (CPU vendors all seem to be getting this right, and GPUs are
| slowly getting there, but much other silicon is horrible...
| e.g. wifi chips)
| up6w6 wrote:
| Does anyone know if these processors make mobile development
| easier? I mean, it's the same architecture now, right?
|
| The only thing I could find is
| https://www.genymotion.com/blog/just-launched-arm-native-and...
| jayd16 wrote:
| Not really. Maybe the emulators are faster but the main
| languages are managed anyway.
| danudey wrote:
| Plus for iOS, even the idevice simulator just builds for
| whatever platform you're testing on.
| potlee wrote:
| I wonder why they didn't use AWS-wide numbers rather than just
| EC2. I would have thought EC2 would lag in the transition while
| AWS services would make the switch quickly
| dogma1138 wrote:
| Because EC2 is a more realistic measure of market adoption.
| It's more important to know whether you can run the software of
| your choice on ARM than whether Amazon can develop a service on
| an ARM stack.
___________________________________________________________________
(page generated 2021-04-27 23:00 UTC)