[HN Gopher] AWS unveils Graviton4 & Trainium2
___________________________________________________________________
AWS unveils Graviton4 & Trainium2
Author : skilled
Score : 57 points
Date : 2023-11-28 16:41 UTC (6 hours ago)
(HTM) web link (press.aboutamazon.com)
(TXT) w3m dump (press.aboutamazon.com)
| fhub wrote:
| Not much to discuss until there is pricing. I have a bunch of
| Graviton2 instances that weren't worth upgrading to Graviton3
| because of the price bump at 16 GB / 4 vCPUs (t4g.xlarge).
| monlockandkey wrote:
| What Arm core is Graviton4 using? A 30% performance uplift is
| a good amount.
| cherioo wrote:
| Likely the Neoverse V2 architecture (derived from the Cortex-X
| line; it's Neoverse N2 that is based on the A710).
| aeyes wrote:
| > Neoverse V2
|
| https://aws.amazon.com/blogs/aws/join-the-preview-for-new-me...
| PedroBatista wrote:
| What happens to old and used Graviton3 chips?
|
| At least in the "old days" there was (and still is) a
| secondary market for used server parts...
|
| I don't know how companies like Amazon, Microsoft, and Google
| would frame a question like this so their "green" narratives
| wouldn't be hurt, but I'm sure they'll do an excellent job.
| rstupek wrote:
| They'll continue to run in their datacenter since they're still
| basically brand new?
| baz00 wrote:
| They're still running prehistoric Intel Xeons. I'm sure they'll
| just rot slowly until the instances fail.
| threeseed wrote:
| As a user you don't get much visibility into the specs of
| managed services e.g. DynamoDB.
|
| So that's an obvious home for the chips that are no longer
| available to users.
| asperous wrote:
| If you haven't used aws a lot then you might not know this but
| the old instance types stick around and you can still use them,
| especially as "spot" which lets you bid for server time.
|
| I had a science project which was cpu bound and it turns out
| because people bid based on the performance, the old chips end
| up costing the same in terms of cpu work done/$ (older chips
| cost less per hr but do less).
|
| aws though was by far the most expensive so switching to like
| oracle with their ampere arm was a lot cheaper for me.
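|
| A minimal sketch of that comparison, using boto3's real
| describe_spot_price_history call. The instance types and the
| relative-performance weights below are illustrative
| assumptions, not measured numbers:
|
|     # Compare spot $/hr per unit of CPU work across Graviton
|     # generations (m6g = Graviton2, m7g = Graviton3).
|     from datetime import datetime, timedelta, timezone
|     import boto3
|
|     ec2 = boto3.client("ec2", region_name="us-east-1")
|
|     # Assumed relative per-instance throughput; substitute
|     # your own benchmark results.
|     relative_perf = {"m6g.xlarge": 1.0, "m7g.xlarge": 1.25}
|
|     resp = ec2.describe_spot_price_history(
|         InstanceTypes=list(relative_perf),
|         ProductDescriptions=["Linux/UNIX"],
|         StartTime=datetime.now(timezone.utc)
|                   - timedelta(hours=1),
|     )
|
|     # Keep only the most recent price per instance type.
|     latest = {}
|     for rec in resp["SpotPriceHistory"]:
|         t = rec["InstanceType"]
|         if (t not in latest
|                 or rec["Timestamp"] > latest[t]["Timestamp"]):
|             latest[t] = rec
|
|     for t, rec in latest.items():
|         price = float(rec["SpotPrice"])
|         per_work = price / relative_perf[t]
|         print(f"{t}: ${price:.4f}/hr, "
|               f"${per_work:.4f} per work-hour")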
| discodave wrote:
| They just... don't retire them? The most expensive thing in a
| DC is the chips, so it's worth it to just build more datacenter
| space and keep the old ones around.
|
| In 2019, before I left the EC2 Networking / VPC team, we were
| using M3 instances for our internal services... those machines
| were probably installed in 2013 or 2014, making them over 5
| years old.
|
| With the slowdown in Moore's law and chip speeds, I'd wager
| that team is still using those M3s now.
|
| Eventually the machines actually start failing, so they need to
| be retired, but a large portion of machines likely make it to
| 10 years.
| temp0826 wrote:
| They can surely find a use for them internally. Hat tip to
| the less-shiny teams like Glacier that have to endlessly put
| out fires on dilapidated old S3 compute/array hand-me-downs.
| whalesalad wrote:
| I can't wait to find old surplus custom ARM silicon from this
| period at the recycler or on eBay.
|
| As a kid I always wanted one of those yellow Google Search
| Appliances, and now you can find them everywhere, used as
| little more than lawn ornaments.
| cjsplat wrote:
| Depending on the numbers involved, previous-generation
| hardware can waterfall down to infrastructure apps that are
| throughput-based.
|
| Things accessed through network APIs and billed per op or in
| aggregate: distributed file systems, databases, even build
| and regression-suite systems.
|
| Another key point is that older server generations in fully
| custom cloud environments tend to co-evolve with their
| surroundings. The amount of power and cooling available for a
| rack may not support a modern deployment.
|
| Especially if a generation lasts 6 years: you might be able
| to cascade gen N+1 to N, but N+6 may require a full retrofit.
| A 6-year-old data center that is slowly emptying as
| individual servers fail may justify waiting for N+7 or even
| N+8 to cover the cost of the downtime and retrofit.
|
| There is a reason Google announced that they are depreciating
| servers over 6 years and Meta is at 5 years, vs the old
| accounting standard of 3 years.
|
| Then of course there is a secondary market for memory and
| standard PCI cards, but the market for 6-year-old tech is
| mainly spares, so it is unlikely to absorb the full size of
| the year N-6 data center build.
|
| If you are considering a refurb-style resale market for
| 6-year-old tech, the performance per dollar is often a
| non-starter because of how much power the older tech
| consumes.
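|
| For a sense of why the depreciation schedule matters, a tiny
| straight-line calculation with a made-up purchase price:
|
|     # Straight-line depreciation of a hypothetical $10,000
|     # server under the schedules mentioned above.
|     server_cost = 10_000  # USD, made up for illustration
|
|     for years in (3, 5, 6):  # old standard, Meta, Google
|         print(f"{years}-year schedule: "
|               f"${server_cost / years:,.0f} expensed per year")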
| aseipp wrote:
| They don't sell these. They reuse them and perform maintenance
| on them until their last breath and part them out once they
| die.
|
| Hyperscalers design their own datacenter "SKUs" for
| storage/compute, all the way from power delivery to networking
| to chassis. These servers are going to be heavily customized,
| and it's unlikely, even if they fit normal form factors, that
| they'll work the same way as COTS devices or the things you'd
| buy from Supermicro.
|
| You could possibly make it work. If they sold them. But they
| don't, and if you're in the market for that stuff, Supermicro
| will just design it for you anyway, because presumably you have
| actual money.
|
| And the reality is they're probably break-even or greener
| doing it this way, as opposed to washing their hands of it
| and selling servers on eBay, where they'd eventually get
| thrown into landfills wholesale by nerds once their startups
| fail or they get bored of them. Just because you stick your
| head in the sand doesn't mean it doesn't end up in a landfill.
| buildbot wrote:
| The scale they are quoting, 100,000-chip clusters and 65
| exaflops, seems impossible. At 800W per chip, that's 80MW of
| power! Unless they literally built an entire DC of these
| things, nobody is training anything on the entire cluster at
| once. It's probably 10-20 separate datacenters being combined
| for marketing reasons here.
| tempay wrote:
| What makes you think it's 800W per chip?
| buildbot wrote:
| That's about what I thought the H100 was; it's actually 700W.
| But even at, say, 400W, that's 40MW of power. From some quick
| googling, some datacenters are built in the 40-100MW range,
| but I really doubt they can network 100,000 chips together in
| any sort of performant way; that's supercomputer-level
| interconnect. I don't think most datacenters support the
| highly interlinked network fabric this would need either.
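|
| The back-of-envelope power math, parameterized over the TDP
| guesses in this subthread (AWS hasn't published a per-chip
| figure, so every wattage here is speculation):
|
|     # Power draw for a 100,000-chip cluster at the per-chip
|     # TDPs guessed above; chips only, no cooling or
|     # networking overhead.
|     chips = 100_000
|
|     for tdp_watts in (800, 700, 400, 200):
|         mw = chips * tdp_watts / 1e6
|         print(f"{tdp_watts} W/chip -> {mw:.0f} MW")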
| tempay wrote:
| They have instances with 16 chips, so I presume there are at
| least 16 chips per server. I'd also expect the power
| consumption to be more like 100-200W, given they seem more
| like Google's TPUs than an H100.
|
| For the interconnect I doubt this is their typical
| interconnect but it doesn't seem completely unreasonable.
| Even when not running massive clusters they'll still need
| the interconnect to pair the random collections of machines
| that people are using.
| bluedino wrote:
| Per server? Our dual-CPU Intel servers draw about 800-900W at
| full load.
| LunaSea wrote:
| Since Graviton3 still isn't available in most regions, especially
| on the RDS side, I'm really not holding my breath.
| ilaksh wrote:
| Do you need specific software to train a model using Trainium2?
| For example, what about fine-tuning a language model? Will
| something like QLoRA work?
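|
| For reference, a hedged sketch of what the adapter setup
| might look like with plain LoRA via Hugging Face PEFT (not
| QLoRA: its 4-bit path depends on bitsandbytes, which targets
| CUDA GPUs, so it's unclear it applies to Trainium; AWS's
| Neuron SDK / Optimum Neuron is the usual training route
| there). The model name and hyperparameters are just examples:
|
|     from transformers import AutoModelForCausalLM
|     from peft import LoraConfig, get_peft_model
|
|     model = AutoModelForCausalLM.from_pretrained(
|         "meta-llama/Llama-2-7b-hf")  # example model only
|
|     # Attach low-rank adapters to the attention projections.
|     lora = LoraConfig(
|         r=16, lora_alpha=32, lora_dropout=0.05,
|         target_modules=["q_proj", "v_proj"],
|         task_type="CAUSAL_LM",
|     )
|     model = get_peft_model(model, lora)
|     model.print_trainable_parameters()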
| snewman wrote:
| > Graviton4 processors deliver up to 30% better compute
| performance, 50% more cores, and 75% more memory bandwidth than
| Graviton3.
|
| This seems ambiguous. Presumably this is 50% more cores per chip.
| What about "30% better compute performance" and "75% more memory
| bandwidth": is that per core, or per chip? If the latter, then
| per-core compute performance would actually be lower.
|
| Also, "up to" could be hiding almost anything. Has anyone seen a
| source with clearer information as to how per-core application
| performance compares to earlier Graviton generations?
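|
| Working the arithmetic for both readings (the 1.30 and 1.50
| factors come straight from the quoted claims):
|
|     compute_uplift = 1.30  # "up to 30% better compute"
|     core_uplift = 1.50     # "50% more cores"
|
|     # If 30% is per chip, per-core perf actually drops:
|     print(f"{compute_uplift / core_uplift:.2f}x per core")
|     # -> 0.87x
|
|     # If 30% is per core, chip-level perf nearly doubles:
|     print(f"{compute_uplift * core_uplift:.2f}x per chip")
|     # -> 1.95x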
| p1esk wrote:
| Wait, how could "50% more cores, and 75% more memory bandwidth"
| result in anything less than 50% better compute performance?
| jagger27 wrote:
| Clock speed
| bluedino wrote:
| It's the best Graviton processor yet!
| kevincox wrote:
| I would assume "up to" means that across all the workloads
| they benchmarked, the best result was 30% better compute
| performance. Not a very useful number, as your workload is
| very unlikely to hit the right set of conditions to see that
| uplift.
| otterley wrote:
| AWS Graviton specialist here!
|
| The performance improvement is on a per-core basis. The pending
| availability of 96-vCPU Graviton4 instances is icing on the
| cake!
| aseipp wrote:
| Neoverse V2, so this will probably be the first widely
| available ARMv9 server with SVE2, a server-class SKU you can
| actually get your hands on (i.e., not a mobile phone, Grace,
| or Fugaku). It's about damn time!
___________________________________________________________________
(page generated 2023-11-28 23:01 UTC)