[HN Gopher] Microsoft is first to get HBM-juiced AMD CPUs
___________________________________________________________________
Microsoft is first to get HBM-juiced AMD CPUs
Author : rbanffy
Score : 45 points
Date : 2024-11-23 19:40 UTC (4 days ago)
(HTM) web link (www.nextplatform.com)
(TXT) w3m dump (www.nextplatform.com)
| alias_neo wrote:
| Another article that uses an initialism in the headline,
| "HBM"[0] in this case, and almost 30 times in the body at
| that, yet doesn't spell out what it stands for even once. I
| will point this out every time I see it, and continue to
| refuse to read from places that don't follow this simple
| etiquette.
|
| Be better.
|
| [0] High Bandwidth Memory
| sroussey wrote:
| They don't define HPC either, but I think the audience of this
| site knows these acronyms.
| switchbak wrote:
| > Be better
|
| They're almost certainly not on this forum, and they're not
| reading your post. So who is that quip directed at?
| rcthompson wrote:
| Presumably it's directed at anyone writing an article for
| public consumption.
| dang wrote:
| " _Please don't pick the most provocative thing in an article
| or post to complain about in the thread. Find something
| interesting to respond to instead._"
|
| https://news.ycombinator.com/newsguidelines.html
| Dylan16807 wrote:
| I don't feel like that rule works here? If you cut out part
| of the second sentence to get "Find something interesting to
| respond to", that's a good point, but the full context is
| "instead [of the most provocative thing in the article]" and
| that doesn't fit a complaint about acronyms.
| dang wrote:
| To paraphrase McLuhan, you don't like that guideline? We
| got others:
|
| " _Please don't complain about tangential annoyances--e.g.
| article or website formats, name collisions, or back-button
| breakage. They're too common to be interesting._"
|
| The point, in any case, is to avoid off-topic indignation
| about tangential things, even annoying ones.
| sroussey wrote:
| This is very unlikely, but it would be interesting if Apple
| included HBM memory interfaces in the Max series of Apple
| Silicon, to be used in the Mac Pro (and maybe the Studio, but
| the Pro needs some more differentiation, like HBM or a NUMA
| layout).
| throwaway48476 wrote:
| They'd have to redesign the on-die memory controller and tape
| out a new die, all of which is expensive. Apple is a consumer
| technology company, not a cutting-edge tech company making
| high-cost products for niche markets. There's just no way to
| make HBM work in the consumer space at current prices.
| sroussey wrote:
| Well, they could put memory controllers for both DDR5 and
| HBM on the die, so they would only have one die to tape out.
|
| The Max variant is something they are using in their own
| datacenters. They could conceivably use an HBM version
| solely for themselves, but it would be cheaper overall if
| they did the same thing for workstations.
| nsteel wrote:
| HBM has a very wide, relatively slow interface. An HBM PHY
| is physically large and takes up a lot of beachfront: a
| massive waste of area (money) if you're not going to use
| it. It also (currently) requires a silicon interposer,
| another huge extra expense in your design.
| 7e wrote:
| The Mac Pro is not a consumer device. It is very much a
| high-cost, niche (professional) product.
| whatever1 wrote:
| So currently our consumer grade CPUs with DDR5 are limited to
| less than 100GB/s. Meanwhile Apple is shipping computers with
| multiples of that.
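The consumer figure above is just channel arithmetic. A minimal sketch of the math; the Apple bus width and transfer rate used here are illustrative assumptions, not vendor specs:

```python
# Peak memory bandwidth = channels * bus width (bytes) * transfer rate (MT/s).
def bandwidth_gbs(channels: int, bus_bits: int, mtps: int) -> float:
    """Peak bandwidth in GB/s for a DDR-style memory interface."""
    return channels * (bus_bits / 8) * mtps / 1000

# Typical consumer desktop: dual-channel (2 x 64-bit) DDR5-6000.
consumer = bandwidth_gbs(channels=2, bus_bits=64, mtps=6000)    # 96.0 GB/s

# A Max-class Apple part, assuming a 512-bit unified-memory bus at 6400 MT/s.
apple_max = bandwidth_gbs(channels=1, bus_bits=512, mtps=6400)  # 409.6 GB/s

print(consumer, apple_max)
```

That puts the typical desktop just under 100 GB/s and the wide unified-memory design at several multiples of it, which is the gap the comment describes.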
| oDot wrote:
| Strix Halo is rumored to be about twice as fast as current
| consumer parts, but unfortunately still not near Apple's
| speed.
| mananaysiempre wrote:
| On the other hand, I bought an 8C/16T Zen 4 laptop with 64GB
| RAM and a 4TB SSD for less than $2000 total including tax.
| I'll take that trade.
| sroussey wrote:
| How are 70b LLMs running on that?
| jocaal wrote:
| The free version of ChatGPT is better than your 70b LLM,
| what's the point?
| cma wrote:
| Qwen coder 32b instruct is the state of the art for local
| LLM coding and will run with a smallish context with that
| on a 64GB laptop with partial GPU offload. Probably around
| 0.8 tok/sec.
|
| With a quantization of it you can run larger contexts and
| go a bit faster. 1.4 tok/sec at 8b quant with offload to a
| 6GB laptop GPU.
|
| Speculative decoding has recently been added to many of the
| runtimes and can give a 20-30% boost with a 1-billion-weight
| model running the speculative token stream.
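The tok/sec ballparks in this subthread follow from decode being memory-bandwidth-bound: each generated token has to stream the full weight set once. A rough sketch, where the effective bandwidth and model sizes are assumptions for illustration:

```python
def decode_tokens_per_sec(eff_bandwidth_gbs: float, weights_gb: float) -> float:
    """Upper bound on decode speed when every token reads all weights once."""
    return eff_bandwidth_gbs / weights_gb

# 32B-parameter model at 16 bits/weight ~= 64 GB of weights,
# assuming ~50 GB/s effective CPU memory bandwidth.
full_precision = decode_tokens_per_sec(50, 64)   # ~0.78 tok/sec
# 8-bit quantization halves the weight traffic per token.
quantized = decode_tokens_per_sec(50, 32)        # ~1.56 tok/sec

print(full_precision, quantized)
```

This is why quantization and extra memory bandwidth both translate almost linearly into decode speed on CPU.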
| YetAnotherNick wrote:
| Why do you need 64GB RAM?
| theandrewbailey wrote:
| Several Electron apps and 1000+ Chrome tabs. (just
| guessing)
| scheme271 wrote:
| Memory can get eaten up pretty quickly between IDEs,
| containers, and other dev tools. I have had a combination
| of a fairly small C++ application, CLion, and a container
| use up more than 32GB together with my typical
| applications.
| shakabrah wrote:
| I have 128GB in my PC (largely because I can), and Android
| Studio, a few containers, and running emulators will take a
| sizable bite out of that. My 18GB MacBook would be digging
| into swap and compressing memory to get there.
| mananaysiempre wrote:
| Partly because I can: unless you go absolutely wild with
| excess, it's the RAM equivalent of fuck-you money.
| (Note it's unified, though, so in some situations a desktop
| with 48GB main RAM and 16GB VRAM can be comparable, and
| from what I know about today's desktops that could be a
| good machine but not a lavish one.) Partly because I need
| to do exploratory statistics on, say, ten- or twenty-gigabyte
| I/O traces, and being able to chuck the whole thing into
| Pandas and not agonize over cleaning up every temporary is
| just comfy.
| jchw wrote:
| Running 128 GiB of RAM on the box I am typing on. I could
| list a lot of things but if you really wanted a quick
| demonstration, compiling Chromium will eat 128 GiB of RAM
| happily.
| whatever1 wrote:
| Nobody ever regretted having extra memory on their
| computer.
| _bare_metal wrote:
| HBM or not, those latest server chips are crazy fast and
| efficient. You can probably condense 8 servers from just a few
| years ago into one latest-gen Epyc.
|
| I run BareMetalSavings.com[0], a toy for ballpark-estimating
| bare-metal/cloud savings, and the things you can do with just a
| few servers today are pretty crazy.
|
| [0]: https://www.BareMetalSavings.com
| 1oooqooq wrote:
| A graph showing this against cloud instance costs and AWS
| profits would be funny.
| tame3902 wrote:
| Core counts have increased dramatically. The latest AMD server
| CPUs have up to 192 cores. The Zen1 top model had only 32 cores
| and that was already a lot compared to Intel. However, the
| power consumption has also increased: the current top model has
| a TDP of 500W.
| Guzba wrote:
| Does absolute power consumption matter, or would it not be
| better to focus on per-core power consumption? E.g. running
| six 32-core CPUs seems unlikely to be better than one
| 192-core CPU.
| tame3902 wrote:
| Yes, per-core power consumption, or better, performance per
| watt is usually more relevant than total power consumption.
| And one high-core-count CPU is usually better than the same
| number of cores spread across multiple CPUs. (That is,
| unless you are trying to maximize memory bandwidth per
| watt.)
|
| What I wanted to get at is that the pure core count can be
| misleading if you care about power consumption. If you
| don't and just look at performance, the current CPU
| generations are monsters. But if you care about
| performance/Watt, the improvement isn't that large. The
| Zen1 CPU I was talking about had a TDP of 180 W. So you get
| 6x as many cores, but the power consumption increases by
| 2.7x.
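The arithmetic behind that comparison, using the core counts and TDP figures from the comments above:

```python
# Zen1 top model vs. current top model (figures from the thread).
zen1_cores, zen1_tdp_w = 32, 180
new_cores, new_tdp_w = 192, 500

core_ratio = new_cores / zen1_cores            # 6.0x the cores
power_ratio = new_tdp_w / zen1_tdp_w           # ~2.78x the power
efficiency_gain = core_ratio / power_ratio     # ~2.16x cores per watt

# Per-core TDP actually drops, even though the socket draws far more.
zen1_w_per_core = zen1_tdp_w / zen1_cores      # ~5.6 W/core
new_w_per_core = new_tdp_w / new_cores         # ~2.6 W/core

print(round(core_ratio, 2), round(power_ratio, 2), round(efficiency_gain, 2))
```

So the headline core count grows 6x while cores-per-watt only roughly doubles, which is the gap the comment is pointing at (and this ignores per-core performance changes across generations).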
| phodge wrote:
| That could be an interesting site when it's done, but I
| couldn't see where you factor in the price of electricity for
| running bare metal in a 24/7 climate-controlled environment,
| which I would expect is the biggest expense by far.
| _bare_metal wrote:
| The first FAQ question addresses exactly that: colocation
| costs are added to every bare metal item (even storage
| drives).
|
| Note that this isn't intended for accounting, but for
| estimating, and it's good at that. If anything, it's more
| favorable to the cloud (e.g., no egress costs).
|
| If you're on the cloud right now and BMS shows you can save a
| lot of money, that's a good indicator to carefully research
| the subject.
| AnotherGoodName wrote:
| I'm having trouble parsing the article even though I know
| full well what the MI300 is and what HBM memory is.
|
| I'm not alone, right? This article seems to be complete AI
| nonsense, at various points confusing the GPU and CPU
| portions of the product and not at all giving clarity on
| which parts of the product have HBM memory.
| Tepix wrote:
| People are buying dual Epyc Zen5 systems to get 24 channels
| of DDR5-6000 memory bandwidth for inferencing large LLMs on
| CPU. Clearly there is a demand for very fast memory.
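The appeal of that setup is the raw aggregate bandwidth: a dual-socket Zen5 Epyc exposes 12 DDR5 channels per socket, 24 in total. A quick sketch of the ideal peak figure (actual sustained bandwidth will be lower, and cross-socket NUMA effects apply):

```python
# 24 channels of DDR5-6000; each channel is 64 bits (8 bytes) wide.
channels, bus_bytes, mtps = 24, 8, 6000
peak_gbs = channels * bus_bytes * mtps / 1000

print(peak_gbs)  # 1152.0 GB/s peak across both sockets
```

Over a terabyte per second puts a commodity dual-socket server in the neighborhood of GPU-class bandwidth, which is why CPU inference on these systems is even on the table.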
| bobim wrote:
| Sure: implicit finite element analysis scales up to about
| two cores per DDR4 channel. Core density has grown faster
| than bandwidth, which makes all those high-core-count CPUs a
| waste for these kinds of workloads.
| pixelpoet wrote:
| This website and its spend-128-hours-disabling-1024-separate-
| cookies-and-vendors consent flow is pure cancer. I wish HN
| would just ban all these disgusting data-hoovering leeches.
|
| By now I get that no one else cares and I should just stop coming
| here.
| codr7 wrote:
| Agreed, I wouldn't mind banning paywall crap while we're at it.
| zelphirkalt wrote:
| Nope, you are not alone. Without uBlock Origin I wouldn't go
| much anywhere these days.
| scouw wrote:
| It's far from an ideal solution, but I've started to just
| have JS disabled by default in uBlock Origin and then enable
| it manually on a per-site basis. Bit of a hassle, but many
| sites, including this one, render just fine without JS and
| are arguably better without it.
| AnotherGoodName wrote:
| That entire site is 100% AI-generated click farming. The
| fact that the top comments here are not even talking about
| the content of the article, but instead making more general
| "HBM is great" points, worries me.
|
| For anyone who read the article: which product has HBM
| attached, the CPU or the GPU? What is the name of this
| product?
|
| There's literally nothing specific in here, and the article
| is rambling AI nonsense. The whole site is a machine gun of
| such articles.
| pixelpoet wrote:
| 1000% agreed, especially the silent carrying on with the
| topic without acknowledging that the actual link is cancer.
| I know e.g. dang is a real person who works hard to keep HN
| decent, but wtf is this, are we just going to accept Daily
| Mail links too?
|
| I weep for the internet we had as children.
| mitjam wrote:
| These kinds of articles are like denial-of-service attacks
| on human attention. If I read just a few of those, I would
| be confused for the rest of the day.
| gessha wrote:
| Does it make sense to put HBM memory on mobile computing like
| laptops and smartphones?
| nsteel wrote:
| It would be very expensive. Is there really a consumer
| market for large amounts (tens of GB) of RAM with super-high
| (800+ GB/s) bandwidth? I guess you'll say AI applications,
| but doing that amount of work on a mobile device seems mad.
___________________________________________________________________
(page generated 2024-11-27 23:01 UTC)