[HN Gopher] Microsoft is first to get HBM-juiced AMD CPUs
       ___________________________________________________________________
        
       Microsoft is first to get HBM-juiced AMD CPUs
        
       Author : rbanffy
       Score  : 45 points
       Date   : 2024-11-23 19:40 UTC (4 days ago)
        
 (HTM) web link (www.nextplatform.com)
 (TXT) w3m dump (www.nextplatform.com)
        
       | alias_neo wrote:
        | Another source that uses an initialism in its headline, "HBM"[0]
        | in this case, and almost 30 times in the article at that, yet
        | never once spells out what it stands for. I will point this out
        | every time I see it, and continue to refuse to read from places
        | that don't follow this simple etiquette.
       | 
       | Be better.
       | 
       | [0] High Bandwidth Memory
        
         | sroussey wrote:
         | They don't define HPC either, but I think the audience of this
         | site knows these acronyms.
        
         | switchbak wrote:
         | > Be better
         | 
         | They're almost certainly not on this forum, and they're not
         | reading your post. So who is that quip directed at?
        
           | rcthompson wrote:
           | Presumably it's directed at anyone writing an article for
           | public consumption.
        
         | dang wrote:
          | " _Please don't pick the most provocative thing in an article
          | or post to complain about in the thread. Find something
          | interesting to respond to instead._"
         | 
         | https://news.ycombinator.com/newsguidelines.html
        
           | Dylan16807 wrote:
           | I don't feel like that rule works here? If you cut out part
           | of the second sentence to get "Find something interesting to
           | respond to", that's a good point, but the full context is
           | "instead [of the most provocative thing in the article]" and
           | that doesn't fit a complaint about acronyms.
        
             | dang wrote:
             | To paraphrase McLuhan, you don't like that guideline? We
             | got others:
             | 
              | " _Please don't complain about tangential annoyances--e.g.
              | article or website formats, name collisions, or back-button
              | breakage. They're too common to be interesting._"
             | 
             | The point, in any case, is to avoid off-topic indignation
             | about tangential things, even annoying ones.
        
       | sroussey wrote:
        | This is very unlikely, but it would be interesting if Apple
        | included HBM interfaces in the Max series of Apple Silicon, to
        | be used in the Mac Pro (and maybe the Studio, but the Pro needs
        | some more differentiation, like HBM or a NUMA layout).
        
         | throwaway48476 wrote:
            | They'd have to redesign the on-die memory controller and
            | tape out a new die, all of which is expensive. Apple is a
            | consumer technology company, not a cutting-edge tech company
            | making high-cost products for niche markets. There's just no
            | way to make HBM work in the consumer space at the current
            | price.
        
           | sroussey wrote:
           | Well, they could put in a memory controller for both DDR5 and
           | HBM on the die, so they would only have one die to tape out.
           | 
              | The Max variant is something they are using in their own
              | datacenters. It's possible they would use an HBM variant
              | solely for themselves, but it would be cheaper overall if
              | they did the same thing for workstations.
        
             | nsteel wrote:
              | HBM has a very wide, relatively slow interface. An HBM PHY
              | is physically large and takes up a lot of beachfront, a
              | massive waste of area (money) if you're not going to use
              | it. It also (currently) requires you to use a silicon
              | interposer, another huge extra expense in your design.
        
           | 7e wrote:
           | The MacPro is not a consumer device. It is very much a high
           | cost niche (professional) product.
        
       | whatever1 wrote:
        | So currently our consumer-grade CPUs with DDR5 are limited to
        | less than 100GB/s of memory bandwidth. Meanwhile Apple is
        | shipping computers with multiples of that.
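A quick sanity check on those numbers: peak DRAM bandwidth is just channels times bus width times transfer rate. A minimal sketch, where the part choices are illustrative assumptions (dual-channel DDR5-5600 for a typical desktop, a 512-bit LPDDR5-6400 bus for an Apple Max-class chip):

```python
def peak_bandwidth_gbs(channels: int, bus_bytes: int, mts: int) -> float:
    """Peak bandwidth in GB/s: channels x bytes per transfer x MT/s."""
    return channels * bus_bytes * mts / 1000

# Typical consumer desktop: 2 channels x 64-bit (8-byte) bus, DDR5-5600
consumer = peak_bandwidth_gbs(2, 8, 5600)    # 89.6 GB/s
# Max-class Apple Silicon: one very wide 512-bit (64-byte) LPDDR5-6400 bus
apple_max = peak_bandwidth_gbs(1, 64, 6400)  # 409.6 GB/s
```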
        
         | oDot wrote:
         | Strix Halo is rumored to be about twice as fast but
         | unfortunately not near Apple's speed.
        
         | mananaysiempre wrote:
          | On the other hand, I bought an 8C/16T Zen 4 laptop with 64GB
          | RAM and a 4TB SSD for less than $2000 total including tax.
         | I'll take that trade.
        
           | sroussey wrote:
           | How are 70b LLMs running on that?
        
             | jocaal wrote:
              | The free version of ChatGPT is better than your 70b LLM,
              | so what's the point?
        
             | cma wrote:
             | Qwen coder 32b instruct is the state of the art for local
             | LLM coding and will run with a smallish context with that
             | on a 64GB laptop with partial GPU offload. Probably around
              | 0.8 tok/sec.
             | 
             | With a quantization of it you can run larger contexts and
             | go a bit faster. 1.4 tok/sec at 8b quant with offload to a
             | 6GB laptop GPU.
             | 
              | Speculative decoding has recently been added to lots of
              | the runtimes and can give a 20-30% boost, with a 1 billion
              | weight model running the speculative token stream.
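Those rates line up with a back-of-envelope model: single-stream LLM decoding is memory-bandwidth bound, so an upper bound on tokens/sec is bandwidth divided by the bytes streamed per token (roughly the model's footprint). A sketch with illustrative numbers, not measurements:

```python
def max_tok_per_sec(bandwidth_gbs: float, model_gb: float) -> float:
    """Bandwidth-bound ceiling on decode speed: generating one token
    streams (roughly) the whole model through memory once."""
    return bandwidth_gbs / model_gb

# ~90 GB/s of dual-channel DDR5; a 32B-parameter model at 16 bits (~64 GB)
fp16_ceiling = max_tok_per_sec(90, 64)  # ~1.4 tok/s at best
# 8-bit quantization halves the bytes streamed per token (~32 GB)
q8_ceiling = max_tok_per_sec(90, 32)    # ~2.8 tok/s at best
# Measured rates land below these ceilings; offloading layers to a GPU helps.
```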
        
           | YetAnotherNick wrote:
           | Why do you need 64GB RAM?
        
             | theandrewbailey wrote:
             | Several Electron apps and 1000+ Chrome tabs. (just
             | guessing)
        
             | scheme271 wrote:
             | Memory can get eaten up pretty quickly between IDEs,
             | containers, and other dev tools. I have had a combination
              | of a fairly small C++ application, CLion, and a container
             | use up more than 32GB when combined with my typical
             | applications.
        
             | shakabrah wrote:
              | I have 128GB in my PC (largely because I can), and Android
              | Studio, a few containers, and running emulators will take
              | a sizable bite out of that. My 18GB MacBook would be
              | digging into swap and compressing memory to get there.
        
             | mananaysiempre wrote:
             | Partly because I can, because unless you go absolutely wild
             | with excess it's the RAM equivalent of fuck-you money.
             | (Note it's unified though, so in some situations a desktop
             | with 48GB main RAM and 16GB VRAM can be comparable, and
             | from what I know about today's desktops that could be a
              | good machine but not a lavish one.) Partly because I need
              | to do exploratory statistics on, say, ten- or
              | twenty-gigabyte I/O traces, and being able to chuck the
              | whole thing into Pandas and not agonize over cleaning up
              | every temporary is just comfy.
        
             | jchw wrote:
             | Running 128 GiB of RAM on the box I am typing on. I could
             | list a lot of things but if you really wanted a quick
             | demonstration, compiling Chromium will eat 128 GiB of RAM
             | happily.
        
             | whatever1 wrote:
             | Nobody ever regretted having extra memory on their
             | computer.
        
       | _bare_metal wrote:
       | HBM or not, those latest server chips are crazy fast and
       | efficient. You can probably condense 8 servers from just a few
       | years ago into one latest-gen Epyc.
       | 
       | I run BareMetalSavings.com[0], a toy for ballpark-estimating
       | bare-metal/cloud savings, and the things you can do with just a
       | few servers today are pretty crazy.
       | 
       | [0]: https://www.BareMetalSavings.com
        
         | 1oooqooq wrote:
          | A graph showing this against cloud instance costs and AWS
          | profits would be funny.
        
         | tame3902 wrote:
         | Core counts have increased dramatically. The latest AMD server
         | CPUs have up to 192 cores. The Zen1 top model had only 32 cores
         | and that was already a lot compared to Intel. However, the
         | power consumption has also increased: the current top model has
         | a TDP of 500W.
        
           | Guzba wrote:
            | Does absolute power consumption matter, or would it not be
            | better to focus on per-core power consumption? E.g. running
            | six 32-core CPUs seems unlikely to be better than one
            | 192-core CPU.
        
             | tame3902 wrote:
              | Yes, per-core power consumption, or better yet performance
              | per watt, is usually more relevant than total power
              | consumption. And one high-core-count CPU is usually better
              | than the same number of cores spread across multiple CPUs.
              | (That is, unless you are trying to maximize memory
              | bandwidth per watt.)
             | 
             | What I wanted to get at is that the pure core count can be
             | misleading if you care about power consumption. If you
             | don't and just look at performance, the current CPU
             | generations are monsters. But if you care about
             | performance/Watt, the improvement isn't that large. The
             | Zen1 CPU I was talking about had a TDP of 180 W. So you get
             | 6x as many cores, but the power consumption increases by
             | 2.7x.
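The arithmetic behind that comparison, spelled out (figures from the comment above; TDP is only a rough proxy for actual draw, and the Zen1 part is assumed to be the 32-core, 180 W flagship):

```python
# Zen1-era flagship vs. a current top-end 192-core Epyc, per the comment
zen1_cores, zen1_tdp = 32, 180
new_cores, new_tdp = 192, 500

core_ratio = new_cores / zen1_cores      # 6.0x the cores
power_ratio = new_tdp / zen1_tdp         # ~2.8x the power
w_per_core_old = zen1_tdp / zen1_cores   # ~5.6 W per core
w_per_core_new = new_tdp / new_cores     # ~2.6 W per core
```

So per-core power has actually fallen by more than half, even though the package TDP looks much scarier.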
        
         | phodge wrote:
          | That could be an interesting site when it's done, but I
          | couldn't see where you factor in the price of electricity for
          | running bare metal in a 24/7 climate-controlled environment,
          | which I would expect to be the biggest expense by far.
        
           | _bare_metal wrote:
           | The first FAQ question addresses exactly that: colocation
           | costs are added to every bare metal item (even storage
           | drives).
           | 
            | Note that this isn't intended for accounting but for
            | estimating, and it's good at that. If anything, it's more
            | favorable to the cloud (e.g., no egress costs).
           | 
           | If you're on the cloud right now and BMS shows you can save a
           | lot of money, that's a good indicator to carefully research
           | the subject.
        
       | AnotherGoodName wrote:
        | I'm having trouble parsing the article even though I know full
        | well what the MI300 is and what HBM memory is.
        | 
        | I'm not alone, right? This article seems to be complete AI
        | nonsense, at various points confusing the GPU and CPU portions
        | of the product and not at all giving clarity on which parts of
        | the product have HBM memory.
        
       | Tepix wrote:
        | People are buying dual Epyc Zen5 systems to get the bandwidth
        | of 24 DDR5-6000 memory channels for inferencing large LLMs on
        | CPU. Clearly there is demand for very fast memory.
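For scale, the peak-bandwidth arithmetic for such a dual-socket box (24 channels, 8 bytes per channel per transfer, 6000 MT/s):

```python
# Aggregate peak bandwidth of 24 DDR5-6000 channels
channels, bus_bytes, mts = 24, 8, 6000
peak_gbs = channels * bus_bytes * mts / 1000  # 1152.0 GB/s
# That aggregate is in HBM territory, which is what makes CPU-side
# inferencing of large models plausible on these systems.
```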
        
         | bobim wrote:
          | Sure, implicit finite element analysis scales up to about two
          | cores per DDR4 channel. Core density has just grown faster
          | than bandwidth, which makes all those high-core-count CPUs a
          | waste for these kinds of workloads.
        
       | pixelpoet wrote:
        | This website and its spend-128-hours-disabling-1024-separate-
        | cookies-and-vendors banner is pure cancer. I wish HN would just
        | ban all these disgusting data-hoovering leeches.
       | 
       | By now I get that no one else cares and I should just stop coming
       | here.
        
         | codr7 wrote:
         | Agreed, I wouldn't mind banning paywall crap while we're at it.
        
         | zelphirkalt wrote:
         | Nope, you are not alone. Without uBlock Origin I wouldn't go
         | much anywhere these days.
        
         | scouw wrote:
         | It's far from an ideal solution but I've started to just have
         | JS disabled by default in uBlock Origin and then enabling it
          | manually on a per-site basis. It's a bit of a hassle, but
          | many sites, including this one, render just fine without JS
          | and are arguably better without it.
        
         | AnotherGoodName wrote:
          | That entire site is 100% AI-generated click farming. The fact
          | that the top comments here are not even talking about the
          | content of the article, but instead about more general "HBM
          | is great" sentiment, worries me.
          | 
          | For anyone who read the article: which product has HBM
          | attached? The CPU or the GPU? What is the name of this
          | product?
          | 
          | There's literally nothing specific in here, and the article
          | is rambling AI nonsense. The whole site is a machine gun of
          | such articles.
        
           | pixelpoet wrote:
            | 1000% agreed, especially the silent carrying on with the
            | topic without acknowledging how cancerous the actual link
            | is. I know e.g. dang is a real person who works hard to
            | make HN decent, but wtf is this? Are we just going to
            | accept Daily Mail links too?
            | 
            | I weep for the internet we had as children.
        
           | mitjam wrote:
            | These kinds of articles are like denial-of-service attacks
            | on human attention. If I read just a few of those, I would
            | be confused for the rest of the day.
        
       | gessha wrote:
        | Does it make sense to put HBM in mobile computing devices like
        | laptops and smartphones?
        
         | nsteel wrote:
          | They'd be very expensive. Is there really a consumer market
          | for large amounts (tens of GBs) of RAM with super-high
          | (800+ GB/s) bandwidth? I guess you'll say AI applications,
          | but doing that amount of work on a mobile device seems mad.
        
       ___________________________________________________________________
       (page generated 2024-11-27 23:01 UTC)