[HN Gopher] AMD's Strix Halo under the hood
       ___________________________________________________________________
        
       AMD's Strix Halo under the hood
        
       Author : kristianp
       Score  : 145 points
       Date   : 2025-03-14 09:23 UTC (13 hours ago)
        
 (HTM) web link (chipsandcheese.com)
 (TXT) w3m dump (chipsandcheese.com)
        
       | mort96 wrote:
       | I don't understand the name "Strix". It's a name a GPU and
       | motherboard partner of theirs, Asus, uses (used?) for their
       | products. It's impossible for me to read "AMD Strix" and not
        | think of it as some Asus GPU with an AMD chip in it, or some
       | motherboard for AMD sockets.
       | 
       | Aren't there enough syllables out there to invent a combination
       | which doesn't collide with your own board partners?
        
         | newsclues wrote:
         | Strix https://en.wikipedia.org/wiki/Strix_(mythology) Halo is
         | the code name.
         | 
          | AMD Ryzen AI Max 300 is the product name. This article is
          | continuing to use the code name.
        
           | mort96 wrote:
           | Well, it's a public enough code name that it surprises me
           | that they just used Asus's name.
        
             | damnitbuilds wrote:
             | Confused me too.
             | 
             | Did AMD not know?
             | 
             | Or did AMD know and not care?
        
               | 1una wrote:
               | AMD makes exclusive deals with ASUS regularly. I guess
               | they're just good friends.
        
           | alecmg wrote:
           | oh great, they are not using a confusing name anymore...
           | wait, now they are using a stupid name!
        
         | bombcar wrote:
         | Spy X Family!
         | 
         | AMD is captured.
        
         | DCKing wrote:
         | I don't think AMD really uses the name "Strix Halo" to market
          | it to a large audience; it's just an internal codename. Two
          | other recent internal names are "Hawk Point" and "Dragon
          | Range", where Hawk and Dragon are names that MSI and
          | PowerColor use to market GPUs as well. Heck, PowerColor even
          | exclusively sells AMD cards under the "Red Dragon" name!
         | 
          | AMD's marketing names, especially for their mobile chips, are
         | just so deliberately confusing that it makes way more sense for
         | press and enthusiasts to keep referring to it by its internal
         | code name than whatever letter/number/AI nonsense AMD's
         | marketing department comes up with.
        
         | Keyframe wrote:
          | I understand it's an internal codename, but I also can't
          | read it without thinking Asus, especially considering I have
          | an Asus Strix 4090 in my rig.
        
       | Tepix wrote:
       | I think having a (small desktop) system with Strix Halo plus a
       | GPU to accelerate prompt processing could be a good combo,
       | avoiding the weakness of the Mac Ultra. The Strix Halo has 16
       | PCIe lanes.
        
         | aurareturn wrote:
         | Max RAM for Strix Halo is 128GB. It's not a competitor to the
         | Mac Ultra which goes up to 512GB.
         | 
          | You shouldn't need another GPU to do prompt processing for
          | Strix Halo, since the biggest model it can realistically run
          | is a 70B model. Offloading prompt processing won't help much
          | either: the iGPU is good enough, but the memory bandwidth is
          | only 256GB/s (~210 GB/s effective).
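          | 
          | As a back-of-the-envelope sketch (numbers assumed for
          | illustration, not benchmarks): single-stream decode is
          | memory-bound, so tokens/s is roughly the effective bandwidth
          | divided by the bytes streamed per token, which for a dense
          | model is about the size of the weights.
          | 
          |     # tokens/s ~= effective_bandwidth / model_bytes
          |     def decode_tokens_per_sec(params_b, bits_per_weight,
          |                               bandwidth_gbs):
          |         model_gb = params_b * bits_per_weight / 8
          |         return bandwidth_gbs / model_gb
          | 
          |     # 70B at ~4.5 bits/weight on ~210 GB/s effective:
          |     print(decode_tokens_per_sec(70, 4.5, 210))  # ~5 tok/s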
        
           | izacus wrote:
           | > Max RAM for Strix Halo is 128GB. It's not a competitor to
           | the Mac Ultra which goes up to 512GB.
           | 
           | What a... strange statement. How did you get to that
           | conclusion?
        
             | aurareturn wrote:
             | Why do you think it's strange?
        
               | izacus wrote:
                | The original poster arrogantly and confidently
                | proclaims that a device that costs like $2000 isn't
                | going to be able to compete against a $10000 SKU of
                | another device.
                | 
                | I'm wondering how you get to such a conclusion?
        
           | porphyra wrote:
           | The $2000 strix halo with 128 GB might not compete with the
           | $9000 Mac Studio with 512 GB but is a competitor to the $4000
           | Mac Studio with 96 GB. The slow memory bandwidth is a bummer,
           | though.
        
             | aurareturn wrote:
              | > but is a competitor to the $4000 Mac Studio with 96 GB.
              | > The slow memory bandwidth is a bummer, though.
             | 
             | Not really. The M4 Max has 2x the GPU power, 2.13x the
             | bandwidth, faster CPU.
             | 
             | $2000 M4 Pro Mini is more of a direct comparison. The Mini
                | only has 64GB max RAM but realistically, a 32B model is the
             | biggest model you want to run with less than 300 GB/s
             | bandwidth.
        
               | Tepix wrote:
               | You will be limited to a much smaller context size with
               | half the RAM even if you're using a smaller model.
        
           | Tepix wrote:
            | Running something like QwQ 32B q4 with a ~50k context will
            | use up those 128GB with the large KV cache.
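            | 
            | For scale, a sketch of the usual KV-cache sizing formula
            | (the hyperparameters below are assumed for illustration,
            | not taken from any model card):
            | 
            |     # KV cache bytes = 2 (K and V) * layers * kv_heads
            |     #                  * head_dim * context * bytes/elem
            |     def kv_cache_gb(layers, kv_heads, head_dim, ctx,
            |                     bytes_per_elem=2):  # fp16
            |         return (2 * layers * kv_heads * head_dim * ctx
            |                 * bytes_per_elem) / 1e9
            | 
            |     # e.g. 64 layers, 8 KV heads (GQA), head_dim 128:
            |     print(kv_cache_gb(64, 8, 128, 50_000))  # ~13 GB
            | 
            | Actual usage varies a lot with cache precision and GQA; a
            | model without GQA caches every attention head, which
            | multiplies that figure several times over.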
        
           | Gracana wrote:
           | Despite the hype, the 512GB Mac is not really a good buy for
           | LLMs. The ability to run a giant model on it is a novelty
           | that will wear off quickly... it's just too slow to run them
           | at that size, and in practice it has the same sweet spot of
           | 30-70B that you'd have with a much cheaper machine with a
           | GPU, without the advantage of being able to run smaller
           | models at full-GPU-accelerated speed.
        
             | SV_BubbleTime wrote:
             | There's so much flux in LLM requirements.
             | 
             | 2 to 3 tokens per second was actually probably fine for
             | most things last year.
             | 
             | Now, with reasoning and deep searching, research models,
             | you're gonna generate 1000 or more tokens just as it's
             | talking to itself to figure out what to do for you.
             | 
                | So while everyone's focused on how big a model you can
                | fit inside your RAM, inference speed is now more
                | important than it was.
        
               | Gracana wrote:
               | Absolutely.
               | 
               | The thinking models really hurt. I was happy with
               | anything that ran at least as fast as I could read, then
               | "thinking" became a thing and now I need it to run ten
               | times faster.
               | 
                | I guess code is tough too. If I'm talking to a model
                | I'll read everything it says, so 10-20 tok/s is all
                | well and good, but that's molasses slow if it's
                | outputting code and I'm scanning it to see if it looks
                | right.
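                | 
                | Rough numbers (assumptions, not measurements): at a
                | typical ~250 words/min reading pace and ~1.3 tokens
                | per word, reading speed is only about 5 tok/s.
                | 
                |     words_per_min = 250    # typical reading pace
                |     tokens_per_word = 1.3  # common rule of thumb
                |     print(words_per_min * tokens_per_word / 60)
                |     # ~5.4 tok/s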
        
               | adgjlsfhk1 wrote:
                | counterpoint: thinking models are good since they give
                | similar quality at smaller RAM sizes. If a 16B thinking
                | model is as good as a 60B one-shot model, you can use
                | more compute without as much of a RAM bottleneck.
        
               | terribleperson wrote:
               | Counter-counterpoint: RAM costs are coming down fast this
               | year. Compute, not so much.
               | 
               | I still agree, though.
        
             | aurareturn wrote:
             | It runs DeepSeek R1 q4 MoE well enough.
        
               | Gracana wrote:
               | It does have an edge on being able to run large MoE
               | models.
        
           | Tepix wrote:
            | Of course it's a competitor. Only a fraction of M3 Ultras
            | sold will have 512GB of RAM.
        
         | nrp wrote:
         | Note that none of the PCIe interfaces on Strix Halo are larger
         | than x4. The intent is to allow multiple NVMe drives and a Wi-
         | Fi card. We also used PCIe for 5Gbit Ethernet.
        
           | Scramblejams wrote:
           | Love what you're doing, I'm in batch 4!
           | 
           | Feedback: That 4x slot looks like it's closed on the end. Can
           | we get an open-ended slot there instead so we can choose to
           | install cards with longer interfaces? That's often a useful
           | fallback.
        
           | sunshowers wrote:
           | Hi Nirav! Long time admirer.
           | 
           | Gen 4 x4 or gen 5 x4? I saw that gen 5 x4 results in maybe a
           | 3% decrease in 5090 performance compared to gen 5 x16.
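            | 
            | For reference, per-direction PCIe throughput (standard
            | link rates with 128b/130b encoding for gen 4/5; my
            | arithmetic, not from the thread):
            | 
            |     def pcie_gbs(gt_per_s, lanes):
            |         return gt_per_s * 128 / 130 / 8 * lanes
            | 
            |     print(pcie_gbs(16, 4))  # gen 4 x4: ~7.9 GB/s
            |     print(pcie_gbs(32, 4))  # gen 5 x4: ~15.8 GB/s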
        
       | noelwelsh wrote:
       | For me the question is: what does this mean for future of desktop
       | CPUs? High bandwidth unified memory seems very compelling for
       | many applications, but the GPU doesn't have as much juice as a
       | separate unit. Are we going to see more these supposedly laptop
       | APUs finding their way into desktops, and essentially a
       | bifurcation of desktops into APUs and discrete CPU/GPUs? Or will
       | desktop CPUs also migrate to becoming APUs?
        
         | Tepix wrote:
         | iGPUs have been getting ever closer to entry level and even
         | mid-range GPUs.
         | 
          | In addition there's an interest in having a lot of memory
          | for LLM acceleration. I expect both CPUs to get more LLM
          | acceleration capabilities and desktop PC memory bandwidth to
          | increase from its current rather slow dual channel 64bit
          | DDR5-6000 status quo.
         | 
         | We're already hearing the first rumors for Medusa Halo coming
         | in 2026 with 50% more bandwidth than Strix Halo.
        
           | Gravityloss wrote:
           | This has been the case for decades now.
           | 
            | GPUs have existed for about 30 years. Embedded ones for 20
            | years or so? Why are the embedded GPUs always so stunted?
        
             | aurareturn wrote:
              | > Why are the embedded GPUs always so stunted?
              | 
              | Memory bandwidth. Besides LLMs, gaming on an iGPU will
              | always be more expensive for the same performance than on
              | a dedicated GPU, due to memory bandwidth.
             | 
              | Before someone points to consoles using iGPUs, keep in
              | mind that consoles use GDDR as their main system memory,
              | which has slow access times for the CPU. In a non-
              | console, CPU performance is important. GDDR is also
              | power hungry, so it can't be used as the main system RAM
              | in a laptop form factor.
        
               | pixelfarmer wrote:
               | > Memory bandwidth.
               | 
               | It is the thermal envelope that defines pretty much
               | everything nowadays. Without active management of it
               | chips would die a heat death very fast. Which also means
               | chips are designed with a certain chip external heat
               | management in mind. The more heat you can get out of a
               | system and away from a chip, the more powerful you can
               | design these things. And game consoles do have active
               | cooling, i.e. they sit between desktop PCs and thin
               | laptops, probably sharing the thermal handling capacity
               | with larger gaming laptops, if anything.
        
             | close04 wrote:
              | Just look at how a discrete GPU and an integrated GPU
              | compare in terms of size, power, cooling, and other
              | constraints like memory type and placement. That's why both
             | options still exist. If one size did it all, the other
             | option would just die out.
        
             | Plasmoid2000ad wrote:
             | I think the market is very limited for high end iGPUs in
             | practice with the compromises that occur with them.
             | 
             | On Desktop, upgradability is very popular and obviously the
             | returns from the cooling on discrete GPUs are immense. With
              | GPU dies costing so much, due to their size and dependency
              | on TSMC, pushing them faster but hotter is probably a
              | cost-efficient compromise.
             | 
              | On laptops with APUs, you currently usually give up
              | upgradeable memory - the fastest LPDDR is only soldered
              | on (today), and the fastest solution would be on-die
              | memory for bandwidth gains, which only really Apple is
              | doing.
             | 
             | Marketing wise, low core count Laptops appear to be hard to
             | sell. Gaming laptops seem to ship with more cores than the
             | desktop you would build - the CPU appears out-specced. I
              | think this is because CPUs are cheaper, but that means a
              | high-end APU would also need a large CPU to compete. Now
              | you've got a relatively unbalanced APU, with an expensive,
              | hot CPU and a relatively hot iGPU crammed in a small
              | space - cooling is now tricky.
             | 
             | This is going to be compared with cheap RTX 4060 laptops -
             | and generally look bad by comparison. I think what's
             | changing now to narrow the gap is Handhelds, and
             | questionable practices from Nvidia.
             | 
             | The Steam Deck kicked big OEMs into requesting AMD for
             | large APUs.
             | 
             | Nvidia seems to have influence on OEM AMD Laptops - Intel
             | CPU and Nvidia GPU for years now seem to ship first, in
             | larger quantities, and get marketing push despite CPU
             | arguably being worse.
             | 
              | Intel, despite their issues, seem to be raising the iGPU
              | bar too - their desktop GPU investment seems to be paying
              | off, and might be pressuring AMD to react.
        
             | shadowpho wrote:
             | >Why are the embedded GPU:s always so stunted?
             | 
              | Because GPUs want a lot of silicon. A 5080 is ~300 mm^2,
              | while a Ryzen 9xxx is ~50 mm^2.
              | 
              | Meanwhile the CPU wants that wafer space for itself. And
              | even if you use 100% of the wafer space for the GPU, you
              | will have a small GPU and no CPU.
        
           | Tepix wrote:
            | The sentence "In addition there's an interest in having a
            | lot of memory for LLM acceleration" was supposed to say "In
            | addition there's an interest in having a lot of memory
            | _bandwidth_ for LLM acceleration" but it's too late to edit
            | it now.
        
         | c2h5oh wrote:
          | APUs are going to replace low end video cards, because those
          | cards no longer make economic or technical sense.
         | 
         | Historically those cards had narrow memory bus and about a
         | quarter or less video memory of high end (not even halo) cards
         | from the same generation.
         | 
          | That narrow memory bus puts their max memory bandwidth at a
          | comparable level to desktop DDR5 with 2 DIMMs. At the same
          | time, a quarter of high end is just 4GB of VRAM, which is not
          | enough even for low details in many games and prevents
          | upscaling/frame gen from working.
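          | 
          | A quick sanity check of that bandwidth claim (a sketch; bus
          | widths and transfer rates assumed as typical examples):
          | 
          |     # peak GB/s = bus_bits / 8 * MT/s / 1000
          |     def peak_gbs(bus_bits, mt_s):
          |         return bus_bits / 8 * mt_s / 1000
          | 
          |     print(peak_gbs(64, 16000))  # 64-bit GDDR6: 128 GB/s
          |     print(peak_gbs(128, 6000))  # 2ch DDR5-6000: 96 GB/s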
         | 
          | From a manufacturing standpoint, low end GPUs aren't great
          | either - memory controllers, video output and a bunch of
          | other non-compute components don't scale with process node.
         | 
          | At the same time, unified memory and bypassing PCIe benefit
          | iGPUs greatly. You don't have to build an entire card, power
          | delivery, cooler - you just slightly beef up existing ones.
         | 
          | tl;dr: sub-$200 GPUs are dead and will be replaced by APUs. I
          | won't be surprised if they start nibbling at the lower mid-
          | range market too in the near future.
        
           | rcarmo wrote:
           | My main gaming rig (for admittedly not very intensive games)
           | has been a 7000 series Ryzen APU with a 780M, and my next one
            | will also be an APU. It makes zero economic sense to build
            | a discrete GPU system for casual gaming, even if I believe
            | that APU prices will be artificially inflated to "cozy up"
            | to low-end discrete GPU prices for a while to maximize
            | profits.
        
             | pjmlp wrote:
            | Which is why, for the games I play, a graphics workstation
            | laptop like the Thinkpad P series is much more useful,
            | including for GPGPU coding outside gaming, without being a
            | heavyweight circus laptop whose battery lasts half an hour.
        
         | sambull wrote:
          | That new 'desktop' from Framework appears to be just that,
          | with the AMD Ryzen AI Max 385.
        
           | nrp wrote:
           | We have both Max 385 and Max+ 395 versions.
        
             | foxandmouse wrote:
              | Any word on putting that in a mobile device? So far
              | there's only an HP business laptop and a gaming tablet...
              | neither of which appeals to the "MacBook crowd".
        
         | Symmetry wrote:
         | Having a system level cache for low latency transfer of data
         | between CPU and GPU could be very compelling for some
         | applications even if the overall GPU power is lower than a
         | dedicated card. That doesn't seem to be the case here, though?
        
           | noelwelsh wrote:
           | Strix Halo has unified memory, which is the same general
           | architecture as Apple's M series chips. This means the CPU
            | and GPU share the same memory, so there is no need to copy
            | data between CPU and GPU.
        
         | phkahler wrote:
         | >> Are we going to see more these supposedly laptop APUs
         | finding their way into desktops, and essentially a bifurcation
         | of desktops into APUs and discrete CPU/GPUs?
         | 
          | I sure hope so. We could use a new board form factor that
          | omits the GPU slot. My case puts the power connector and
          | button over that slot on the back, so the slot isn't
          | completely wasted, but the board area is. This has seemed
          | like a good idea to me for a long time.
         | 
         | This can also be a play against nVidia. When mainstream systems
         | use "good enough" integrated GPUs and get rid of that slot,
         | there is no place for nVidia except in high-end systems.
        
           | adrian_b wrote:
           | There is no need for a new board form factor, because they
           | have existed for many decades.
           | 
           | Below the mini-ITX format with a GPU slot, there are 3
           | standard form factors that are big enough for a full-featured
           | personal computer that is more powerful than most laptops:
           | nano-ITX (120 mm x 120 mm, for 5" by 5" cases; half the area
           | of mini-ITX), 3.5" (from the size of the 3.5 inch HDDs,
            | approximately the same area as nano-ITX, but rectangular
           | instead of square) and the 4" x 4" NUC format introduced by
           | Intel.
           | 
            | With a nano-ITX or 3.5" board you can make a computer no
            | bigger than 1 liter that stays quiet even at 65 W of power
            | dissipation for the CPU+iGPU, and that can have a generous
            | set of peripheral ports to cover all needs.
           | 
            | Keeping the low-noise condition, one could increase the
            | maximum power dissipation to 150 W for the CPU+iGPU in a
            | somewhat bigger case, but certainly still smaller than 2.5
            | liters.
           | 
            | I expect that we will see such mini-PCs with Strix Halo;
            | the only question is whether their price will be low
            | enough to make them worthwhile.
           | 
           | The fabrication cost for Strix Halo must be lower than for a
           | combo of CPU with discrete GPU, but the initial offerings
           | with it attempt to make the customer pay more for the benefit
           | of having a more compact system, which for many people will
           | not be enough motivation to accept a higher price.
        
         | icegreentea2 wrote:
         | The bifurcation is already happening. The last few years have
         | seen lots of miniPC/NUC like products being released.
         | 
         | One of (many) factors that were holding back this form factor
         | was the gap in iGPU/GPU performance. However with the frankly
         | total collapse of the low end GPU market in the last 3-4 years,
         | there's a much larger opening for iGPUs.
         | 
         | I also think that within the gaming space specifically, a lot
         | of the chatter around the Steam Deck helped reset expectations.
         | Like if everyone else is having fun playing games at 800p
         | low/medium, then you suddenly don't feel so bad playing at
         | maybe 1080p medium on your desktop.
        
         | adra wrote:
          | Framework made a tiny desktop form factor version with this
          | chip in it, so we'll see if it gets much traction (at least
          | among enthusiasts).
        
         | juancn wrote:
         | I would love a unified memory architecture, even for external
         | GPUs.
         | 
         | Pay for memory once, and avoid all the copying around between
         | CPU/GPU/NPU for mixed algorithms, and have the workload define
         | the memory distribution.
        
         | DCKing wrote:
         | Strix Halo is impressive, but it isn't AMD going all out on the
         | concept. Strix Halo's die area (300mm2 ish) is roughly the same
         | as estimates for Apple's M3 Pro die area. The M3 Max and M3
         | Ultra are twice or four times the size.
         | 
          | In a next iteration AMD could look into doubling or
          | quadrupling the memory channels and GPU die area as Apple has
          | done. AMD is already a pioneer in the chiplet technology
          | Apple is also using to scale up. So there's lots of room to
          | grow, at even higher cost.
        
       | ryukoposting wrote:
       | Interesting read, and interesting product. If I understand it
       | right, this seems like it could be at home in a spiritual
       | successor to the Hades Canyon NUCs. I always thought those were
       | neat.
       | 
       | I wish Chips and Cheese would clean up transcripts instead of
       | publishing verbatim. Maybe I'll use the GPU on my Strix Halo to
       | generate readable transcripts of Chips and Cheese interviews.
        
         | keyringlight wrote:
         | The Framework desktop seems like a next step.
         | 
          | Although I appreciate the drive for a small profile, I wonder
          | where the limits are if you put a big tower cooler onto it;
          | seeing as the broad design direction is for laptops or
          | consoles, I doubt there's too much left on the table. I think
          | that highlights a big challenge - is there a sizeable enough
          | market for it, or can you pull in customers from other
          | segments to buy a NUC instead? You'd need a certain amount of
          | mass manufacturing with a highly integrated design to make it
          | worthwhile.
        
           | jorvi wrote:
           | > can you pull in customers from other segments to buy a NUC
           | instead
           | 
            | I've never understood the hype for NUCs in non-office
            | settings. You can make SFF builds that are tiny and still
            | fit giant GPUs like the RTX 3090/4090, to say nothing of
            | something like a 4080 Super. And then you can upgrade the
            | GPU and (woe is you) CPU later on. Although a high-end X3D
            | will easily last you 2-3 GPU generations.
        
             | woodrowbarlow wrote:
              | I feel like high-end mini-ITX builds only became viable a
              | few years ago with the introduction of 700W+ SFX PSUs.
        
             | kccqzy wrote:
             | The size of NUCs is much smaller than any SFF builds with
             | RTX 3090. Some people just like smallness.
        
               | bee_rider wrote:
               | Closer to a phone than a laptop, in size!
        
             | layer8 wrote:
             | You can't mount an SFF build unobtrusively behind a monitor
             | or under a desk, it's much larger and heavier than a NUC.
        
               | phkahler wrote:
               | Agreed, although this is the smallest I've seen:
               | 
               | https://github.com/phkahler/mellori_ITX/blob/master/image
               | s/m...
               | 
               | It's currently got a 5700G - Zen 3 in it and 64GB RAM.
               | I'd like the next one to hang on the back of a monitor or
               | TV via the standard mounting holes.
        
             | bee_rider wrote:
             | You could fit a NUC in a pair of cargo shorts, FWIW. Or
             | many bicycle under-seat bags, which was nice for biking to
             | school without needing any backpack. They were in a sort
             | of... qualitatively smaller size class than laptops.
        
         | pixelpoet wrote:
         | Yeah would it have killed them to read over it just once? Can
         | they not find a single school kid to do it for lunch money or
         | something? Hell I'll do it for free, I've read this article
         | twice now, and read everything they put out the moment it hits
         | my inbox.
        
       | randomNumber7 wrote:
        | As long as they can't even provide something similar to a
        | simple CUDA C API on consumer hardware, I don't buy their
        | stuff.
        
         | pjmlp wrote:
          | There is no such thing as a simple CUDA C API; that is the
          | mistake most folks make when talking about CUDA.
         | 
         | It won over OpenCL, because it is a polyglot ecosystem, with
         | first tier support for C, C++, Fortran, and Python (JIT DSL),
         | plus several languages that have toolchains targeting PTX, the
         | IDE integration, graphical debugger, compute and graphical
         | rendering libraries.
         | 
         | All of the above AMD and Intel could have provided for OpenCL,
         | but never did when it mattered, not even after SPIR was
         | introduced.
         | 
          | Now they finally have GPGPU support for Fortran, C++, and
          | Python JIT DSLs, but a bit too late to the party, because
          | contrary to NVidia, those tools aren't available regardless
          | of the card.
        
           | randomNumber7 wrote:
            | The early versions were C only. Then they added a lot of
            | stuff.
           | 
           | You don't need all the fancy stuff, but OpenCL (and even more
           | so Vulkan) are too complicated when all you want to do is
           | some gpu number crunching.
           | 
            | Being able to write a kernel in something that looks like
            | C, having pointers on GPU and CPU, and being able to call
            | these kernels somewhat conveniently (like CUDA C) would be
            | a great starting point.
        
             | pjmlp wrote:
              | Early meaning until CUDA 3.0 in 2010; we are now on CUDA
              | 12.8, 15 years later.
        
       | heraldgeezer wrote:
        | Cool I guess for a mini PC, but I'm one of those desktop PC
        | tower nerds :)
        
         | hulk-konen wrote:
         | I hope they make this in ATX (or mATX) form factor, toss out
         | all the size, energy, and heat concerns, and add more ports and
         | interfaces.
        
       | FloatArtifact wrote:
        | Seems like Apple's M2 is a sweet spot for AI performance at 800
        | GB/s of memory bandwidth, which can be had under $1,500
        | refurbished with 64 gigs of RAM.
        
         | crazystar wrote:
         | Where for $1500?
        
           | runjake wrote:
           | Not on Apple Refurbs. That would cost you about $2200.
           | 
           | And the M2 Max has a memory bandwidth of 400GB/s.
        
             | sroussey wrote:
             | I'm guessing a reference to M2 Ultra? Not sure about that
             | price though...
        
               | runjake wrote:
               | M2 Ultra refurb was over $4,000, last I checked.
        
       | swiftcoder wrote:
       | I don't really like these "lightly edited" machine transcripts.
        | There are transcription errors in many paragraphs, which just
        | adds that little bit of extra friction when reading.
        
       | elorant wrote:
        | Why did they choose to build this as a mobile CPU, though? I
        | don't need 128GB of unified RAM on my laptop. It's the desktop
        | where things happen.
        
         | cptskippy wrote:
          | Because people are accustomed to unified memory in laptops
          | and also complain about the low amounts of RAM and inability
          | to upgrade.
         | 
         | This solves those problems but apparently uncovers a new one.
        
           | ForTheKidz wrote:
           | > Because people are accustom to unified memory in laptops
           | 
              | Surely the vast majority of laptops sold in the last
              | _five years_ don't have unified memory yet.
        
             | zamadatix wrote:
             | I've never seen a good technical comparison showing what's
             | new between "Unified Memory" vs traditional APUs/iGPUs
             | memory subsystems laptops have had for over a decade, only
             | comparisons to dGPU setups which are rarer in laptops. The
             | biggest differences comparing Apple Silicon or Strix Halo
             | to their predecessors seems to be more about the overall
             | performance scale, particularly of the iGPU, than the way
             | memory is shared. Articles and blogposts most commonly
             | reference:
             | 
             | - The CPU/GPU memory are shared (does not have to be
             | dedicated to be used by either).
             | 
             | - You don't need to copy data in memory to move it between
             | the CPU/GPU.
             | 
             | - It still uses separate caches for the CPU & GPU but the
             | two are able to talk to each other directly on the same die
             | instead of an external bus.
             | 
              | But these have long been true of traditional APUs/iGPUs,
              | not new changes. I even saw some claims that Apple puts
              | the memory on die and that's what makes it unified, but
              | on checking, it actually seems to be "on package", which
              | isn't unique either and wouldn't explain any differences
              | in access patterns anyway. I've been particularly
              | confused as to why Strix Halo would now qualify as having
              | Unified Memory when nothing seems to be different than
              | before, save the performance.
             | 
             | If anyone has a deeper understanding of what's new in the
             | Unified Memory approach it'd be appreciated!
        
               | kbolino wrote:
               | I believe, but don't know for sure, that classic iGPUs
               | still behaved like discrete PCI devices under the hood,
               | and accessed RAM using DMA over PCI(e), which is slower
               | than the RAM is capable of, and also adds some overhead.
               | Whereas, modern unified memory approaches have a direct
               | high-bandwidth connection between RAM and the GPU, which
               | may be shared with the CPU, bypassing PCI entirely and
               | accessing the memory at its full speed.
        
             | wmf wrote:
             | Yes, around 90% of laptops sold in the last ten years have
             | unified memory.
        
         | aurareturn wrote:
         | Because desktops are a much smaller market and AMD caught the
         | Apple Silicon FOMO.
        
           | alienthrowaway wrote:
           | IMO, the likely cause is AMD capitalizing on multiple OEMs
           | having Steam-Deck envy and/or setting the foundation for the
           | Steam Deck 2 with near-desktop graphics fidelity rather than
           | 800p medium/low settings users have to put up with.
        
         | bangaladore wrote:
          | Competing with Apple is my guess. There is a decent market
          | for super high end laptops.
         | 
         | Framework (I believe) made one of these into a purchasable
         | desktop.
        
         | ThatMedicIsASpy wrote:
            | I wonder if you could actually put these into a socket, or
            | whether issues would occur.
        
           | neogodless wrote:
           | While I think the market is small, and they don't release a
           | lot of these, AMD has sold desktop / socketed APUs in the
           | past.
           | 
            | They tend to come out much later than the laptop chips, or
            | the "CPU-only" desktop chips.
           | 
           | This is one of the more recent examples: https://www.amd.com/
           | en/products/processors/desktops/ryzen/80...
        
         | jchw wrote:
        | People keep saying "to compete with Apple" which, of course,
        | is nonsense. Apple isn't even second or third place in laptop
         | marketshare last I checked.
         | 
         | So why build powerful laptops? Simple: people want powerful
         | laptops. Remoting to a desktop isn't really a slam dunk
         | experience, so having sufficient local firepower to do real
         | work is a selling point. I do work on both a desktop and a
         | laptop and it's nice being able to bring a portable workstation
         | wherever I might need it, or just around the house.
        
           | dietr1ch wrote:
           | This is a really good point. It's not easy to use both a
            | laptop and a desktop at the same time. There are challenges
            | around locality, latency, limited throughput, and
            | unavailability that software can't easily deal with, so you
            | need to be aware and smart about it, and you'll need to
            | compromise on things.
           | 
           | I'd work from my workstation at all times if I could. Tramp
           | is alright, but not too fast and fundamentally can't make
           | things transparent.
        
         | icegreentea2 wrote:
          | 128GB is actually a step down. The previous generation (of
          | sorts), Strix Point, had a maximum memory capacity of 256GB.
         | 
         | The mini-PC market (which basically all uses laptop chips)
         | seems pretty robust (especially in Asia/China). They've
         | basically torn out the bottom of the traditional small form
         | factor market.
        
         | alienthrowaway wrote:
         | "Mobile CPU" has recently come to mean more than laptops. The
         | Steam Deck validated the market for handheld gaming computers,
         | and other OEMs have joined the fray. Even Microsoft intends to
          | release an XBox-branded portable. I think there's a market
          | opportunity for better-than-800p handheld gaming, and Strix
          | Halo is perfectly positioned for it - I wouldn't bet against
          | the handheld XBox running on this very processor.
        
       | zbrozek wrote:
       | I really want LPDDR5X (and future better versions) to become
       | standard on desktops, alongside faster and more-numerous memory
       | controllers to increase overall bandwidth. Why hasn't CAMM gotten
       | anywhere?
       | 
        | I _also_ really want an update to the typical form factors and
        | interconnects of desktop computers. They've been roughly frozen
        | for decades. Off the top of my head:
       | 
       | - Move to single-voltage power supplies at 36-57 volts.
       | 
       | - Move to bladed power connectors with fewer pins.
       | 
       | - Get rid of the "expansion card" and switch to twinax ribbon
       | interconnects.
       | 
       | - Standardize on a couple sizes of "expansion socket" instead,
       | putting the high heat-flux components on the bottom side of the
       | board.
       | 
       | - Redesign cases to be effectively a single ginormous heatsink
       | with mounting sockets to accept things which produce heat.
       | 
       | - Kill SATA. It's over.
       | 
       | - Use USB-C connectors for both power and data for internal
       | peripherals like disks. Now there's no difference between
       | internal and external peripherals.
        
         | gjsman-1000 wrote:
         | > Why hasn't CAMM gotten anywhere?
         | 
         | Framework asked AMD if they could use CAMM for their new
         | Framework Desktop.
         | 
         | AMD actually humored the request and did some engineering, with
         | simulations. According to Framework, the memory bandwidth on
         | the simulations was _less than half_ of the soldered version.
         | 
         | This completely defied the entire point of the chip - the
         | massive 256 bit bus ideal for AI or other GPU-heavy tasks,
         | which allows this chip to offer the features it does.
         | 
          | This is also why Framework has apologized for the non-
          | upgradability, but said it can't be helped, so enjoy fair and
          | reasonable RAM prices. Previously, it had been speculated
          | that CAMM had a performance penalty, but Framework's engineer
          | saying on video that it was that bad was fairly shocking.
        
           | arghwhat wrote:
           | I do not believe they were asking for CAMM as replacement for
           | soldered RAM, but as an upgrade for DIMMs in desktop.
           | 
            | CAMM is touted as being better than DIMMs when it comes to
            | signal integrity and possible speed. Soldered of course
            | beats any socket, in-package beats any soldered RAM, and
            | on-die beats any external component.
           | 
            | That AMD Strix Halo is unable to maintain signal integrity
            | for any socketed RAM is a Strix Halo problem, not a socket
            | problem. They probably backed themselves a bit into a
            | corner, with other parts of the design sacrificing
            | tolerances on the memory side, and it's a lot easier to
            | push motherboard design requirements than to redo a chip.
           | 
            | If this _wasn't_ a Strix Halo issue, then they would have
            | been able to run with socketed memory at a lower memory
            | clock. All CPUs, this one included, have variable memory
            | clocks that could be utilized, and they perform memory
            | training since even the PCB traces to the chip cause
            | significant signal degradation.
        
             | kimixa wrote:
              | For signal integrity issues, increasing the link power
              | can often overcome some of the issues caused by longer
              | traces and connectors in the line. While that's less of
              | an issue for desktop devices, it goes against the ideal
              | of a low-powered device with limited cooling. Doubly so
              | as it's hard to re-clock on the timescales needed for
              | intermittent-use power saving, so it will be using that
              | extra power when idle.
             | 
             | I suspect the earlier comment about "Half the performance
             | with CAMM" is likely at iso-power, but that might still be
             | a pretty big dealbreaker.
        
               | arghwhat wrote:
               | More power is to overcome switching losses and parasitic
               | reactances. You can increase drive strength up to a limit
               | to overcome this, but a slight clock reduction will make
               | things work at the same power.
               | 
                | CPUs and GPUs reclock extremely fast to my knowledge,
                | but what we're talking about isn't dynamic reclocking,
                | just limiting the max clock as suitable for the system
                | design.
               | 
                | We already see this with laptop silicon that runs
                | faster clocks when using soldered RAM compared to when
                | the same silicon is using socketed counterparts.
               | 
                | That this wasn't an option probably means that they're
                | either far too close to the limit, or unwilling to
                | allow a design that runs below max speed.
        
           | sunshowers wrote:
           | I'm curious how much the CUDIMM thing Intel is doing, where
           | the RAM has its own clock, can help in the CAMM context. The
           | Zen 4/5 memory controller doesn't support it but a future one
           | might.
        
           | Tuna-Fish wrote:
           | The problem was specifically routing the 256-bit LPDDR5X out
           | of the chip into the CAMM2 connector. This is hard to do with
           | such a wide bus, because LPDDR5X wasn't originally designed
           | for it.
           | 
              | LPDDR6X is designed for it, and can use CAMM2.
        
             | gjsman-1000 wrote:
              | Judging by the fact that LPDDR5X was announced in 2019,
              | and LPDDR6X was just announced in 2024... we're still a
              | full laptop/desktop cycle away.
        
         | simoncion wrote:
         | > - Move to single-voltage power supplies at 36-57 volts.
         | 
         | Why? And why not 12V? Please be specific in your answers.
         | 
         | > - Get rid of the "expansion card" and switch to twinax ribbon
         | interconnects.
         | 
         | If you want that, it's available right now. Look for a product
         | known as "PCI Express Riser Cable". Given that the "row of
         | slots to slot in stiff cards" makes for nicely-standardized
         | cases and card installation procedures that are fairly easy to
         | understand, I'm sceptical that ditching slots and moving to
         | riser cables for everything would be a benefit.
         | 
         | > - Kill SATA. It's over.
         | 
         | I disagree, but whatever. If you just want to reduce the number
         | of ports on the board, mandate Mini SAS HD ports that are wired
         | into a U.2 controller that can break each port out into four
         | (or more) SATA connectors. This will give folks who want it
         | very fast storage, but also allow the option to attach SATA
         | storage.
         | 
         | > - Use USB-C connectors for both power and data for internal
         | peripherals like disks.
         | 
         | God no. USB-C connectors are fragile as all hell and easy to
         | mishandle. I hate those stupid little almost-a-wafer blades.
         | 
         | > - Standardize on a couple sizes of "expansion socket"
         | instead...
         | 
         | What do you mean? I'm having trouble envisioning how any
         | "expansion socket" would work well with today's highly-
         | variably-sized expansion cards. (I'm thinking especially of
         | graphics accelerator cards of today and the recent past, which
         | come in a very large array of sizes.)
         | 
         | > - Redesign cases to be effectively a single ginormous
         | heatsink with mounting sockets...
         | 
         | See my questions to the previous quote above. I currently don't
         | see how this would work.
        
           | arghwhat wrote:
           | > Why? And why not 12V? Please be specific in your answers.
           | 
            | Higher voltages improve transmission efficiency,
            | particularly through connectors, as long as sufficient
            | insulation is easy to maintain. Datacenters are looking at
            | 48V for a reason.
           | 
           | Nothing comes for free though, and it makes for slightly more
           | work for the various buck converters.
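            | 
            | The underlying arithmetic (a sketch using the standard
            | I^2*R conductor-loss relation): at fixed delivered power,
            | current scales as P/V, so loss in cables and connectors
            | scales as P^2*R/V^2 - 4x the voltage means 1/16th the
            | loss.
            | 
            |     # loss = (P/V)^2 * R for delivered power P at volts V
            |     def conductor_loss_w(power_w, volts, r_ohm):
            |         return (power_w / volts) ** 2 * r_ohm
            | 
            |     # 300 W through 10 milliohms of cable + connector:
            |     print(conductor_loss_w(300, 12, 0.01))  # ~6.3 W
            |     print(conductor_loss_w(300, 48, 0.01))  # ~0.4 W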
           | 
           | > God no. USB-C connectors are fragile as all hell and easy
           | to mishandle. I hate those stupid little almost-a-wafer
           | blades.
           | 
            | They are numerous orders of magnitude more rugged than any
            | internal connector you've used - most of those are only
            | designed to handle insertion a handful of times (sometimes
            | connectors even only work once!), vs. ten thousand times
            | for the USB-C connector. In that sense, a locking USB-C
            | connector would be quite superior.
           | 
            | ... on that single metric. It would be ridiculously
            | overcomplicated, driving up part costs when a trivial and
            | stupidly cheap connector can do the job sufficiently.
            | Having to run off 48V to push 240W, with no further power
            | budget at all, also increases complexity and cost and adds
            | limitations.
           | 
           | USB-C is meant for end-user things where everything has to be
           | crammed into the same, tiny connector, where it does great.
        
           | wtallis wrote:
           | Graphics cards have finally converged on all using about the
           | same small size for the PCB. The only thing varying is the
           | size of the heatsink, and due to the inappropriate nature of
           | the current legacy form factor (which was optimized for large
           | PCBs) the heatsinks grow along the wrong dimension and are
           | louder and less effective than they should be.
        
         | wmf wrote:
         | There's a rumor that future desktops will use LPDDR6 (with
        | CAMMs presumably) instead of DDR6. Of course CAMMs will be
        | slower, so they might "only" run at ~8000 MT/s while soldered
        | LPDDR6 will run at >10000.
        
           | Tuna-Fish wrote:
           | LPDDR6 won't go that low, even on CAMM2. The interface is
           | designed for up to 14.4Gbps, with initial modules aiming for
           | 10.6Gbps.
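            | 
            | For scale, assuming a Strix-Halo-like 256-bit bus (my
            | assumption, just to put those per-pin rates in GB/s):
            | 
            |     bus_bytes = 256 / 8
            |     print(bus_bytes * 10.6)  # ~339 GB/s at 10.6 GT/s
            |     print(bus_bytes * 14.4)  # ~461 GB/s at 14.4 GT/s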
        
       | runjake wrote:
       | When, if ever, will this be released as a bare
       | processor/memory/motherboard combination that I can buy and throw
       | in my own case?
       | 
       | Does anyone know?
        
         | colejohnson66 wrote:
         | Framework Desktop
         | 
         | https://frame.work/products/desktop-diy-amd-aimax300
        
         | Timshel wrote:
          | Framework is selling the board as a standalone:
         | https://frame.work/fr/en/products/framework-desktop-mainboar...
         | 
          | Too bad there isn't a full PCIe slot (there might not be
          | enough bandwidth left) :(.
        
           | runjake wrote:
            | I was looking just last night to see if they sold only the
            | motherboard, and failed. Thanks!!
           | 
           | $1,299 (64GB) and $1,599 USD (128GB) for the motherboards.
           | Yikes, but I get why.
        
         | wmf wrote:
         | Here's one motherboard: https://frame.work/products/framework-
         | desktop-mainboard-amd-...
         | 
         | I wouldn't be surprised if Minisforum also offers a
         | motherboard.
        
       | sourtrident wrote:
       | Fascinating how Strix Halo feels like AMD's spiritual successor
       | to their ATI merger dreams - finally delivering desktop-class
       | graphics and CPU power in a genuinely portable form factor. Can't
       | wait to see where it pushes laptop capabilities.
        
       ___________________________________________________________________
       (page generated 2025-03-14 23:01 UTC)