[HN Gopher] Intel completely disables AVX-512 on Alder Lake afte...
___________________________________________________________________
Intel completely disables AVX-512 on Alder Lake after all
Author : pantalaimon
Score : 327 points
Date : 2022-01-07 11:31 UTC (11 hours ago)
(HTM) web link (www.igorslab.de)
(TXT) w3m dump (www.igorslab.de)
| alkonaut wrote:
| Do both E and P cores handle AVX instructions or are some
| instructions specific to P-cores? In that case, how does a
| processor with heterogeneous cores deal with processes (or, how
| does the OS/driver do it in automatic mode when affinity isn't
| guided)?
| PragmaticPulp wrote:
| Nope! You have to disable the efficiency cores to enable
| AVX-512. Between that and the relative scarcity of AVX-512
| code, this isn't really a big loss. It wasn't like it was a
| free feature.
| avx512 wrote:
| The elephant in the room is the classic CPU architecture
| mistake where you use two different instruction sets on your
| performance and efficiency cores. There is no way to prevent
| code with AVX-512 instructions from being executed on an
| unsupported core. This is the exact same problem that plagued
| early ARM big.LITTLE designs that used ARMv8.2 on the
| performance cores but ARMv8.0 on the efficiency cores, forcing
| all code to use the more restrictive instruction set and
| crippling the performance benefits of the "big" cores.
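| As a concrete illustration of why this matters, here is a
| minimal sketch (Linux-only, x86, assuming the kernel exposes
| per-CPU feature flags in /proc/cpuinfo) that reports which
| logical CPUs advertise AVX-512 Foundation support:
|
|   # avx512_per_cpu.py - list which logical CPUs claim avx512f
|   def per_cpu_flags(path="/proc/cpuinfo"):
|       cpus, current = {}, None
|       for line in open(path):
|           key, _, value = line.partition(":")
|           key, value = key.strip(), value.strip()
|           if key == "processor":
|               current = int(value)
|           elif key == "flags" and current is not None:
|               cpus[current] = set(value.split())
|       return cpus
|
|   for cpu, flags in sorted(per_cpu_flags().items()):
|       print(f"cpu{cpu}: avx512f={'avx512f' in flags}")
|
| If the flag sets ever differed between core types, code that
| checked CPUID once on a "big" core and was later migrated to a
| "little" core would fault on the missing instructions, which is
| exactly the problem described above.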
| Iwan-Zotow wrote:
| Yeah, it killed heterogeneous MPI clusters before as well
| nottorp wrote:
| Intel has always been doing artificial market segmentation. I
| would only be surprised if they somehow release a cpu generation
| with the same features across all models...
| fomine3 wrote:
| Intel never sold Alder Lake with AVX-512 support; only the
| underlying P-core technically supported it, and they failed to
| completely disable it. I believe they didn't validate AVX-512
| functionality because they don't support it anyway. It's not
| fair to blame them for disabling it.
| sebow wrote:
| igammarays wrote:
| Between shenanigans like this and Spectre/Meltdown mitigations
| giving me a ~20% performance hit [1], I avoid Intel like the
| plague if I have the choice.
|
| [1] https://arxiv.org/pdf/1807.08703.pdf
| mhh__ wrote:
| Pretty crap reasoning then. Intel never marketed these as
| having AVX-512, and unless you are running a Xeon from 2012
| like the paper you've linked, a modern CPU (note: any desktop-
| class CPU) will only be vulnerable to some bits of Spectre and
| not Meltdown.
| xondono wrote:
| My 7700K (2017) got a similar downgrade in performance from
| the mitigations.
|
| Switched to AMD for the first time in my life, no regrets so
| far.
| mhh__ wrote:
| Meltdown hadn't been (publicly) found until 2018 so the
| same applies. Intel's current offerings are extremely
| competitive with AMD on many workloads.
| pantalaimon wrote:
| mitigations=off
|
| is IMHO a safe thing to do on a single-user desktop unless
| you have nation-state level adversaries
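| For anyone weighing that trade-off, the kernel reports the
| status of each mitigation under sysfs. A minimal sketch
| (Linux-only, assuming a kernel new enough to expose
| /sys/devices/system/cpu/vulnerabilities):
|
|   # show_mitigations.py - print the kernel's view of each CPU
|   # vulnerability and whether/how it is mitigated
|   from pathlib import Path
|
|   VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")
|
|   for entry in sorted(VULN_DIR.iterdir()):
|       # each file contains e.g. "Mitigation: PTI",
|       # "Vulnerable", or "Not affected"
|       print(f"{entry.name:24} {entry.read_text().strip()}")
|
| Booting with mitigations=off should flip most of these entries
| to "Vulnerable", which makes it easy to confirm the flag took
| effect.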
| SkeuomorphicBee wrote:
| Stuff like this is why I plan to never buy Intel again. I
| always disliked Intel's strategy of market segmentation by
| disabling specific instructions; I much prefer AMD's strategy
| of segmenting just by speed and number of cores. It is really
| annoying that with Intel you can't run the same program on the
| server and on your desktop; it makes development and testing a
| huge pain.
| flyinghamster wrote:
| The Core 2 Quad Q8200 was what did it for me. I had intended to
| build a system to play with virtual machines, only to find
| after I built it up that the Q8200 didn't have VT-x - something
| that was NOT noted in the store display.
|
| I put it in an Intel-branded motherboard as well, and that
| turned out to be one of the few motherboards I've had that
| failed prematurely.
|
| Since then, the only Intel machines I've bought have been used
| ThinkPads.
| fomine3 wrote:
| Unlocked K CPUs didn't support VT-d until Skylake. That was
| weird segmentation.
| timpattinson wrote:
| to be fair, that's unavoidable with AVX-512, you can't
| underestimate how much power it draws compared to non-
| vectorised instructions. That's why Prime95 is such an
| effective stress test and draws 150% of the power that
| Cinebench R20 does for example.
| ncmncm wrote:
| You can, in fact, underestimate that, and most people who
| guess do.
| [deleted]
| ineedasername wrote:
| Didn't AMD disable working cores on some chips and brand the
| exact same chips, with those cores enabled, as higher end? [0]
|
| I know the explanation in some cases was that the cores didn't
| pass QC, but I also never came across stories of widespread
| problems when these cores were re-enabled by end users.
|
| If Intel is playing games here, I'm not sure they're unique to
| Intel.
|
| [0] https://www.techradar.com/news/computing-
| components/processo...
|
| [1] partial list of prior unlockable AMD chips:
| https://docs.google.com/spreadsheets/d/19Ms49ip5PBB7nYnf5urx...
| marcan_42 wrote:
| Don't forget it's not just instruction sets; Intel is the
| reason we don't have ECC RAM on desktops. Every other high
| density storage technology has used error correction for a
| decade or two, but we're still sitting here pretending we can
| have 512 billion bits of perfect memory sitting around that
| will never go wrong, because Intel fuse it off on desktop
| chips. I guess only servers need to be reliable.
|
| AMD supports ECC on their consumer chips, but without Intel
| support it's never taken off and some motherboards don't
| support it, or if they do it's not clear in the documentation.
| I do use ECC RAM on my Threadripper machine and it does work,
| but I had to look for third party info on whether it would and
| dig around DMI and EDAC info to convince myself it was really
| on. It also makes it safer to overclock RAM since you get
| warnings when you're pushing things too far, before outright
| failures. And it helps with Rowhammer mitigation.
|
| Apple M1s don't do ECC in the memory controller as far as I can
| tell, but at least they have a good excuse: you can't sensibly
| do ECC with 16-bit LPDDR RAM channels. There's no such excuse
| for 64/72-bit DIMM modules. I do hope we work out a way to make
| ECC available on mobile/LPDDR architectures in the future,
| though. Probably with something like in-RAM-die ECC (which for
| all I know might already be a thing on M1s; we don't have all
| the details).
| rbobby wrote:
| > AMD supports ECC on their consumer chips
|
| And now the next desktop consumer upgrade I purchase will be
| AMD and will have ECC (well... unless it's way more
| expensive).
| ClumsyPilot wrote:
| "pretending we can have 512 billion bits of perfect memory
| sitting around that will never go wrong, because Intel fuse
| it off on desktop chips"
|
| I think computers are now so important to our life, we need
| to start regulating them like we do cars.
|
| Start seriously slapping companies that deliberately or
| negligently release equipment with obsolete kernels and
| security holes, mandate ECC like we mandate ABS, mandate parts
| availability for 10 years like we do with cars, etc.
|
| Every day we let this slide, thousands of people lose precious
| data and the number of 'smart' toasters mining crypto
| increases.
| kmeisthax wrote:
| My main worry with this sort of thing is that if we start
| mandating legal liability, and security becomes a
| compliance line-item, then companies are going to start
| locking down everything they ship so they have a legal
| defense in court. The argument's going to be, "if we are
| liable for shipping insecure desktops then you shouldn't be
| allowed to install Linux onto them and then sue us when you
| get hacked".
|
| Think about how many laptops ship with Wi-Fi whitelists
| with the excuse of "FCC certification". It doesn't matter
| that the FCC doesn't actually prohibit users from swapping
| out Wi-Fi cards; manufacturers will do it anyway.
| formerly_proven wrote:
| > Don't forget it's not just instruction sets; Intel is the
| reason we don't have ECC RAM on desktops. Every other high
| density storage technology has used error correction for a
| decade or two, but we're still sitting here pretending we can
| have 512 billion bits of perfect memory sitting around that
| will never go wrong, because Intel fuse it off on desktop
| chips. I guess only servers need to be reliable.
|
| And not just storage - the _main memory bus_ is the _only_
| data bus in a modern computer that doesn't use some form of
| error correction or detection. Even USB 1.0 has a checksum.
| So everywhere else we use ECC/FEC or at least a checksum, be
| it PCIe, SATA, or USB; all storage devices, as you mentioned,
| rely heavily on FEC, and all CPU caches use ECC. Except the
| main memory and its bus, which all data (eventually) moves
| through. D'uh.
| marcan_42 wrote:
| Yup. PCIe will practically run over wet string, thanks to
| error detection and retransmits and other reasons, but try
| having a marginal DRAM bus and see how much fun that is...
| vanderZwan wrote:
| Could be a fun way to test and demonstrate robustness of
| various parts of computer hardware, actually. It's
| already been done with ADSL for example:
|
| [0] https://www.revk.uk/2017/12/its-official-adsl-works-
| over-wet...
| marcan_42 wrote:
| My comment was actually a self quote from a talk I gave
| about PS4 hacking where I described that PCIe will
| happily run over bare soldered wires without much care
| for signal integrity, at least over short distances
| (unlike what you might expect of a high-speed bus like
| that) :)
|
| Not literally wet string, but definitely low tech. ADSL
| is special though, not many technologies can _literally_
| run over wet string :-)
| vanderZwan wrote:
| Well, then we could make it a scale of what the worst
| transmission medium is that two pieces of hardware can
| (sort of) communicate across :)
| cameron_b wrote:
| off topic, but I commend the article if only for the
| conclusion
| PragmaticPulp wrote:
| > Intel is the reason we don't have ECC RAM on desktops.
|
| Intel has offered ECC support in a lot of their low-end i3
| parts for a long time. They're popular for budget server
| builds for this reason.
|
| The real reason people don't use ECC is because they don't
| like paying extra for consumer builds. That's all. ECC
| requires more chips, more traces, and more expense. Consumers
| can't tell if there's a benefit, so they skip it.
|
| > AMD supports ECC on their consumer chips, but without Intel
| support it's never taken off
|
| You're blaming Intel's CPU lineup for people not using ECC
| RAM on their AMD builds?
|
| Let's be honest: People aren't interested in ECC RAM for the
| average build. I use ECC in my servers and workstations, but
| I also accept that I'm not the norm.
| makomk wrote:
| As far as I can tell, Intel only offered ECC on a small
| handful of i3 parts that mainly seemed to be marketed to
| NAS manufacturers, likely because they were otherwise
| giving up that market entirely to competitors like AMD.
| They really don't seem to be interested in offering it as
| an option on consumer desktops.
| temac wrote:
| They did support ECC on some i3s simply because they did not
| bother to double the SKUs; however, IIRC you need the server
| / WS chipset to enable it, at which point you might as well
| put an entry-level Xeon on that.
|
| In absolute terms, the cost of ECC everywhere would not be
| substantially greater than the prices we have now without it.
| The current ECC prices are high because it is not broadly
| used, and not really the inverse. Consumers skip it because
| it is fucking hard to get ECC-enabled parts for S SKUs (or H
| / U) in the current situation, while there are plenty of
| non-ECC vendors and resellers, and something like at least
| 3 times the number of SKUs. And consumers have not been
| informed they are buying unreliable shit.
| marcan_42 wrote:
| > You're blaming Intel's CPU lineup for people not using
| ECC RAM on their AMD builds?
|
| I'm blaming the decade+ of Intel dominance for killing any
| chance of ECC becoming popular in non-server environments,
| just as RAM density was reaching the point where it is
| absolutely essential for reliability.
|
| > The real reason people don't use ECC is because they
| don't like paying extra for consumer builds. That's all.
| ECC requires more chips, more traces, and more expense.
| Consumers can't tell if there's a benefit, so they skip it.
|
| Motherboard traces are ~free and the feature is in the die
| already, so it requires zero expense to _offer_ it to
| consumers. Intel chose to artificially cripple their chips
| to remove that option. Yes, I know there are a few oddball
| lines where they did offer it. They should have offered it
| across the board from the get go, seeing as they were
| selling the same dies with ECC for workstation use.
| Karunamon wrote:
| ECC _memory_ on the other hand is always going to be more
| expensive.
| marcan_42 wrote:
| Indeed, which is why it should be an option.
|
| OTOH, it shouldn't be _significantly_ more expensive. It
| should be ~9/8 the cost of regular memory. It's just one
| extra chip for every 8. Nothing more.
| namibj wrote:
| Actually less, because you only need the additional
| memory chip and associated trace layout, not any
| additional PCB manufacturing cost (beyond the minuscule
| yield impact of the additional traces) and no significant
| added distribution cost (packaging, shipping weight, etc.).
| my123 wrote:
| in-band ECC is also a thing. In that scenario, you give
| up some capacity for the ECC bits but stay with the same
| DRAM config as before.
|
| (in-band ECC is present on Elkhart Lake Atoms and on
| Tegra Xavier for example)
| PragmaticPulp wrote:
| > I'm blaming the decade+ of Intel dominance for killing
| any chance of ECC becoming popular in non-server
| environments
|
| I disagree. AMD has offered ECC support for a while and
| it's not catching on. It doesn't make sense to blame this
| on Intel.
|
| > Motherboard traces are ~free and the feature is in the
| die already, so it requires zero expense to offer it to
| consumers.
|
| Yet it's missing from a substantial number of AMD boards,
| despite being supported. You have to specifically confirm
| the motherboard added those traces before buying it.
|
| Traces aren't entirely free. Modern boards are densely
| packed and manufacturers aren't interested in spending
| extra time on routing for a feature that consumers aren't
| interested in anyway.
| marcan_42 wrote:
| > Traces aren't entirely free. Modern boards are densely
| packed and manufacturers aren't interested in spending
| extra time on routing for a feature that consumers aren't
| interested in anyway.
|
| Or they just don't care because it's not already popular
| and unbuffered ECC RAM isn't even particularly widely
| available. The delta design cost of routing another 8
| data lines per DIMM channel is tiny. Especially on ATX
| boards and other larger formats. I could see some crazy
| packed mini-ITX layout where this might be a bit harder,
| but definitely not in the normal cases.
|
| (I've routed a rather dense 4-layer BGA credit card sized
| board; not exactly a motherboard, but I do have a bit of
| experience with this subject. It was definitely denser
| than a typical ATX board per layer.)
| simoncion wrote:
| > ...unbuffered ECC RAM isn't even particularly widely
| available.
|
| Every time I've gone looking for unbuffered ECC RAM over
| the past three or five years, I've had no trouble finding
| it. In my experience, the trick is to shop for "server"
| RAM, rather than "desktop" RAM.
|
| Are there speeds or capacities here that you'd
| particularly like to see that aren't present?
| <https://nemixram.com/server-memory/ecc-udimm/>
| rasz wrote:
| >You're blaming Intel's CPU lineup for people not using ECC
| RAM on their AMD builds?
|
| Yes. ECC was standard on the first IBM PC 5150, on the PS/2
| line, on pretty much all 286 clones, etc. Intel killed ECC on
| the desktop when moving to the Pentium; prior to that all of
| their chipset products (486) supported it. 1995 artificial
| market segmentation shenanigans:
| https://www.pctechguide.com/chipsets/intels-triton-
| chipsets-...
| formerly_proven wrote:
| > Intel has offered ECC support in a lot of their low-end
| i3 parts for a long time. They're popular for budget server
| builds for this reason.
|
| Intel removed ECC support in the 10th gen so you have to go
| for Xeon nowadays.
| jeffbee wrote:
| With DDR5 you can have (a form of) ECC on all current
| 12th-generation Core CPUs. That is, if you were able to
| find DDR5 DIMMs on the market, which you currently
| cannot.
| temac wrote:
| Not really: internal ECC in DDR5 is an implementation
| detail that is neither exposed on the bus nor gives you
| the real reliability and monitoring capability that real
| ECC terminated in the memory controller does. It is only
| there because the error rate would be absolutely horrific
| without it; you need internal ECC just to get to basically
| the same point you were at without ECC on DDR4.
| philistine wrote:
| Since the Mac Pro has ECC RAM, I would expect a future Apple
| Silicon Mac Pro to offer it as well with its desktop M1 chip,
| with the functionality trickling down the line in years to
| come.
| freemint wrote:
| DDR5 includes a form of ECC, and DDR5 is only supported on
| Intel so far.
| wtallis wrote:
| The DDR5 memory bus used by Intel's latest consumer
| processors does not have ECC enabled. The memory dies
| themselves have some internal ECC that is not exposed to
| the host system and is not related to the fact that they
| use a DDR5 interface; all state of the art DRAM now needs
| on-die ECC due to the high density.
| freemint wrote:
| So what? It has on-die ECC, which allows it to recover from
| radiation-induced bitflips and the like. Maybe to compensate
| for density the error correction is kept busier and can
| correct fewer errors per minute, but "0.5 ECC" instead of
| full ECC on DDR4 (which had no random errors due to density)
| is still an improvement for most people in terms of immunity
| to unlucky cosmic rays.
| bluedino wrote:
| > Don't forget it's not just instruction sets; Intel is the
| reason we don't have ECC RAM on desktops.
|
| Of course we do: workstations.
|
| It's cheaper, that's why it isn't everywhere.
| marcan_42 wrote:
| Intel's lower end workstation chips are the same silicon,
| and thus the same manufacturing cost, as their desktop
| chips. They just artificially disable features like ECC for
| product segmentation. It is unconscionable that something
| as essential as ECC is crippled out of the consumer line-
| up.
| bluedino wrote:
| Except that the memory chips and motherboards also need
| to support ECC
| marcan_42 wrote:
| ECC costs $0 to support in motherboards (8 extra traces
| per DIMM slot; traces are free). Memory is where the
| consumer gets to choose whether to spend extra on ECC or
| not. There is absolutely no reason why consumers should
| be forced to pay extra for a CPU to get ECC when they are
| literally getting the same piece of silicon.
| jeffbee wrote:
| It's odd how indifferent you are being about the energy
| costs of ECC. Memory now dominates the energy story of
| many systems. Filling an x86 cache line from DDR4 costs
| 1000x as much energy as a double-precision multiplication
| operation. ECC memory costs 12.5% more energy. That's a
| big, big difference.
| marcan_42 wrote:
| I'm not saying everyone should use ECC, I'm saying ECC
| should be an option for everyone.
| wmf wrote:
| Just don't investigate AM4 motherboard compatibility and AGESA
| revisions... Absolutely no artificial segmentation here, no
| sir.
| PragmaticPulp wrote:
| > I always disliked Intel's strategy for market segmentation by
| disabling specific instructions,
|
| The efficiency cores don't have AVX-512 because they're low
| power, simplified cores.
|
| This mod required disabling the efficiency cores, reducing core
| count anyway. It wasn't actually a free upgrade.
|
| > I much prefer AMD's strategy of segmenting just by speed and
| number of cores
|
| I've been using ECC RAM with AMD consumer parts, but it's not
| all smooth sailing. ECC in their consumer parts isn't
| officially supported, so it has been extremely difficult
| to determine if it's even working at all. There are long forum
| threads where many people have tried to confirm it with mixed
| results. AMD may unofficially leave these features in place,
| but it turns out unofficial support isn't all that
| straightforward.
| marcan_42 wrote:
| It's not "unofficial" support, it's just the
| motherboards/OEMs don't bother to implement it or implement
| it in an opaque way. All the motherboards have to do to
| support this properly is have the extra RAM traces and an
| easy setup menu info line that tells you whether ECC is on or
| not. The CPU supports it fine.
|
| If you run Linux, you can use dmidecode and the EDAC driver
| to confirm if you have ECC enabled.
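| As a rough illustration of the EDAC part, here is a minimal
| sketch (Linux-only, assuming an EDAC driver for your memory
| controller is loaded) that prints the error counters the
| kernel exposes; if the directory is missing entirely, ECC
| reporting is almost certainly not active:
|
|   # check_edac.py - summarize EDAC memory controllers and their
|   # corrected / uncorrected error counts
|   from pathlib import Path
|
|   EDAC = Path("/sys/devices/system/edac/mc")
|
|   if not EDAC.is_dir():
|       print("no EDAC memory controllers found (ECC likely off)")
|   else:
|       for mc in sorted(EDAC.glob("mc*")):
|           name = (mc / "mc_name").read_text().strip()
|           ce = (mc / "ce_count").read_text().strip()
|           ue = (mc / "ue_count").read_text().strip()
|           print(f"{mc.name}: {name}, corrected={ce}, "
|                 f"uncorrected={ue}")
|
| dmidecode -t memory (as root) additionally shows the module
| widths; a total width of 72 bits against a data width of 64 is
| the usual sign of ECC DIMMs.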
| Fnoord wrote:
| Nvidia does the same by disabling passthrough on consumer GPUs
| (gotta upsell Quadro).
|
| However, I would say avoiding Intel completely is unnecessary.
| They're a good FOSS citizen when it comes to things like WLAN
| and GPU.
|
| It's just that with AMD, you can hardly go wrong on the CPU
| and GPU side. Especially on Linux.
| jacquesm wrote:
| Yes, but with NVidia at least they get it right and do it
| prior to benchmarking and selling into the retail channel,
| rather than the other way around.
|
| And they - rightly - get plenty of flak for it.
| Fnoord wrote:
| Nvidia is great for Windows.
|
| As soon as you want to use Linux or macOS or Proxmox or
| anything else, prepare for trouble. It's just not a FOSS-
| friendly company.
|
| Even on the Nvidia Shield TV (a great bang for the buck)
| they added commercials in the new UI. Though one might
| attribute that to Google, it is the only Nvidia device I
| bought in the past 5 years. And the only one which got a
| feature downgrade (or an upgrade with a feature I don't
| want/like).
|
| With regards to Intel, I remember my 4770K not having a
| specific feature (IIRC for hardware virtualization) which
| all non-K series had. And I found that out too late.
| jacquesm wrote:
| I've been using Nvidia cards for years on Linux and never
| had a single problem.
| eulers_secret wrote:
| I've used Nvidia on Linux since owning a 670. Their
| proprietary driver works only OK.
|
| You don't get flipped off by Linus T for having good
| drivers. You also don't have to wait _years_ for wayland
| to work if your driver is good.
| 1_player wrote:
| They didn't get flipped off for having bad drivers. They
| got flipped off for having closed source off-tree
| drivers.
|
| The NVIDIA driver on Linux is extremely good and in my
| experience of better quality than AMD, but sorely lacking
| in modern features such as Wayland support. We still
| don't have decent hardware encoding nor raytracing
| support on AMD, the champions of free software, while
| DLSS and NVENC have been running on Linux for a while
| now.
|
| I run AMD GPUs now but I'm quite done with the NVIDIA
| hate boner the Linux world has.
| jacquesm wrote:
| Ok. On my end the first one I had that could do fast
| compute was a GTX 280, then a 590, then a 1080ti which I
| still have.
|
| Maybe I've just been lucky but over all that time not a
| single glitch. The main gripe I have with them is having
| you jump through hoops to be allowed to download certain
| libraries. That's just a complete nuisance. And I think
| their TOS are unethical.
| hetspookjee wrote:
| haha it's staggering how big the presence of NVIDIA is in
| the deep learning community, yet after all these years
| it's still a painful process to get all the drivers up
| and running in the correct way. Their documentation
| surrounding CUDA and cuDNN is often out of date and/or
| incomplete. Staggering.
| jacquesm wrote:
| Interesting. I never had any issues, just download the
| driver, compile, install and done.
|
| It even works with multi-screen setups and using the rest
| of the memory for CUDA.
| tymscar wrote:
| Not anymore! Nvidia gpus can be passed through just fine with
| no fixes. I use mine like that daily
| MikeKusold wrote:
| Are you doing a full pass through or vGPU?
| tymscar wrote:
| Full pass through. Been doing it for years. I have a
| youtube video on how I did it on my channel. Same
| username
| jacobmartin wrote:
| Not the parent that you replied to, but I use full PCI
| passthrough for my Nvidia (RTX 3060) and it works just
| fine. I had no idea Nvidia had done this before or else I
| would've seriously reconsidered my choices for this
| build... Nevertheless it seems to work just fine.
| flatiron wrote:
| I'm waiting for amd to do TB4 so I can get a ryzen framework
| laptop.
| Fnoord wrote:
| Yeah, AMD Ryzen Framework laptop would be amazing. (Would
| also have ECC support?)
|
| They'd have to license TB from Intel. I don't know about
| TB4, but TB3 is royalty-free, just gotta pass a
| certification.
|
| TB3 or better would be great for eGPUs in general. Not sure
| I personally would need TB4.
|
| Also, I wonder if Framework would be usable with say
| Proxmox and macOS (or 'native' Hackintosh?)
| fsflover wrote:
| > AMD Ryzen Framework laptop would be amazing
|
| Except you cannot disable AMD PSP.
| Fnoord wrote:
| Let's compare it to all the other options, shall we?
| Perfect is the enemy of good, and your uncompromising way
| of doing things (including not loading microcode 'because
| it is closed source') is, to put it simply, harmful. And
| unusual as well (it's not a dealbreaker for many people).
| The laptop is modular and repairable, just like a
| Fairphone is modular and repairable. It's a great step
| forward and, while not perfect and possibly a reason not
| to buy it, a minor detail for most people. If we followed
| the FSF's principles we would not be able to load
| microcode to fix Spectre 'because it is proprietary', like
| Trisquel apparently does. Meanwhile, the Librem 5 delivers
| awful performance, and the Pinephone offers a 30-day
| warranty. Both have killswitches though: great feature.
| Again, perfect is the enemy of good. Because if
| killswitches are a requirement for you, then these are
| about your two options. If a modular laptop is a
| requirement for you, you are bound to what? Old Thinkpads?
| fsflover wrote:
| > your uncompromising way of doing things (including not
| loading microcode 'because it is closed source')
|
| Libreboot disagrees with that:
| https://lists.gnu.org/archive/html/libreplanet-
| discuss/2022-....
|
| Also, FSF would never endorse an Intel CPU with disabled
| Intel ME. It must be fully removed to get the RYF
| certification.
| marcan_42 wrote:
| The FSF endorse ThinkPads with two or three chips running
| secret firmware blobs with full access to system memory
| via the LPC bus. They also endorse Bluetooth dongles
| running hundreds of kilobytes of proprietary firmware
| blob (which doesn't count for them because it's in ROM,
| not /lib/firmware).
|
| RYF certification is absolutely meaningless, both from
| the freedom and security/privacy perspectives. It is
| actually actively harmful to freedom, as it encourages
| manufacturers to hide blobs to get through its backdoors
| (ROM blobs are OK, RAM blobs are not), when doing the
| opposite is actually more accountable and open, and
| allows for user-controlled replacement of blobs with free
| versions in the future.
|
| Also, Stallman had personally said he wouldn't give the
| Novena open hardware laptop RYF certification unless they
| permanently fused off the GPU, "because otherwise users
| might be tempted to install the (optional, not
| distributed with the product or endorsed in any way) GPU
| blobs". Literally the same product segmentation nonsense
| we're bashing Intel for here. This was before open
| drivers became available, which they eventually did.
| Imagine how wrong the situation would've been if bunnie,
| the creator of Novena, had actually listened to this;
| "RYF" certified Novena owners would own crippled machines
| unable to use 3D acceleration, while the rest would own
| machines capable of running a 100% libre desktop
| including 3D acceleration.
|
| > Libreboot disagrees with that
|
| You do realize that that post is Leah politely saying
| that RYF sucks and is completely broken, and she's given
| up on strict compliance, right? FSF policy is indeed that
| you shouldn't upgrade your microcode (see e.g. endorsing
| linux-libre, which censors kernel messages telling you
| about important microcode updates).
| fsflover wrote:
| As my link already says, the FSF can and should be
| improved. However, I think there is a point in separating
| "hardware" from "software", where the former is not to be
| updated. If it can be updated by anyone, in any way, it's
| not hardware.
| marcan_42 wrote:
| If it can be updated by the user, even if the source code
| is not available, it is freer than if it cannot. Because
| then users can actually see the code, reverse engineer
| it, audit it, know they are running the exact version
| they expect, and potentially replace it with a free one.
|
| This concept that "if it's in ROM and cannot be updated
| it's not software, it's hardware" is asinine, against
| actual practical freedom for users, and also a net
| security negative. This whole rhetoric that updatability
| matters, or that somehow lack of source code means "only
| the manufacturer can update it" and that somehow "makes
| things less free for users" needs to stop. Users are
| still in control of updates, the thing isn't magically
| phoning home (corner cases of stuff with network access
| notwithstanding). Having access to the blob is a net
| positive on all fronts for users. This whole policy keeps
| trying to use this excuse as a rationale, but the reality
| is the only thing it achieves is convincing people that
| they aren't running blobs at all by condoning devices
| where the blobs aren't evident to users because they're
| not in their filesystem.
|
| This was all brought to its logical extreme of silliness
| with the Librem 5, which actively engineered an
| obfuscation mechanism for their RAM training blob to put
| itself in compliance with RYF, for absolutely no benefit
| to users: it's still running the same blob as it would've
| otherwise, and it's still updatable (you can even ignore
| the entire obfuscation and just flash your own bootloader
| that does it the normal way).
| fsflover wrote:
| > If it can be updated by the user, even if the source
| code is not available, it is freer than if it cannot.
|
| So we both say the same thing with different words. I
| agree, which is why there is a call to FSF for change.
|
| Concerning the Librem 5, if the proprietary blob can be
| isolated such that it can't access RAM or CPU, it's
| better for the user and makes the device more free, in my
| opinion.
| marcan_42 wrote:
| > So we both say the same thing with different words. I
| agree, which is why there is a call to FSF for change.
|
| I thought you were saying that hardware is whatever
| "cannot be updated". That's the argument the FSF uses to
| say ROM firmware is OK because it's not software. I'm
| saying that's not okay, because updatability is a _plus_,
| not a _minus_, and ROMs are still software.
|
| > Concerning the Librem 5, if the proprietary blob can be
| isolated such that it can't access RAM or CPU, it's
| better for the user and makes the device more free, in my
| opinion.
|
| The obfuscation in the Librem 5 did absolutely nothing to
| isolate the proprietary blob in any way, shape, or form.
| It would always run on a dedicated CPU, from the get-go
| (and that CPU is part of the RAM controller, so it is a
| security risk for the entire system either way). They
| added a _third_ CPU in the loop to _load_ the blob,
| because somehow touching the blob from the main CPU gives
| it the digital equivalent of cooties in the FSF's view,
| but adding this extra step of indirection makes it all
| OK. And then they put the blob in a separate Flash memory
| so it wouldn't live in the same flash as the main open
| firmware, because that somehow helps freedom too?
| Seriously, that whole story is just utterly stupid no
| matter which way you look at it.
| fsflover wrote:
| > I thought you were saying that hardware is whatever
| "cannot be updated".
|
| Yes, I'm saying that. But if you intentionally prevent
| users from updating it, then it does not turn software
| into hardware in my opinion.
|
| Do you have any link saying that the blobs are still
| being executed on the main CPU after the RAM training is
| finished? Upd: you replied here:
| https://news.ycombinator.com/item?id=29842166.
| zajio1am wrote:
| > This concept that "if it's in ROM and cannot be updated
| it's not software, it's hardware" is asinine, against
| actual practical freedom for users, and also a net
| security negative.
|
| There is a distinction between blob 'is in ROM and cannot
| be updated', 'is in EPROM and might be updated', and 'it
| is in RAM and must be provided at boot'.
|
| While one can argue about the second case, the third case
| is problematic for practical and legal reasons, as
| handling it (using and distributing) requires accepting
| the licence of the firmware, which affects the
| distribution infrastructure of free Linux distributions.
| Some firmware also does not allow redistribution, so it
| has to be downloaded from the vendor's website, which
| further complicates practical and legal matters and has
| privacy issues.
|
| The second case (it is in EPROM and might be updated)
| does not have such an effect directly, but leads to it
| indirectly, by allowing vendors to depend on cheap post-
| purchase fixes via firmware updates, so they can offer
| less tested products where a firmware update is
| practically necessary because the original firmware is
| buggy, essentially moving to the third case.
| Fnoord wrote:
| ..and still, a Fairphone is more free than almost every
| smartphone out there. And still, a Framework is more free
| than almost every laptop out there. Because they are
| easily repairable thanks to their modularity. Something,
| yes, ThinkPads used to be too. But you're bound to old,
| refurbished ones (X230/T430 apparently) [1]. You don't
| care about that, you only care about one thing, and the
| rest is seemingly rationalized as irrelevant. Again,
| perfect is the enemy of good, akin to release early,
| release often.
|
| [1] https://www.qubes-os.org/doc/certified-hardware/
| fsflover wrote:
| > Fairphone is more free than almost every smartphone out
| there
|
| Except it relies on proprietary drivers, which will not
| be updated by the vendor, resulting in a brick after some
| years. Librem 5 and Pinephone will receive software
| updates forever.
|
| > But you're bound to old, refurbished ones (X230/T430
| apparently) [1].
|
| Not necessarily: https://forum.qubes-os.org/t/community-
| recommended-computers....
|
| > Librem 5 delivers awful performance
|
| What do you mean? It can run 3D games and provides full
| desktop mode. What else do you need?
| marcan_42 wrote:
| > Librem 5
|
| Ah yes, that phone where the FSF told them they had to
| move a blob from the bootloader into an external Flash
| memory and load it through two layers of CPUs, because
| going through that pointless dance magically makes it
| Free(tm) (read: hidden enough that users won't notice so
| they won't realize they're still running a blob as part
| of something as critical as making the RAM work).
|
| Also that phone which runs a pile of other giant blobs,
| including the USB-PD controller and the baseband, of
| course.
| fsflover wrote:
| The point is that the proprietary software has no access
| to the RAM or CPU and plays absolutely no role whatsoever
| in the device usage. I personally agree that it can be
| called "hardware" and don't care that it has another CPU.
|
| The baseband is on the upgradable M.2 card, also has no
| access to anything. It can even be killed with a hardware
| switch. The best smartphone you can find if you care
| about it. Nobody says that other blobs are fine, but it's
| already a _huge_ step toward freedom.
| marcan_42 wrote:
| > The point is that the proprietary software has no
| access to the RAM or CPU and plays absolutely no role
| whatsoever in the device usage.
|
| The proprietary software _literally configures the RAM on
| the phone_. It is critical for making the RAM work, of
| course it has access to the RAM! Supposedly it should be
| quiesced after training, but I haven't seen any security
| analysis that claims that firmware couldn't just take
| over the system while it runs.
|
| But they added an extra two layers of indirection, even
| though the blob ends up running on the same CPU with the
| same privileges in the end anyway, because all that
| obfuscation let them get in via the FSF's "secondary
| processor" exception somehow. Even though the end result
| is the same, and you're still running a blob to perform a
| critical, security-relevant task.
|
| If the goal is security ("blobs can't take over my
| system") and stuff running during the boot process
| doesn't count, then Apple's M1 machines are on precisely
| the same level as the Librem 5: they also run blobs on
| boot, and at runtime all remaining blobs on separate CPUs
| are sandboxed such that they can't take over the main
| RAM/CPU.
| fsflover wrote:
| You are of course right that technically it has the
| access to the RAM.
|
| > Supposedly it should be quiesced after training, but I
| haven't seen any security analysis that claims that
| firmware couldn't just take over the system while it
| runs.
|
| I was under the impression that it was the whole point of the
| exercise. It would be interesting to know otherwise.
|
| > even though the blob ends up running on the same CPU
| with the same privileges in the end anyway
|
| This is not how I understood it. _The Librem 5 stores
| these binary blobs on a separate Winbond W25Q16JVUXIM TR
| SPI NOR Flash chip and it is executed by U-Boot on the
| separate Cortex-M4F core._ From here:
| https://source.puri.sm/Librem5/community-
| wiki/-/wikis/Freque....
| marcan_42 wrote:
| > I was under impression that it was the whole point of
| the exercise. It would be interesting to know otherwise.
|
| It absolutely wasn't. Look into it. In every case, the
| blob ends up running on the RAM controller CPU and
| supposedly finishes running and is done. The whole point
| of the exercise was obfuscating the process which is used
| to _get_ to that point such that it avoided the main CPU
| physically moving the bits of the blob from point A to
| point B. Really.
|
| > This is not how I understood it. _The Librem 5 stores
| these binary blobs on a separate Winbond W25Q16JVUXIM TR
| SPI NOR Flash chip and it is executed by U-Boot on the
| separate Cortex-M4F core._
|
| That is incorrect (great, now they either don't know how
| their own phone works or they're lying - see what I said
| about obfuscation? It's great for confusing everyone).
|
| The M4 core code is not proprietary; it's the pointless
| indirection layer they wrote and it is not loaded from
| that SPI NOR flash. It's right here:
|
| https://source.puri.sm/Librem5/Cortex_M4/-/tree/master
|
| _That_ open source code, which is loaded by the main CPU
| into the M4 core, is responsible for loading the RAM
| training blob from SPI flash (see spi.c) and into the DDR
| controller (see ddr_loader.c).
|
| The actual blob then runs on the PMU ("PHY Micro-
| Controller Unit") inside the DDR controller. This is an
| ARC core that is part of the Synopsys DesignWare DDR PHY
| IP core that NXP licensed for their SoC. Here, cpu_rec.py
| will tell you:
| firmware/ddr/synopsys/lpddr4_pmu_train_2d_imem.bin
| full(0x5ac0) ARcompact
| chunk(0x4e00;39) ARcompact
|
| The normal way this is done is the DDR training blob is
| just embedded into the bootloader like any other data,
| and the bootloader loads it into the PMU. Same exact end
| result, minus involving a Cortex-M4 core for no reason
| and minus sticking the blob in external flash for no
| reason. Here, this is how U-Boot does it on every other
| platform:
|
| https://github.com/u-boot/u-boot/blob/master/drivers/ddr/
| imx...
|
| Same code, just running on the main CPU because it is
| absolutely pointless running it on another core, unless
| you're trying to obfuscate things to appease the FSF. And
| then the blob gets appended to the U-Boot image post-
| build (remember this just gets loaded into the PMU, it
| never touches the main CPU's execution pipeline):
|
| https://github.com/u-boot/u-boot/blob/master/tools/imx8m_
| ima...
|
| Purism went out of their way and wasted a ton of
| engineering hours just to create a more convoluted
| process with precisely the same end result, because
| somehow all these extra layers of obfuscation made the
| blob not a blob any more in the FSF's eyes.
|
| The security question here is whether that blob, during
| execution, is in a position to take over the system,
| either immediately or somehow causing itself to remain
| executing. Can it only talk to the RAM or can it issue
| arbitrary bus transactions to other peripherals? Can it
| control its own run bit or can the main CPU always
| quiesce it? Can it claim to be "done" while continuing to
| run? Can it misconfigure the RAM to somehow cause
| corruption that allows it to take over the system? I have
| seen no security analysis to this effect from anyone
| involved, because as far as I can tell nobody involved
| cares about security; the whole purpose of this exercise
| obviously wasn't security, it was backdooring the system
| into RYF compliance.
| seba_dos1 wrote:
| > great, now they either don't know how their own phone
|
| Who's "they"? This is an unofficial community wiki.
| salawat wrote:
| ...I'm more curious why firmware is even closed source.
| Seems to me, just releasing the assembler, documentation
| and a datasheet should be basic courtesy.
|
| Then again everything I think that way about is
| apparently a mortal sin in the business world.
| loup-vaillant wrote:
| I hear hardware vendors are scared shitless of running
| afoul of other hardware vendors' patents. If they
| released their firmware, this would give other folks
| ammunition to attack them in court.
|
| Personally, I don't care much about firmware being
| proprietary, _if it cannot be used to meaningfully change
| the functionality of the chip_. But I do care about the
| chip ultimately following a public specification. I want
| an ISA just like x86, ARM, or RISC-V. Tell me the size
| and format of the ring buffer I must write to or read
| from, and I'll program the rest.
|
| But even that is still too much to ask for apparently.
| Graphics cards are getting closer with lower level
| drivers like DX12 and Vulkan, but we're not quite there
| yet.
| salawat wrote:
| Unfortunately, from my understanding, hardware innovation
| seems to be on a trajectory where most functionality is
| "soft", so fundamental changes can be wrought through
| microcode. Just look at Nvidia's use of FALCONs.
| Hardware manufacturers seemingly WANT to be able to use
| the same piece of hardware for different jobs, and having
| blobs tooled to reconfigure it is the way to go.
|
| My biggest problem is that figuring this state of affairs
| out is like pulling teeth.
| flatiron wrote:
| if you are getting a framework to run a hackintosh why
| would you not get a macbook air? prolly cheaper in the
| long run and the M1 is a pretty great chip.
|
| i just can't wrap my hands around macOS and I used to
| work at apple. I'm just so used to linux for everything.
| philliphaydon wrote:
| The ryzen 6000 series for laptops will support usb 4 with
| tb3.
|
| What does tb4 offer over 3?
| lsaferite wrote:
| More guaranteed PCIe (32Gbps) bandwidth. Mandated wake
| from sleep over TB4 while closed. DMA Protection.
|
| It's mostly the same though and devices should be cross-
| compatible.
|
| Primarily you want TB4 for the stricter requirements for
| certification.
| stransky wrote:
| longer cables
| neogodless wrote:
| https://plugable.com/blogs/news/what-s-the-difference-
| betwee...
|
| The short answer for Thunderbolt 4 over Thunderbolt 3 is
| "a second 4K display" thanks to double the PCIe
| bandwidth, 16Gbps vs 32Gbps. (The main data channel is
| 40Gbps in both.)
|
| USB 4 is actually a bit more constrained than Thunderbolt
| 3, at 20Gbps data channel, and no guarantee of PCIe
| bandwidth.
|
| > Thunderbolt 4 guarantees support for one 8K display or
| two 4K displays. A USB4 port can only support one display
| with no mention of resolution minimums. And even that
| isn't required as some USB4 ports will not support video
| at all.
| philliphaydon wrote:
| So it sounds like TB3 + USB4 would be a good docking
| station, and TB4 just makes it a bit better. So I
| shouldn't be too worried if I upgrade my laptop and it
| only has TB3.
|
| Finding it hard to keep up with tech lately :D
| Fnoord wrote:
| Right now I use a machine with USB-C for video out to one
| 1080p monitor. But I don't use an eGPU with that, and
| it's officially an Android device.
|
| I've got 2x 1440p monitors and won't upgrade either any
| time soon. So I suppose if I were to use both of these,
| I'd be fine with TB3 and an eGPU?
| rasz wrote:
| >Its just that with AMD
|
| - forbidding motherboard manufacturers from implementing
| backward compatibility
|
| - PCIe 4 and 'PCI Express Resizable BAR' CPU features linked
| to the price of the northbridge
| thejosh wrote:
| Didn't they turn this on last April for all cards?
| codeflo wrote:
| If all of this was planned (and I'm not saying it is), that would
| have been very clever. It would work like this:
|
| 1. You "accidentally forget" to disable the feature in hardware.
| Given how competitive the market is, motherboard manufacturers
| can be relied upon to enable it in their "1337 OVERCLOCKZ" mode.
| No conspiracy is needed.
|
| 2. You "tolerate" this practice just long enough to win the
| critical initial wave of reviews and benchmarks. Interested
| buyers will look at those charts for years to come.
|
| 3. And when you finally do turn the feature off to protect the
| market for your server chips, you can plausibly claim that you
| had explicitly forbidden using this configuration from the very
| beginning. None of this is your fault.
|
| .
|
| Edit/Addendum: Just to clarify my actual opinion, of course an
| honest mistake is more likely, at least for step 1. Very few
| "evil schemes" exist in reality because people aren't all that
| clever (and all that evil). But the possibility is interesting to
| speculate on.
| PragmaticPulp wrote:
| > 2. You "tolerate" this practice just long enough to win the
| critical initial wave of reviews and benchmarks. Interested
| buyers will look at those charts for years to come.
|
| That's not what happened here.
|
| The AVX-512 instructions not only weren't enabled by default,
| they couldn't be enabled at all unless you went out of your way
| to disable the efficiency cores completely. They also wouldn't
| benefit your workload unless disabling those extra cores was
| offset by the AVX-512 instructions on the remaining cores.
|
| None of the benchmarks you saw in reviews or marketing material
| would have used these instructions unless specifically called
| out by the reviewers as having made all of these changes.
|
| Benchmarks like the multi-core Geekbench would actually go
| _down_, not up, with this enabled because you're giving up
| cores. Thermal performance would be worse because the
| efficiency cores were disabled.
|
| Intel never marketed the part with AVX-512. It was discovered
| by a reviewer poking around in the BIOS of a review board.
|
| > motherboard manufacturers can be relied upon to enable it in
| their "1337 OVERCLOCKZ" mode. No conspiracy is needed.
|
| Nope. You had to disable cores to enable AVX-512 and I doubt it
| would show up in any gaming or consumer benchmarks as a
| positive.
|
| The conspiracy theory about Intel doing this to mislead
| consumers not only doesn't make sense, it's completely wrong
| given how this worked and how it was discovered.
| im3w1l wrote:
| Comparing the facts to all the suspicion here, it
| made me realize something. Our zeitgeist sure is a cynical
| one.
| jacquesm wrote:
| Let's see: Intel
|
| - ME https://news.ycombinator.com/item?id=21534199
|
| - Anti Trust:
| https://www.networkworld.com/article/2239461/intel-and-
| antit...
|
| - The cripple AMD function in their compiler:
| https://www.agner.org/optimize/blog/read.php?i=49
|
| I could keep this up quite a bit longer if you want. Intel
| well deserves any skepticism and cynicism it is targeted
| with.
| slibhb wrote:
| > I could keep this up quite a bit longer if you want.
| Intel well deserves any skepticism and cynicism it is
| targeted with.
|
| Intel deserves skepticism and cynicism based on what is
| true.
| mhh__ wrote:
| All companies deserve skepticism, to be clear. AMD have
| their management engine too. As far as I'm concerned the
| only reason why AMD's ME isn't as developed as Intel's is
| because their products have been such a joke leading up
| to Zen that no one with men in suits bothered asking them
| for the capabilites, or something like that.
| yjftsjthsd-h wrote:
| Intel has spent decades accumulating its reputation.
| Sometimes cynicism is appropriate.
| duxup wrote:
| Humans seem prone to find meaning and conspiracy amidst
| chaos and confusing circumstances.
| steveBK123 wrote:
| There's a saying I like - Humans are deterministic
| machines in a probabilistic world.
| hajile wrote:
| > The AVX-512 instructions not only weren't enabled by
| default, they couldn't be enabled at all unless you went out
| of your way to disable the efficiency cores completely. They
| also wouldn't benefit your workload unless disabling those
| extra cores was offset by the AVX-512 instructions on the
| remaining cores.
|
| If your code benefits from AVX-512, it'll probably benefit
| from turning off the efficiency cores too. Sixteen 256-bit
| AVX channels are the same as eight 512-bit channels in
| theoretical throughput. Because there are fewer load/store
| commands and fewer bunches of setup code to run, overall
| theoretical efficiency should be higher.
|
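| A quick back-of-envelope sketch of that equivalence (pure
| arithmetic; reading "channels" as one 256-bit or 512-bit
| vector unit per core and ignoring clock speed and port-count
| differences, which is an assumption):
|
|   # vector_width.py - aggregate vector width in the two configs
|   cores_with_e = 16   # e.g. 8 P-cores + 8 E-cores, 256-bit AVX2
|   cores_p_only = 8    # E-cores disabled, 512-bit AVX-512 enabled
|   print(cores_with_e * 256)   # 4096 bits per cycle
|   print(cores_p_only * 512)   # 4096 bits per cycle, same ceiling
|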
| Power efficiency was a killer at 14nm. The cores would
| downclock into oblivion when AVX-512 executed. Given the node
| shrink and shutting off half the cores, I don't see why this
| would happen here. Doing the same calculations with less
| power means a lot for the kinds of workloads that actually
| use AVX-512. Sure, idle performance may go down a bit, but
| once again, if you're running the kind of application that
| benefits from this, that's also probably not a top
| consideration either.
|
| The real solution would be for Intel to detect the presence
| of AVX-512 instructions then automatically and
| unconditionally pin the thread to the big cores. It wouldn't
| be hard either, just catch the unknown instruction exception
| and see if it is AVX-512 then move the thread.
| PragmaticPulp wrote:
| > If your code benefits from AVX-512, it'll probably
| benefit from turning off the efficiency cores too.
|
| Maybe if you have a poorly parallelized problem where the
| bulk of the code is spent in AVX-512 instructions _and_ you
| don't ever do anything else on the computer, but that's
| going to be rare for any consumer.
|
| > The real solution would be for Intel to detect the
| presence of AVX-512 instructions then automatically and
| unconditionally pin the thread to the big cores. It
| wouldn't be hard either, just catch the unknown instruction
| exception and see if it is AVX-512 then move the thread.
|
| "It wouldn't be hard" is a huge understatement. This is OS-
| level support, not something Intel can just do inside the
| CPU. And operating system vendors and developers are still
| catching up to basic Alder Lake scheduler support without
| AVX-512 handling, so it's not reasonable to suggest this
| would be easy.
| gnufx wrote:
| I don't see how "poorly parallelized" is relevant
| compared with, say, a decent OpenMP BLAS which is likely
| to benefit from any reduced power consumption that allows
| a faster clock. In general on compute nodes it's worth
| turning off cores dynamically which aren't being used by
| jobs, like on a half-full node.
| pantalaimon wrote:
| There have been reports that it was possible to enable
| AVX512 on the big cores with the little ones enabled [0]
| by toggling some bits in the MSR.
|
| For a first step it would have been enough to manually
| pin threads to the big cores for adventurous users, but
| Linux already tracks which threads use AVX (since it comes
| with a clock penalty, so you want to isolate them if
| possible); it's not unreasonable to think that auto-
| setting the CPU mask based on that should be possible too.
|
| [0] https://lkml.org/lkml/2021/12/2/1059
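| For the adventurous-user route, the pinning part can already
| be done from userspace today. A minimal sketch (Linux-only;
| the CPU numbers are an assumption - on a hypothetical 8P+8E
| part the P-core hyperthreads often enumerate as logical CPUs
| 0-15, so check lscpu or /proc/cpuinfo for your machine):
|
|   # pin_to_p_cores.py - restrict this process (and the threads
|   # it spawns later) to the assumed P-core logical CPUs
|   import os
|
|   P_CORE_CPUS = set(range(0, 16))  # assumption, verify locally
|   os.sched_setaffinity(0, P_CORE_CPUS)  # 0 = calling process
|   print("now restricted to:", sorted(os.sched_getaffinity(0)))
|
| The same effect is available externally via taskset -c before
| launching the program.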
| amluto wrote:
| > The real solution would be for Intel to detect the
| presence of AVX-512 instructions then automatically and
| unconditionally pin the thread to the big cores. It
| wouldn't be hard either, just catch the unknown instruction
| exception and see if it is AVX-512 then move the thread.
|
| Having actually worked on this, it's quite a bit more
| complicated, although it is possible. For better or for
| worse, though, Intel made a decision that all cores would
| expose the same feature set, and now this is baked into
| Linux and probably other OSes at the ABI level, and
| changing it would be a mess.
|
| A correct heterogeneous ABI (and frankly a good AVX512 ABI
| at all) would require explicit opt-in, per process, for
| AVX512. Opting in would change the signal frame format, the
| available features, and the set of allowed cores.
| tmccrary55 wrote:
| Boost your benchmarks with AVX-512 by subscribing to
| Intel+
| amluto wrote:
| What benchmarks? Not a lot of real programs actually use
| AVX512, and a lot of the ones that did discovered that it
| made performance _worse_, so they stopped.
|
| (The issue is that AVX512 (the actual 512-bit parts, not
| the associated EVEX and masking extensions) may well be
| excellent for long-running vector-math-heavy usage, but
| the cost of switching AVX512 on and off is extreme, and
| using it for things like memcpy() and strcmp() is pretty
| much always a loss except in silly microbenchmarks.)
|
| To be clear, I don't like the type of product line
| differentiation that Intel does, and I think Intel should
| have supported proper heterogenous ISA systems so that
| AVX512 and related technologies on client systems would
| make sense, but I don't think any of this is nefarious.
| hajile wrote:
| > What benchmarks? Not a lot of real programs actually
| use AVX512, and a lot of the ones that did discovered
| that it made performance worse, so they stopped.
|
| This paints a picture of what happened to Intel's fab
| process rather than AVX-512 itself.
|
| AVX-512 was proposed back in 2013. It was NOT designed
| for desktops. The original designs were for their Phi
| chips (basically turning a bunch of x86 cores into a
| GPU). These Phi chips ran between 1 and 1.5GHz, so power
| consumption and clocks were always matched up.
|
| Intel wanted to move these instructions into their HPC
| CPUs. The problem at hand was ultra-high turbo speeds.
| These speeds work because the heat is a bit spread out on
| the chip and a lot of pieces are disabled at any given
| time. With AVX-512, they had 30% of the core going wide
| open for the vector units (not to mention the added usage
| from saturating the load/store bandwidth).
|
| They wanted 10nm and then 7nm to fix these issues. At
| their predicted schedule, 10nm would have launched in
| 2015 and 7nm in 2017.
|
| Given the introduction of AVX-512 in 2013, they had
| plenty of time. In fact, the first official product was
| Knight's landing in 2016. Skylake with AVX-512 didn't
| launch until 2017 when they were supposed to be on 7nm.
|
| Intel was forced to backport their designs to
| 14nm++++++++++++ which forced a deal with the devil. They
| had to downclock the CPU to keep within thermal limits,
| but this slowed down EVERYTHING. Maybe they could have
| created a separate power plane for AVX, but this would be
| a BIG change (and probably politically infeasible).
|
| What happens with the downclocking? If you run dedicated
| AVX-512 loads, then maybe you should have been looking at
| Phi instead. If not, mixed loads suffered overall because
| of the lower clockspeeds.
|
| Their second revision of 10nm superfin is still a
| generation larger (well, probably a half-generation) than
| what they anticipated. There might still be downclocking,
| but I'd guess that it's nowhere near what previous
| iterations required.
|
| TL;DR -- AVX-512 was launched two nodes too early which
| screwed over performance. It should become acceptable
| either with 10nm Superfin or 7nm when it launches so the
| CPU doesn't have to downclock constantly.
| mhh__ wrote:
| I think the technology will be there to do the mixed
| instruction set stuff in 5 years, but for now I
| understand why Intel have played it safe.
| usefulcat wrote:
| > The real solution would be for Intel to detect the
| presence of AVX-512 instructions then automatically and
| unconditionally pin the thread to the big cores.
|
| Wouldn't it be entirely up to the OS to decide on what
| core(s) a thread will run?
| malkia wrote:
| This somewhat reminds me of the following software optimization
| procedure:
| - Let's thread this thing.
| - OMG, things are working N times faster (where N=cpus).
| - 1 year later: shit... in 0.1% of the cases there is a race
| condition and we can't figure it out, random garbage in our
| data...
| - 1 week later: rolled back to single thread, and now everyone is
| complaining about why things are slow...
| - This actually happened... (just not in the same time frames)
| blackoil wrote:
| > 2. You "tolerate" this practice just long enough to win the
| critical initial wave of reviews and benchmarks. Interested
| buyers will look at those charts for years to come.
|
| The problem with this scheme is that it assumes reviewers will
| go to the length of running a special hidden mode that may have
| value in some fringe cases, and then rave about it.
| [deleted]
| ashtonkem wrote:
| > Very few "evil schemes" exist in reality because people
| aren't all that clever (and all that evil).
|
| Also, groups of people are really bad at keeping secrets. In my
| experience most orgs that do bad things find a way to make
| everyone think that it's either actually a good thing, or that
| someone else is responsible for it. If you make people think
| there's a dark secret, they'll tell eventually.
| tehjoker wrote:
| This makes no sense. There are plenty of secrets and schemes
| out there. Often it's only necessary to keep a secret in the
| present. If it comes out years later, often people just
| ignore it.
| phkahler wrote:
| >> Very few "evil schemes" exist in reality because people
| aren't all that clever
|
| True, but those that try actual evil schemes must be watched
| for repeats. Remember Intel and Rambus?
| tjoff wrote:
| _1. Very few "evil schemes" exist in reality because people
| aren't all that clever (and all that evil)_
|
| Uh, no. That is a technique that has been commonly used over the
| last 15 years.
|
| Many products use different parts across revisions, such as
| different types of display panels. So a very common tactic is to
| only release the version with the superior part at first.
|
| And a few weeks/months later, once all the reviews have been
| written or at least the reviewers have sourced their review
| samples, you start pushing the cheaper version.
|
| Mobile phones, computer displays, TVs etc. suffer from this.
| Perhaps less so nowadays because most of those critical
| components are specified in the spec. sheet.
| Croftengea wrote:
| > just long enough to win the critical initial wave of reviews
| and benchmarks.
|
| Is the average set of tests by, say, LTT or THG influenced by
| presence of these instructions? (not being snarky, I really
| don't know).
| jsheard wrote:
| Even if a reviewer's test suite does benefit from AVX512, it's
| unlikely that their Alder Lake reviews would actually be
| swung by it because AVX512 was always disabled in the
| standard configuration with all of the CPU cores enabled.
| Enabling AVX512 required disabling the efficiency cores
| altogether, and running with just the performance cores.
|
| It might have mattered with the lower end Alder Lake parts
| that only have performance cores, but those haven't been
| released or reviewed yet.
| square_usual wrote:
| Is AVX-512 even useful in benchmarks? I thought it was mostly
| used for specialized workloads, and was basically useless for
| end users and gamers.
| mhh__ wrote:
| Some parts of the game would probably benefit a bit from 512
| (also remember that it isn't just wider, it adds _a lot_ of
| entirely new operations), but there simply isn't enough
| parallelism in the main loop of a game to make all that much
| difference.
|
| Also "useful" and benchmarks do not mix well.
| leeter wrote:
| For games generally those parts can be shifted to the GPU
| because they feed the rendering pipeline anyway. Doing so
| has other benefits of lowering latency for that data to be
| picked up by the graphics bits already there etc. Also the
| clock speed penalty for AVX512 is pretty steep so unless
| you're going to do a lot of AVX512 it really doesn't make
| sense either. Games generally aren't using double precision
| either which is the AVX512 bread and butter. To make it
| more fun even if they were... a 4x4 matrix is composed of
| 256bit elements so the vast majority of register usage
| wouldn't be 512 bit anyway. The main benefit would be the
| new instructions more than the register width.
|
| Long story short: Games generally aren't doing enough AVX
| to really benefit from it as implemented. They can be quite
| bursty in their use. Not to mention that not every CPU even
| has AVX (because apparently Pentium Gold is a thing) even
| today.
|
| If Intel convinced AMD to pick up the 256 bit versions and
| removed the clock speed penalties I could see them getting
| more use. But at the moment it's really just a feature for
| the HFT market primarily and even they are extremely
| careful in use because of the clock speed penalty. To the
| point they will literally benchmark both ways to make sure
| the penalty is overcome by the throughput.
| jeffbee wrote:
| AVX-512 is so useful in benchmarks that sites like Anandtech
| intentionally build their benchmarks without it, because the
| presence of AVX-512 spoils the whole horse race narrative
| they are trying to sell.
| xlazom00 wrote:
| Specialized like software video decoding and encoding,....
| Kletiomdm wrote:
| I don't think that anyone who cares about specific CPU features
| would be unaware that it wasn't supposed to work, and therefore
| they wouldn't risk relying on something that could be removed
| later.
|
| Everyone else would not know that this might have a performance
| impact.
| mackal wrote:
| I wonder if it's more just they had higher yields if they
| treated the AVX-512 bits as dead silicon :P
| formerly_proven wrote:
| I've voiced this before but I think AVX-512 is completely
| irrelevant in the consumer space (which imho makes headlines
| like "Intel artificially slows 12th gen down" hard to take
| seriously). Even for
| commercial applications - sure there are some things that
| benefit from it. On the other hand, I work in HPC and even in
| (commercial, not nuke simulations or whatever the nuke powers
| do with their supercomputers) HPC applications AVX-512 is
| rarely used.
|
| Also... AVX-512 was never announced for 12th gen. No one
| claimed it was there, would work, would be stable or anything
| to that tune. No one should have had an expectation that a 12th
| gen CPU would do AVX-512 (and no one did). Intel even
| explicitly said pre-launch that it won't do AVX-512. Some
| people found post-launch that some BIOSes allow you to turn the
| E-cores off and that causes AVX-512 to be detected. IIRC not
| even the BIOS option claimed anything about AVX-512.
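|
| (For context on the "detected" part: software generally discovers
| AVX-512 at runtime from CPUID rather than from any spec sheet. A
| minimal sketch using the GCC/Clang builtin, ignoring the
| finer-grained AVX-512 sub-features, is:
|
|     #include <stdio.h>
|
|     int main(void)
|     {
|         /* Inspects the CPUID feature bits; with the E-cores off,
|            pre-update Alder Lake BIOSes let this report true for
|            the AVX-512 foundation instructions. */
|         __builtin_cpu_init();
|         if (__builtin_cpu_supports("avx512f"))
|             puts("AVX-512F reported as available");
|         else
|             puts("AVX-512F not reported");
|         return 0;
|     }
|
| which is why flipping a BIOS switch was enough for software to
| start using it.)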
|
| There is a lot one can criticize Intel for, much of it
| low-hanging fruit. This isn't one of those things.
| fivea wrote:
| > I've voiced this before but I think AVX-512 is completely
| irrelevant in the consumer space. Even for commercial
| applications - sure there are some things that benefit from
| it.
|
| It really doesn't matter whether a random person on the internet
| believes a feature is irrelevant. What matters is whether
| consumers were defrauded by seeing a feature they bought being
| arbitrarily disabled by the manufacturer just because it fits
| their product strategy.
|
| Personally I'd be pissed if I had purchased one based on
| synthetic benchmarks which were then rendered irrelevant by this
| type of move.
|
| And by the way, it wasn't that long ago that so-called experts
| tried to downplay the usefulness of a feature for end-users in
| consumer segments, only to be proven completely and unashamedly
| wrong; see the transition to multi-processor and multi-core
| systems.
| charcircuit wrote:
| >by seeing a feature
|
| The other person just said that the feature wasn't
| advertised. Most people would not be aware it was even
| possible.
| windexh8er wrote:
| Many features of products aren't advertised explicitly.
| You don't expect those to be disabled, however, if part
| of the reason you purchased it was because you found out
| it was there.
|
| I can see Intel's rationale if it were never promised,
| but the timing of the removal of the feature is suspect.
| If Intel is doing this because it sees some level of
| future chip sales being negatively impacted by a feature
| that is in the chip that they previously gave customers
| access to, then they should be held accountable in my
| opinion. If nothing else people burned by this can chalk
| another one up to Intel's misguided sales team. It's as
| if Intel is asking for this sort of attention lately.
| PragmaticPulp wrote:
| > What matters is whether consumers were defrauded by
| seeing a feature they bought
|
| Nobody was defrauded.
|
| The CPU never advertised AVX-512 support. You could never
| enable it "for free". You had to disable the efficiency
| cores and use a BIOS where the manufacturers simply forgot
| to turn it off.
|
| There is no fraud here and it's weird to claim as much.
| vardump wrote:
| I could absolutely make good use of AVX-512 for a lot of
| applications, including consumer space. It's even better than
| AVX2 when it comes to flexibility, and of course, it's double
| width.
|
| Of course the lower clock speed and the associated penalties
| limit it a bit. But you can work around those limitations, for
| example perhaps by dynamically switching between AVX2 and
| AVX-512 paths depending on the workload.
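|
| A rough sketch of the kind of switching meant here (using
| GCC/Clang target attributes; the kernels just sum floats, the
| machine is assumed to have at least AVX2, and the size cutoff is
| invented purely for illustration):
|
|     #include <immintrin.h>
|     #include <stddef.h>
|
|     __attribute__((target("avx2")))
|     static float sum_avx2(const float *x, size_t n)
|     {
|         __m256 acc = _mm256_setzero_ps();
|         size_t i = 0;
|         for (; i + 8 <= n; i += 8)
|             acc = _mm256_add_ps(acc, _mm256_loadu_ps(x + i));
|         float t[8], s = 0.0f;
|         _mm256_storeu_ps(t, acc);
|         for (int k = 0; k < 8; k++) s += t[k];
|         for (; i < n; i++) s += x[i];   /* scalar tail */
|         return s;
|     }
|
|     __attribute__((target("avx512f")))
|     static float sum_avx512(const float *x, size_t n)
|     {
|         __m512 acc = _mm512_setzero_ps();
|         size_t i = 0;
|         for (; i + 16 <= n; i += 16)
|             acc = _mm512_add_ps(acc, _mm512_loadu_ps(x + i));
|         float t[16], s = 0.0f;
|         _mm512_storeu_ps(t, acc);
|         for (int k = 0; k < 16; k++) s += t[k];
|         for (; i < n; i++) s += x[i];   /* scalar tail */
|         return s;
|     }
|
|     /* Only take the 512-bit path for buffers large enough to
|        amortize any frequency/licence transition; the threshold
|        here is made up. */
|     float sum(const float *x, size_t n)
|     {
|         if (n >= 65536 && __builtin_cpu_supports("avx512f"))
|             return sum_avx512(x, n);
|         return sum_avx2(x, n);
|     }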
| junon wrote:
| Thing is, you probably wouldn't use most of the AVX-512
| instructions provided. Some of them are ridiculously
| niche to the point I'd be really curious how _anyone_ has
| actually used them.
| babypuncher wrote:
| AVX-512 gives RPCS3 a pretty hefty boost, though that is
| arguably a niche consumer use case.
| PragmaticPulp wrote:
| > I've voiced this before but I think AVX-512 is completely
| irrelevant in the consumer space
|
| Don't forget that reviewers made a huge deal out of the fact
| that CPUs downclock themselves when running AVX instructions.
|
| Several reviewers tried to make this into some sort of
| scandal at the time, which I'd guess contributed to Intel
| wanting to remove the feature from consumer CPUs.
|
| Of course, many of those same reviewers are now capitalizing
| on Intel removing AVX-512 support while ignoring the fact
| that you had to disable all of the efficiency cores if you
| wanted that feature (worsening thermals and performance in
| non-AVX workloads).
| adgjlsfhk1 wrote:
| To me, it does seem really weird that they opted to
| downclock the entire CPU rather than increase the latency
| of the instructions that caused overheating. That feels
| like it would have been a much less disruptive solution
| that would work strictly better.
| colejohnson66 wrote:
| If you increase latency, the ROB fills up quicker, and
| could get full which would back up the pipe. It makes
| sense why they'd choose to use the default "downclock on
| overheat" rather than add more circuitry to deal with
| that specific issue.
| FrostKiwi wrote:
| > Don't forget that reviewers made a huge deal out of the
| fact that CPUs downclock themselves when running AVX
| instructions.
|
| It was and is kind of a huge deal. I have followed the
| development of Corona Renderer closely for the past decade, and
| one recurring theme, with conflicting evidence, was whether the
| inclusion of AVX was beneficial or detrimental to performance
| due to the resulting downclocking on certain platforms.
|
| And surprisingly, now quickly looking through the search
| function of the Corona Renderer forum, an update apparently
| even dropped AVX because of better performance. (
| https://forum.corona-
| renderer.com/index.php?topic=33889.msg1... ) Though I
| wonder if this is accurate or if the post is missing
| context...
| blacklion wrote:
| It is irrelevant because it is not widespread. AVX-512 is much
| nicer to program against than any of Intel's previous vector
| instructions. It will become relevant once a programmer can
| assume that 50% of her audience has a CPU with AVX-512.
|
| If I want to play with it (as a programmer who is interested in
| DSP on general-purpose hardware) I need to rent a special cloud
| instance at a very non-hobby-friendly price. Additionally,
| benchmarking (of algorithms, not hardware) on a shared instance
| is never a good idea.
|
| I have an old (but fast enough for most of my tasks) i7-6700K at
| my desk now, and I had hoped to upgrade it to something with
| full AVX-512, but alas. What is the cheapest way to get a local
| AVX-512-capable system (I know about the exotic, low-power
| i3-8121U, let's ignore it)?
|
| Unless developers are able to buy a reasonably priced system
| (think: current non-extreme i5/i7 prices, a mid-level MoBo, not
| low-end, but not a gaming or server one), AVX-512 will remain
| completely irrelevant.
|
| Yes, GPGPU is affordable now (or not? Prices for video cards
| are insane!), but not all tasks go well with GPGPU, where the
| transaction cost is insane (memory is slow, but PCIe is much
| slower and has much larger latency).
|
| Update: And no, I don't need any "efficiency" cores in my
| desktop, thank you. I'm not sure I need them in my laptop
| either, but I'm pretty sure about my desktop.
| jeffbee wrote:
| If you just want to play with AVX-512, you can rent an Ice
| Lake Xeon on AWS EC2 for only 3¢/hour, which strikes me as
| very hobby-friendly pricing.
| pantalaimon wrote:
| But why would you want to play around with AVX512 if you
| can't benefit from it on your local machine?
| smilekzs wrote:
| e.g. to develop software that works well on servers that
| do support them?
| gameswithgo wrote:
| Everything is rarely used until the instructions are
| ubiquitous.
| kergonath wrote:
| > On the other hand, I work in HPC and even in (commercial,
| not nuke simulations or whatever the nuke powers do with
| their supercomputers) HPC applications AVX-512 is rarely
| used.
|
| Even in the nuke simulations it is rarely used. More recent
| cores might be better, but the frequency drop and the
| associated latency kill performance on the clusters I know.
| And the new generation ones are AMD anyway.
| gnufx wrote:
| As ever, it depends, probably on whether your code is
| dominated by matrix-matrix linear algebra. BLIS DGEMM on my
| SKX workstation runs at ~88GF, or ~48 if I restrict it to
| using the haswell configuration (somewhat different on a
| typical compute node).
|
| But yes, I'd rather have twice the cores and memory
| bandwidth with AVX2. For those that don't know: non-
| benchmark code usually doesn't get close to peak floating
| point performance, constrained by memory bandwidth and/or
| inter-node communication. Everyone is trying to get the
| compute performance from GPUs anyway.
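|
| (If anyone wants to reproduce that kind of GF number, a rough
| sketch follows; it assumes a CBLAS-providing library such as
| BLIS or OpenBLAS is linked in, and the matrix size is arbitrary.
| DGEMM does roughly 2*n^3 floating-point operations:
|
|     #include <cblas.h>
|     #include <stdio.h>
|     #include <stdlib.h>
|     #include <time.h>
|
|     int main(void)
|     {
|         const int n = 4000;   /* arbitrary square problem size */
|         double *A = malloc((size_t)n * n * sizeof(double));
|         double *B = malloc((size_t)n * n * sizeof(double));
|         double *C = malloc((size_t)n * n * sizeof(double));
|         for (size_t i = 0; i < (size_t)n * n; i++) {
|             A[i] = 1.0; B[i] = 2.0; C[i] = 0.0;
|         }
|
|         struct timespec t0, t1;
|         clock_gettime(CLOCK_MONOTONIC, &t0);
|         cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
|                     n, n, n, 1.0, A, n, B, n, 0.0, C, n);
|         clock_gettime(CLOCK_MONOTONIC, &t1);
|
|         double s = (t1.tv_sec - t0.tv_sec)
|                  + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
|         printf("%.1f GFLOP/s\n", 2.0 * n * n * n / s / 1e9);
|         free(A); free(B); free(C);
|         return 0;
|     }
|
| and whether the library ends up in its AVX2 or its AVX-512
| kernels is exactly the difference mentioned above.)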
| jrockway wrote:
| > No one claimed it was there, would work, would be stable or
| anything to that tune.
|
| Isn't that true of like 99.9999% of CPU features? I have
| never seen an Intel presentation or piece of marketing
| material that said my software would be able to use the EAX
| register, and yet it keeps showing up year after year.
| dpark wrote:
| Intel explicitly states which instructions sets each
| processor supports. They don't list every random register
| or instruction. Those are captured in the detailed
| documents.
|
| e.g. https://www.intel.com/content/www/us/en/products/sku/2
| 26066/...
| gnufx wrote:
| > HPC applications AVX-512 is rarely used
|
| Your applications don't do linear algebra or FFT? AVX-512 is
| overrated for HPC generally, but it's surely going to be used
| by an optimized BLAS (unless there's only one FMA unit per
| core and the implementation is careful about that).
|
| Compute nodes actually should have a big.little structure
| with a service core for the batch daemon etc.
|
| Elsewhere, I see ~130 AVX512 instructions in this Debian
| system's libc.
| jacquesm wrote:
| I'd normally say this is paranoid but with Intel I'll have to
| give you the benefit of the doubt.
| zinekeller wrote:
| I'll be honest, this looks to be a simple mistake. Cutting
| AVX-512 literally (meaning in hardware) requires actual staff
| doing that cut, plus it'll increase turnaround times, plus the
| efficiency cores don't have it. Unlike "let's bin perfectly
| functional processors into a lower-cored product", this is not
| economically logical for Intel, and economics is usually the
| closest thing to the truth.
| pantalaimon wrote:
| Then why not leave it as an option for people who want to
| experiment with the feature?
| zinekeller wrote:
| Fair point, but again most Windows programs aren't designed
| with heterogeneous CPUs in mind, so probably they don't
| want to deal with the headaches, plus the space used for
| AVX-512 (note that it's now in microcode) can be repurposed
| for more important tasks.
|
| Of course, this could be evil Intel playing its tricks
| again, but financially speaking removing AVX-512 does have
| tangible benefits to them in the form of more microcode space
| for the rest of the instructions (I want to see
| the detailed Alder Lake errata, maybe there's indeed a bug
| in another instruction that requires more microcode to
| implement - AVX-512 is an easy sacrifice for that).
| marcan_42 wrote:
| It's not in microcode, they're just disabling it in a
| microcode update. Microcode updates aren't replacements
| for microcode either, they're patches. The vast majority
| of microcode is in ROM and can never be changed
| wholesale; updates can just patch in hooks and change
| certain things, limited by the number of patch slots
| available. Doing this does not "free up" any microcode.
| It's purely disabling a feature, there is absolutely
| nothing to be gained in return.
| qayxc wrote:
| One word: support.
|
| Once you advertise a feature, you have to support it. It'd
| be pretty damn hard for Intel to explain why lower tier i5
| and i3 CPUs have a feature that higher tier i7 and i9 SKUs
| are missing unless you jump through hoops via hardware
| (e.g. disable E-cores) or software (e.g. making sure to
| only run your process on P-cores).
|
| If you want to experiment with AVX-512, just get an older
| CPU. Much less hassle for both Intel and their customers in
| this particular case.
| nullifidian wrote:
| They don't support CPUs that were overclocked, yet
| advertise that feature, so support isn't the reason.
| qayxc wrote:
| > yet advertise that feature
|
| And therein lies the difference. They advertise the
| feature, which is something they never did for AVX-512 on
| Alder Lake desktop. If they advertise a feature, they
| cannot disable it without getting into legal trouble.
|
| If you get a K-SKU, you get an unlocked multiplier. That's
| guaranteed by Intel and that's where their support begins
| and ends.
|
| They cannot do the same for AVX-512, though, because
| that's not possible in their heterogeneous architecture.
| There's currently no desktop OS that supports different
| CPU cores (as in capabilities) on the same socket (or
| even board).
|
| Such a feature would require kernel drivers, thread
| schedulers, and possibly compiler toolchains to be
| modified for at least Linux and Windows, and _that's_ the
| kind of support Intel would need to provide. They did
| provide a thread scheduler, but to my knowledge that
| didn't include any AVX-512 related logic.
| fivea wrote:
| > One word: support.
|
| That hypothesis is not credible. They can simply declare
| a feature is unsupported, and even that using it voids
| some guarantee. I mean, look at how overclocking is
| handled.
| qayxc wrote:
| You can still use the feature at your own risk - no one
| is forcing you to install the microcode update.
| bravo22 wrote:
| "cuts" are almost always fuses that are blown by ATEs during
| wafer manufacturing or post packaging tests. It is a bit
| flip.
|
| There is no extra cost.
| zinekeller wrote:
| And if it wasn't designed to be "blown" in the first
| place, in the hope that the E-cores would also get AVX-512? In
| that case, it's a literal cut and not simply blowing the
| fuse.
| marcan_42 wrote:
| They always have fuses for this stuff. They have fuses
| for everything. Modern silicon has piles and piles of
| chicken bits and fuses to make sure they cover all their
| bases. There is zero chance they didn't have a hard fuse
| to disable a known controversial/segmentable feature like
| AVX-512.
| bravo22 wrote:
| If it wasn't designed to be disabled it would be buried
| under many, many metal layers. You can't "cut"
| it like a wire. Metal layers run all over the place [1].
|
| Therefore it would have an e-fuse or a regular old
| 'current' fuse -- which is just passing a calculated
| amount of current through an intentionally thin
| interconnect so that it burns away.
|
| 1 - https://static.techspot.com/articles-
| info/1840/images/2019-0...
| ChuckMcM wrote:
| This sort of thing is always a bad look for Intel, die costs are
| relatively fixed and "upselling" some of the transistors on the
| die for increased margins will always fail against a competitor
| who is willing to enable all the things (and AMD certainly seems
| to be in that camp). What it means is that AMD has an "easy"
| market strategy for continuing to beat Intel, just out feature
| them at a lower margin, turn Intel's fab capacity into an anchor
| rather than an asset.
| tgsovlerkhgsel wrote:
| How much control do users have over microcode updates? In other
| words, how hard is it to avoid or roll back a user-hostile
| update like this?
| qayxc wrote:
| Depending on the OS, microcode updates are under full user
| control.
|
| You can apply them or not, at your own discretion.
|
| The problem is knowing which particular update includes the
| change.
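|
| At least checking what is currently loaded is easy; on x86 Linux
| the kernel exports the revision in /proc/cpuinfo, e.g. a tiny
| sketch:
|
|     #include <stdio.h>
|     #include <string.h>
|
|     int main(void)
|     {
|         FILE *f = fopen("/proc/cpuinfo", "r");
|         if (!f) { perror("/proc/cpuinfo"); return 1; }
|         char line[256];
|         while (fgets(line, sizeof line, f)) {
|             /* one "microcode : 0x..." line per logical CPU */
|             if (strncmp(line, "microcode", 9) == 0) {
|                 fputs(line, stdout);
|                 break;   /* normally identical on every CPU */
|             }
|         }
|         fclose(f);
|         return 0;
|     }
|
| Matching that revision to what a given BIOS or distro package
| ships is the part you have to dig for.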
| marcan_42 wrote:
| You can only upgrade microcode on any given boot, not
| downgrade it. Therefore, if the BIOS has already upgraded
| your microcode for you on boot, you can't undo that from the
| OS.
| qayxc wrote:
| The BIOS doesn't upgrade microcode - the OS does.
|
| During boot there's usually no network connectivity, so how
| would the BIOS even know of an update, let alone acquire
| it?
| marcan_42 wrote:
| The BIOS itself comes with microcode. You upgrade your
| BIOS for whatever reason and it comes with new microcode
| applied on boot, which you can't disable. Sure, if you
| don't upgrade your BIOS you can keep using the old
| microcode. But you might have to, to fix bugs or improve
| performance.
| numpad0 wrote:
| A common misconception about Intel microcode is that it's
| stored in some sort of flash memory; it's not. The update
| vaporizes on each reset and the same "update" is loaded on
| each bootup.
| 323 wrote:
| It's explained in the article.
|
| You update your BIOS to get better DDR5 compatibility,
| but the new BIOS will also include a microcode update.
| zokier wrote:
| The CPU can update its microcode before executing BIOS,
| the BIOS can do its updates, and then the OS can do its
| stuff. It's all described in https://www.intel.com/conten
| t/www/us/en/developer/articles/t...
| qayxc wrote:
| That's just when the update is applied, not how it gets
| there.
|
| The FIT is still updated externally (e.g. by the OS) and
| doesn't magically fill itself with new microcode.
| [deleted]
| tomxor wrote:
| They both have the opportunity to upgrade it.
|
| The BIOS holds a copy of the microcode and this can be
| upgraded by upgrading the BIOS/EFI (bundled with BIOS
| updates). The same for the OS.
|
| On boot the BIOS loads its update first, then the OS,
| provided each is newer than the currently loaded
| microcode.
|
| So the idea is that if you don't like this update, you
| need to prevent your BIOS from updating (or downgrade
| it), and then also configure Linux to either not upload
| its microcode, or to load a specific one.
| qayxc wrote:
| I misread that there, so disregard the first reply.
|
| Point still stands: no one forces you to upgrade your BIOS
| (unless there are problems), and even then, patching the BIOS
| upgrade itself is still an option.
|
| So it remains under user control.
| sesuximo wrote:
| Really missed an opportunity to give something to people for free
| jacquesm wrote:
| Or to get some good PR, which they are in real need of.
| PragmaticPulp wrote:
| No, the AVX-512 instructions weren't free. You had to disable
| all of the E cores. It only made sense if you could afford to
| give up the extra cores to accelerate a few instructions in a
| smaller number of threads.
|
| It wasn't a net win for the average person. I doubt many people
| would ever do this.
| oofabz wrote:
| E cores are only present on the high-end Alder Lake SKUs. If
| you have a mid-range chip with only P cores like the
| i5-12400, enabling AVX-512 really is free.
|
| Such P-core-only chips are attractive to Linux users. Intel
| Thread Director, their Alder Lake scheduling software, does not
| work as well on Linux as it does on Windows. It has a
| tendency to schedule processes on the E cores even when P
| cores are available, which is bad for performance.
|
| There is significant overlap between users interested in
| AVX-512 and users interested in Linux performance, making the
| mid-range Alder Lake chips especially appealing to this
| group. Unfortunately they top out at 6 cores with no ECC, so
| it's not a perfect match.
| pantalaimon wrote:
| So why not leave that option for people who want to write and
| test AVX512 code?
|
| It's almost like Intel doesn't want people to use that
| feature
| ComputerGuru wrote:
| The AVX-512 rollout has been a complete disaster. Look at how
| quickly AVX2 became widespread and targetable. By all metrics,
| AVX-512 adoption has been abysmal and most of the blame lies
| squarely on Intel's shoulders. And even now that platforms are
| beginning to support it, developers just aren't interested
| because AVX2 + optimizations got them most of the way there.
| Then there's the heavy performance hit that regular code
| intermixed with AVX-512 incurs.
| mhh__ wrote:
| That performance hit doesn't necessarily exist on a given
| workload, because the hysteria over downclocking was mainly
| with the very early desktop implementations of AVX-512. You'd
| have to measure it and see nowadays (and consider that the
| downclocking may be simply due to yourself actually using all
| the execution units at once)
| ComputerGuru wrote:
| Whether the hit is there or not is beside the point so long
| as it's _perceived_ to be there by a not insignificant
| portion of the developers that might otherwise chase after an
| AVX-512 implementation.
|
| (Citation needed for my initial claim that the perception of
| the performance hit exists within that population.)
| pantalaimon wrote:
| Since Linux loads the microcode on its own, can this be used to
| apply an old microcode or is there a prevention against
| downgrading the version?
| marcan_42 wrote:
| I'm pretty sure downgrades are blocked. You need to hack the
| BIOS so the (volatile) upgrade never happens on boot.
| nullify88 wrote:
| I wouldn't call it hacking. If you can find or back up the BIOS
| image for your motherboard, you can replace the microcode in
| it and reflash. BIOS images are modular and the microcode is
| a replaceable component.
| marcan_42 wrote:
| Hacking BIOS images isn't always that easy. Often they are
| signed (EFI Capsules) and the standard update utility in
| the BIOS menu will reject modified versions. I had to use a
| convoluted flashing process to make a trivial patch to a
| BIOS a few years back for this reason.
| csdvrx wrote:
| You mean, SOIC clip + flashrom?
| 3836293648 wrote:
| Microcode updates are applied at runtime, every time you boot.
| At least on the CPU side. The motherboard might apply
| them before the OS loads though, so who knows what happens
| there
| marcan_42 wrote:
| That's what I said; the updates are volatile but the OS
| can't undo an update that the BIOS already applied on boot.
| timpattinson wrote:
| I think for the subset of people that care about having
| AVX512 on their 12900K, and bother to disable the E-cores to
| achieve it, they will be using enthusiast motherboards that
| don't block bios upgrades in this way.
| marcan_42 wrote:
| Microcode downgrades are blocked by the CPU itself; the
| BIOS doesn't get a say. You could downgrade the whole BIOS
| to stop the upgrade being applied on boot (maybe, some
| motherboards may try to block this too), but then you need
| to choose between uncrippled microcode and BIOS bugfixes
| and improvements.
| christkv wrote:
| Is this not the definition of bait and switch? How can this not
| end up in a lawsuit and investigation from regulatory bodies ?
| gambiting wrote:
| It was never advertised, and Intel very specifically said
| AVX-512 instructions will not be available. Some motherboard
| manufacturers made them available anyway, now Intel is fixing
| that.
|
| It's like buying a car with an advertised speed limiter of
| 155mph, then finding that actually it can go 170mph, and then
| the manufacturer fixes the speed limiter with a software
| update. Yes we know the car _could_ go faster, but the speed
| limited version is the one that was actually advertised.
| adrian_b wrote:
| Intel very specifically said that AVX-512 instructions will
| not be available only a few weeks before the Alder Lake
| launch.
|
| While it cannot be said that Intel advertised AVX-512 for
| Alder Lake, at all previous disclosures it was said that the
| Golden Cove cores have AVX-512 and the Gracemont cores do not
| have it.
|
| It was clearly said that in hybrid configurations AVX-512
| will be disabled, because for Microsoft it is too difficult a
| task to implement scheduling on a system with heterogeneous
| cores.
|
| Whether AVX-512 can be enabled by disabling the Gracemont
| cores was not said, but everybody interpreted that saying
| nothing about this means that it will be possible to enable
| AVX-512, because Alder Lake is a replacement for Rocket Lake
| and Tiger Lake, both of which have AVX-512.
|
| This is one of a very few cases, if not the only case, when
| Intel replaced a CPU product without preserving backward
| software compatibility.
|
| If this was their intention from the beginning, then they
| certainly should have said it much earlier, not just
| immediately prior to launch.
| gambiting wrote:
| >>Whether AVX-512 can be enabled by disabling the Gracemont
| cores was not said
|
| Reading up on it, I was under the impression that's exactly
| how it worked, no? If you disable the efficiency cores in
| BIOS, the enable AVX-512 option would appear on a select
| few motherboards. You can't have all cores enabled _and_
| keep AVX-512 enabled at the same time.
| adrian_b wrote:
| This was possible on most motherboards at launch.
|
| Now however, Intel has issued a BIOS update that no
| longer allows enabling AVX-512 when the Gracemont cores
| are disabled.
|
| The motherboards produced from now on will have the new
| BIOS version.
|
| Keeping the original BIOS on the existing Alder Lake
| motherboards is not a good choice, because the new BIOS
| version also improves stability in certain memory
| configurations.
| solarkraft wrote:
| Not sure how others do it, but I don't only buy products
| based on advertisements, but rather by _the actual properties
| the product has_.
|
| But it's not like this changes anything legally. Products get
| worse due to software changes all the time (especially
| through cloud services shutting down) and as far as I know
| there haven't been any successful lawsuits about it.
| jacquesm wrote:
| > now Intel is fixing that
|
| That's not a fix. That's a disabling. A fix repairs
| something.
| protastus wrote:
| If AVX-512 was not in scope yet left turned on, parts may
| operate outside design limits, there's system-level
| validation that may not have been run, and specs that are
| at risk. Think about the scale of all PC OEMs that use
| these CPUs, rather than a specific product.
|
| For example, it's known that AVX-512 is power hungry and
| significantly increases TDP (heat). Do all Alder Lake based
| products have the thermal headroom? If CPUs operate outside
| of the envelope communicated by Intel, they may not be
| entirely stable, and parts could fail earlier than
| expected.
| gambiting wrote:
| Bringing a product in line with specification is very much
| fixing it. Even if the fix happens to be disabling
| something.
| adrian_b wrote:
| There was no specification about this feature until
| immediately prior to launch.
|
| When a company announces, years in advance, a new product
| (e.g. Alder Lake) that is the replacement for their previous
| products (i.e. Rocket Lake & Tiger Lake), and in that series
| of products the most valuable feature is backward software
| compatibility, and the new product will no longer have this
| feature, one would expect the company to publicize vigorously
| the fact that the replacement product will not match the
| features of the products it replaces.
| gambiting wrote:
| Sure, but that's a customer expectation vs legal
| obligation.
|
| Like, my sister bought the new M1 MacBook Air, only to
| discover that it doesn't support dual external screens -
| while her previous Air did. So there was absolutely an
| expectation there that any new MacBook Air would also
| support dual screens, right?
|
| But, at the end of the day - it is mentioned in the spec
| sheet. She could have checked. Just assuming that a
| feature is there is not enough. What's more - if dual
| screens worked originally, and then they stopped working
| after an update - that wouldn't be a bait and switch
| either, it would be a fix to bring the computer back in
| line with its spec.
|
| And yes, I agree that it would suck.
| marcan_42 wrote:
| The difference is dual screens don't work on the M1 MBA
| because it physically only has two display controllers in
| the silicon, one of them wired to the internal panel, and
| the other muxed to the two TB ports. The M1 Pro has two
| external display controllers, and the M1 Max four. I can
| show you where they are on the die shots, and that it's
| not artificial crippling.
|
| But what Intel is doing here is disabling existing
| silicon that exists and works.
| gambiting wrote:
| Sure, but as a customer that's irrelevant, right? You had
| a MacBook Air, used it with two displays, then you buy a
| new MacBook Air and bam, it doesn't work with two
| displays. The technical nitty gritty is not really
| relevant - at least the comment I was replying to sounded
| this way. That what the customer expects is more
| important than what is on the spec sheet? The fact that
| previous Intel CPU supported AVX-512, and the new one
| doesn't - so whether this fact is or isn't mentioned in
| the spec sheet is not important, because what the
| customer expects should be ultimately what decides what
| is "ok".
| jacquesm wrote:
| Yes, but once it has shipped and people may have come to
| depend on it it will break stuff, not fix stuff.
|
| Besides the obvious benefit of first having benchmarks
| out there claiming these chips are better than they
| really are. So this change just benefits Intel, and
| nobody else. If it would be a fix then it would be that
| something that was advertised did not work, and now it
| does.
| Someone wrote:
| > Yes, but once it has shipped and people may have come
| to depend on it it will break stuff, not fix stuff.
|
| That may apply to every change, including things
| everybody agrees on to be bug fixes. If, for example, you
| improve the number of correctly returned bits for
| computing _sin_ , that can break programs, for example
| games that want to keep world models in exact sync across
| systems.
|
| From what I read here, intel didn't advertise this
| feature and it wasn't easily discovered. If so, I don't
| think customers have any claim against Intel for the
| CPUs.
|
| _If_ motherboard vendors advertised/promoted it (could be
| as simple as blogging about it), I would think people who
| bought a motherboard because it allows activating this
| feature will have a case against them (in the EU and
| possibly elsewhere, the seller is responsible for the
| product being fit for purpose, not the manufacturer, so
| it would be the seller, but let's ignore that)
|
| Legally, there also is the issue who applies that update.
| From what I read, that's the motherboard. Here again, I
| would say that, if customers have a case, it would be
| against whomever sold them the motherboard (e.g. if that
| silently applies the update) I don't see any indication
| that Intel, as a CPU manufacturer, forces existing
| customers to install this.
| gambiting wrote:
| To add to this - even if the motherboard manufacturer
| allows this despite Intel's advice, in their stock
| configuration none of those CPUs allow AVX-512
| instructions out of the box. You need to knowingly and
| consciously go into the BIOS and disable all efficiency
| cores first, to make this option even appear. So "as
| sold" the product doesn't support AVX-512 instructions
| and the advertising is absolutely correct. Just like you
| can overclock the CPU but it doesn't come overclocked out
| of the box(and Intel assumes no liability if you do
| overclock it).
| gambiting wrote:
| >>Yes, but once it has shipped and people may have come
| to depend on it it will break stuff, not fix stuff.
|
| If people depend on functionality that is explicitly
| unsupported then I don't know what to say other than that
| I don't see how that's Intel's responsibility. If you buy
| a CPU that doesn't support AVX-512 instructions in order
| to use AVX-512 instructions then ...I think you're the
| one who is wrong here.
|
| To go back to my car analogy - if you buy a car that
| isn't type approved for towing, and yet you install a tow
| bar anyway, you can't complain to the manufacturer if
| stuff breaks.
|
| >>Besides the obvious benefit of first having benchmarks
| out there claiming these chips are better than they
| really are
|
| Are any of the published benchmarks using AVX-512
| instructions that were used by Intel in advertising, and
| were those in fact available at the time when the
| benchmarks were ran?
|
| >>If it would be a fix then it would be that something
| that was advertised did not work, and now it does
|
| Those CPUs were not compliant with their own published
| spec, now they are - it is absolutely a fix.
| josephcsible wrote:
| This has nothing to do with advertisement. You should not be
| able to retroactively make products that you've already sold
| to people be less useful, whether or not the feature you're
| clawing back was advertised.
| gambiting wrote:
| I disagree completely. The product as sold wasn't compliant
| with its own technical spec sheet - now after the update it
| is. The feature isn't supported by that CPU and it should
| have never been exposed to the consumer. If you discovered
| that the CPU has 8 cores despite being sold as a 6 core,
| then removing 2 cores with a BIOS update wouldn't be
| "making the product less useful" either. It just makes the
| product exactly as specified in its technical documentation
| - and that is how it should be.
|
| Let me play a devil's advocate here - what if using the
| feature actually damages the processor after a while? AVX
| instructions always generate a tonne of heat. Maybe that's
| the reason why it was meant to be unsupported in the first
| place. Should Intel be allowed to fix the CPU and bring it
| in line with spec, or is that "making the product less
| useful"?
|
| Edit: also - Intel isn't forcing anyone to install this
| BIOS update. If you want to keep it with AVX instructions
| available, at the cost of disabling all efficiency cores -
| sure, keep it that way.
| josephcsible wrote:
| > If you discovered that the CPU has 8 cores despite
| being sold as a 6 core, then removing 2 cores with a BIOS
| update wouldn't be "making the product less useful"
| either.
|
| Yes it would.
|
| > what if using the feature actually damages the
| processor after a while?
|
| Tell the owners this, and let them choose whether or not
| to take that risk.
|
| > Intel isn't forcing anyone to install this BIOS update.
|
| But then you'll be forever vulnerable to whatever the
| next variant of Spectre is. You can't cherry pick just
| the good parts of the update while excluding the bad
| parts.
| antman wrote:
| Good example. It's like buying a car and finding out it has a
| nice radio that was not advertised. After a few months the
| car dealership sends someone to break into the car and rip it
| off. Fair enough since it wasn't advertised, right?
| gambiting wrote:
| Well, breaking into your car is still illegal, so no, not
| fair enough.
|
| More like you buy a car without paying for a satellite
| radio(or that feature isn't advertised and isn't in the
| spec sheet), and then that feature gets removed next time
| you bring the car in for service. That is absolutely fine.
| nitrogen wrote:
| Based on the lawsuit about Tesla removing features from a
| used car it seems like it's not fine.
| [deleted]
| MauranKilom wrote:
| This is an undocumented feature, from what I gather. So it's
| missing the "bait" part of "bait and switch".
| alophawen wrote:
| The performance benchmarks of Alder Lake are far from
| undocumented at this point.
| detaro wrote:
| You had to explicitly turn it on through hacks in the BIOS
| some motherboard makers added. If people enable a feature
| that's explicitly not supported on their benchmark rigs
| without disclosing it, that's not the vendor's fault.
| dijit wrote:
| I'm not sure if benchmarks that are out have the AVX-512
| enabled or not.
|
| If they do then this is indeed fraud.
|
| They can always claim that those were engineering samples and
| not meant to be tested, but, I think they knew.
|
| But: that's assuming a lot. It's possible that the ES chips
| sent to reviewers also had AVX-512 disabled, and that
| benchmarks did not make use of the instruction anyway.
| my123 wrote:
| > I'm not sure if benchmarks that are out have the AVX-512
| enabled or not.
|
| No, because half the cores on those don't even have AVX-512
| in the first place. (to enable AVX-512, you have to disable
| all the eCores)
|
| As such, running those systems with AVX-512 turned on was
| academic, but not practically used.
| jsnell wrote:
| It's not even undocumented, but explicitly documented as a
| feature those CPUs don't have. Originally it was supposed to
| be fused off physically; I wonder why it wasn't.
| sodality2 wrote:
| To get the positive wave of initial reviews from people who
| enable it.
| bee_rider wrote:
| Who did this? Phoronix got AVX-512 working, but was very
| clear that this was unexpected and had a good chance of
| going away before release. If anything, this is a better
| test for your reviewers -- if a reviewer has published
| benchmarks using these unsupported features without
| mentioning it, you should keep that in mind when
| listening to their reviews in the future.
| Cloudef wrote:
| Easier to sell the same product twice just by flipping a
| bit in software
| davidlt wrote:
| It could be that the decision to not support AVX512 was made
| very late in product development, and thus early batches
| didn't have the feature disabled via the chicken bit and/or
| fused off.
|
| There might be gazillions of reasons why this was done.
| causi wrote:
| And here I thought it might've been Intel's way of apologizing
| after robbing us of performance with the Spectre and Meltdown
| debacles.
| qayxc wrote:
| Much ado about nothing, IMHO.
|
| AVX-512 isn't advertised by Intel as a 12th gen desktop CPU
| feature and doesn't provide any advantages for most desktop users
| anyway.
|
| Sure, there's niche applications like some particular PS3
| emulator, but I don't share the author's opinion w.r.t.
| performance and efficiency in _desktop_ use.
| nullifidian wrote:
| It doesn't provide any advantages because there is no install
| base, so desktop software doesn't utilize it. It's just a wider
| SIMD instruction set, and I'm confused why people relegate it
| to HPC only category.
| CodesInChaos wrote:
| AFAIK it downclocks the CPU, even if the AVX512 instructions
| are only a tiny fraction of instructions. So you only benefit
| from AVX512 if you use it enough. While 128-bit SIMD gives you
| a speedup if you use it for a few functions that need it,
| without slowing down the rest of your code or even unrelated
| applications.
| dragontamer wrote:
| There's significant downclocking only on Skylake-X (aka:
| 1st generation teething problems).
|
| On more recent cores, there's only ~100 MHz of
| downclocking, which is so small you can pretty much ignore
| it.
| jeffbee wrote:
| It's amazing how sticky the myth has become, though. On
| Ice Lake Xeon there is basically no penalty for using the
| AVX-512 unit.
| dragontamer wrote:
| It was true for several years, because Intel got "stuck"
| on Skylake due to issues in their manufacturing.
|
| If Intel was able to get Icelake out on time, it may have
| reduced the "stickiness" of the myth.
| jeffbee wrote:
| True. But the power penalty from 512 was already quite
| reduced in Cannon Lake (2018) and completely gone in
| Tiger Lake (2020).
| nullifidian wrote:
| In heavily vectorized applications you still get a sizable
| speed up over AVX2, e.g. in tasks like video encoding. In
| mixed/real-time applications this issue could be solved by
| dedicating/pinning a thread with the AVX-512 payload to a
| core, automatically or manually. Not sure if such policies
| have been implemented in practice.
| josephcsible wrote:
| I wouldn't say that Intel taking away a working, if niche,
| feature from their hardware retroactively after users bought it
| is "nothing".
| qayxc wrote:
| It worked for one SKU at launch and by disabling up to 50% of
| the cores of the others. Calling that a working feature is a
| bit of a stretch already.
| xlazom00 wrote:
| Any idea how big an area of the CPU silicon AVX-512 really
| takes? And whether it shares some parts with AVX2 or not?
| bee_rider wrote:
| The article has some bits of the chip marked off (with big red
| X's). They look reasonably sized to be the AVX-512 parts, in
| the sense that they are quite large but not the majority of the
| chip or anything.
| NavinF wrote:
| I'm not surprised. A lot of people were burned last year when
| their entire CPU downclocked as soon as one application started
| using AVX-512. That killed all interest even before Alder Lake.
|
| Also see Linus's rants in this thread:
| https://www.realworldtech.com/forum/?threadid=193189&curpost...
|
| And the discussion: https://news.ycombinator.com/item?id=23809335
|
| Today only benchmarks and HPC workloads use AVX-512. I'm sure
| Intel is happy to force HPC customers to pony up for data center
| CPUs.
| qayxc wrote:
| > Today, only benchmarks and HPC workloads make use of it.
|
| From what I've heard, there's also a PS3 emulator that benefits
| greatly from it and its users are quite annoyed by this [0].
|
| According to the sources, users would need to disable E-cores
| and preferably HT as well to get the best results, though.
| Pretty niche and not helpful in general (since disabling up to
| 50% of the cores/threads doesn't seem great for other use
| cases, but hey - 10% more performance in PS3 emulation).
|
| [0]
| https://www.tomshardware.com/news/ps3-emulation-i9-12900k-vs...
| PragmaticPulp wrote:
| Enabling AVX-512 also requires the efficiency cores to be
| disabled. A lot of people are getting upset about this, but it
| wasn't really a free benefit for anyone except those with very
| specific AVX-512 workloads that didn't benefit from the extra
| cores.
| mhh__ wrote:
| Technically Intel actually did the things Linus asked for and
| did AVX-512 as well (on the server at least): The Golden Cove
| cores are extremely wide and brought a big single thread
| performance boost.
| [deleted]
| soheil wrote:
| "I hope AVX512 dies a painful death, and that Intel starts fixing
| real problems instead of trying to create magic instructions to
| then create benchmarks that they can look good on...
|
| I absolutely detest FP benchmarks, and I realize other people
| care deeply. I just think AVX512 is exactly the wrong thing to
| do. It's a pet peeve of mine. It's a prime example of something
| Intel has done wrong, partly by just increasing the fragmentation
| of the market." Linus T
| jiggawatts wrote:
| Linus Torvalds' opinion of AVX512 just doesn't matter.
|
| It's like asking a 3D game programmer their opinion about
| FPGAs.
| wellthisisgreat wrote:
| is the Z690 platform worth upgrading to from Z590, i.e. 11700K -> 12700K?
| Brave-Steak wrote:
| blackoil wrote:
| Funny how many times on this site I read about futility of
| AVX-512. But now that a hack enabling it is fixed, suddenly
| everyone needs it.
| GeekyBear wrote:
| Once power and heat in a chip go high enough, electromigration
| causes reliability issues.
|
| It was already thought that this would become a problem to worry
| about at smaller process nodes, but does AVX-512 on Alder Lake
| push voltage and heat enough that it is an issue already?
|
| >Electromigration is the movement of atoms based on the flow of
| current through a material. If the current density is high
| enough, the heat dissipated within the material will repeatedly
| break atoms from the structure and move them. This will create
| both 'vacancies' and 'deposits'. The vacancies can grow and
| eventually break circuit connections resulting in open-circuits,
| while the deposits can grow and eventually close circuit
| connections resulting in short-circuit.
|
| https://www.synopsys.com/glossary/what-is-electromigration.h...
|
| >Aging Problems At 5nm And Below
|
| https://semiengineering.com/aging-problems-at-5nm-and-below/
___________________________________________________________________
(page generated 2022-01-07 23:00 UTC)