[HN Gopher] Intel's "Cripple AMD" Function (2019)
___________________________________________________________________
Intel's "Cripple AMD" Function (2019)
Author : phab
Score : 251 points
Date : 2022-04-06 17:27 UTC (5 hours ago)
(HTM) web link (www.agner.org)
(TXT) w3m dump (www.agner.org)
| mhh__ wrote:
| This could arguably be dated 2009, as that is (approximately) when it
| was originally discovered.
| https://www.agner.org/optimize/blog/read.php?i=49
| ReleaseCandidat wrote:
| It has been discovered before that, at least in 2005:
|
| https://techreport.com/news/8547/does-intels-compiler-crippl...
| snvzz wrote:
| RISC is less amenable to this category of BS.
|
| Looking forward to RISC-V pushing x86 into retrocomputing
| territory.
| mechanical_bear wrote:
| So it appears not only is this posting from 2019, but the most
| recent information they reference is 2010. This seems to be no
| longer relevant? I'd love it if submissions on HN had a small
| blurb from the author explaining why their submission is
| interesting/relevant.
| shrx wrote:
| It certainly is still relevant. Please read the article (and
| the 2020 update below) before commenting on it.
| mechanical_bear wrote:
| I didn't state it was not relevant, I asked. Glad to see
| there is an update. My main point stands though, I'd love to
| see an explanation with posts.
| Anunayj wrote:
| Yup, in fact MATLAB recently applied a fix for this in their
| software [1]
|
| [1] https://www.extremetech.com/computing/308501-crippled-no-
| lon...
| MikePlacid wrote:
| Thank you for the link. It helped me to find the actual
| performance difference. It is significant:
|
| _AMD's performance improves by 1.32x - 1.37x overall...
| changing what looked like a narrow victory [for Intel] over
| the 3960X and a good showing against the 3970X into an all-
| out loss._
| https://www.extremetech.com/computing/302650-how-to-
| bypass-m...
| nonplus wrote:
| There are 2020 updates around MKL (But you may be correct that
| that content is about 2019 MKL optimizations).
|
| At any rate though, based on Intel's track record I think this
| content is still relevant and of value to engineers who don't
| have domain knowledge in compilers or work downstream.
| bee_rider wrote:
| Has anyone tried a recent version of MKL on AMD? I assume they
| were shunting AMD off into an AVX codepath because pre-Zen AMD
| lacked AVX2 (well, Excavator had I guess...).
|
| If they are sending Zen down the generic AVX2 codepaths by
| default and those are competitive with, say, openBLAS, that seems
| reasonable, right?
|
| Hopefully BLIS will save us all from this kind of confusion
| eventually.
| penguin_booze wrote:
| This sounds to me very much like VW cheat devices: detect the
| current situation, and "act accordingly".
| JoshTriplett wrote:
| If Intel had shipped a library/compiler that _did_ just use
| feature flags and didn't check the CPU vendor, and the resulting
| code used features that on AMD ran much more slowly than the
| equivalent unoptimized code, would people blame AMD for the slow
| instructions, or blame Intel for releasing a library/compiler
| that they didn't optimize for their competitor's processor?
|
| This isn't a hypothetical; quoting
| https://en.wikipedia.org/wiki/X86_Bit_manipulation_instructi... :
|
| > AMD processors before Zen 3[11] that implement PDEP and PEXT do
| so in microcode, with a latency of 18 cycles rather than a single
| cycle. As a result it is often faster to use other instructions
| on these processors.
|
| There's no feature flag for "technically supported, but slow,
| don't use it"; you have to check the CPU model for that.
|
| All that said, the _right_ fix here would have been to release
| this as Open Source, and then people could contribute
| optimizations for many different processors. But that would have
| required a decision to rely on winning in hardware quality,
| rather than sometimes squeezing out a "win" via software even in
| generations where the hardware quality isn't as good as the
| competition.
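|
| The two-stage check described above (feature bit for correctness,
| CPU model for speed) could be sketched as follows. This is a minimal
| illustration in Python, not anything Intel or AMD ships: the function
| name and its inputs are hypothetical, and the use of family 0x19 as
| the cutoff for Zen 3 is an assumption of this sketch rather than
| something stated in the thread.

```python
def should_use_pext(vendor, family, features):
    """Decide whether a PEXT/PDEP code path is worth selecting.

    vendor   -- CPUID vendor string, e.g. "GenuineIntel" or "AuthenticAMD"
    family   -- CPUID display family (assumption: Zen 3 reports 0x19)
    features -- set of lowercase feature names derived from CPUID bits
    """
    # Correctness gate: the instructions must exist at all.
    if "bmi2" not in features:
        return False
    # Performance gate: no feature bit says "supported but slow",
    # so this part has to look at the vendor and model.
    if vendor == "AuthenticAMD" and family < 0x19:
        return False
    return True
```

| Note that the vendor check here only ever demotes a path that is
| known to be slow; it never hides a feature the CPU reports.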
| jdsully wrote:
| There are feature flags for formerly slow instructions that are
| now fast. E.g. rep mov
|
| https://www.phoronix.com/scan.php?page=news_item&px=Intel-5....
| [deleted]
| Certified wrote:
| Last time this came up on Hacker News I discovered SolidWorks
| 2021 was using an older MKL library that supports the
| MKL_DEBUG_CPU_TYPE=5 environment variable. I'm on an AMD CPU and
| measured a small improvement in SolidWorks FPS and rebuild time
| with the flag enabled.
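|
| For anyone wanting to try this, a minimal sketch of launching a
| program with the variable set, written in Python. The helper names
| are hypothetical. The variable has to be in the environment before
| the process loads MKL, which is why this starts a child process, and
| per the replies below it only has an effect on older MKL versions.

```python
import os
import subprocess
import sys


def env_with_mkl_override(base_env=None):
    """Return a copy of the environment with MKL_DEBUG_CPU_TYPE=5 set."""
    env = dict(os.environ if base_env is None else base_env)
    env["MKL_DEBUG_CPU_TYPE"] = "5"
    return env


def run_with_mkl_override(argv):
    """Launch argv as a child process that inherits the override."""
    return subprocess.run(argv, env=env_with_mkl_override(),
                          capture_output=True, text=True, check=True)


if __name__ == "__main__":
    # Demo: the child process simply echoes the variable back.
    child = [sys.executable, "-c",
             "import os; print(os.environ['MKL_DEBUG_CPU_TYPE'])"]
    print(run_with_mkl_override(child).stdout.strip())
```

| Whether it helps depends entirely on which MKL build the target
| program actually loads.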
| eatonphil wrote:
| The first comment seems to suggest that flag no longer works.
|
| https://www.agner.org/forum/viewtopic.php?t=6#p82
| Certified wrote:
| Multiple versions of the MKL DLLs exist in the install directory
| of SolidWorks 2021. Indeed, the DLLs supporting FloXpress and
| simulation seem to be the updated MKL version that no longer
| supports the flag. However, the main executable only seems to
| call sldmkl_parts.dll, which appears to be MKL version
| 2018.1.156, and that version does support the flag.
| bee_rider wrote:
| It would depend on the version of MKL. If Solidworks has
| (just for example) statically linked to or bundled in an old
| version of MKL, then it should work there, still.
| dang wrote:
| Related:
|
| _Intel's "cripple AMD" function (2019)_ -
| https://news.ycombinator.com/item?id=24307596 - Aug 2020 (104
| comments)
|
| _Intel's "Cripple AMD" Function_ -
| https://news.ycombinator.com/item?id=21709884 - Dec 2019 (10
| comments)
|
| _Intel's "cripple AMD" function (2009)_ -
| https://news.ycombinator.com/item?id=7091064 - Jan 2014 (124
| comments)
|
| _Intel's "cripple AMD" function_ -
| https://news.ycombinator.com/item?id=1028795 - Jan 2010 (80
| comments)
| midjji wrote:
| So blacklist the Intel compiler in favor of GCC and Clang. Seems
| entirely reasonable!
| phkahler wrote:
| Worth noting that Intel has dropped their "old" compiler and the
| newer "Intel" compilers are LLVM based. IMHO they will likely be
| pulling similar anti-AMD tricks with it and they are keeping
| their paid version closed source - which is allowed by LLVM's
| license.
|
| RMS was right that compilers should be GPL licensed to prevent
| exactly this kind of thing (and worse things which haven't
| happened yet).
|
| On another compiler related note, I find it insane that GCC had
| not turned on vectorization at optimization -O2 for the x68-64
| targets. The baseline for that arch has SSE2, so vectorization
| has always made sense there. The upcoming GCC 12 will have it
| enabled at -O2. I'd bet the Intel compiler always did
| vectorization at -O2 for their 64bit builds.
| pcwalton wrote:
| > RMS was right that compilers should be GPL licensed to
| prevent exactly this kind of thing (and worse things which
| haven't happened yet).
|
| The problem with this is that it wouldn't solve the problem in
| question: Intel would just have stuck with their old compiler
| backend instead of LLVM.
|
| Besides, LLVM wouldn't have gotten investment to begin with if
| it were GPL licensed, since the entire reason for Apple's
| investment in LLVM is that it wasn't GPL. Ultimately, LLVM
| itself is a counterexample to RMS's theory that keeping
| compilers GPL can force organizations to do things: given deep
| enough pockets, a company can overcome that by developing non-
| GPL competitors.
| iib wrote:
| From what I saw of him, he never said that it is impossible
| to build non-GPL compilers, he said that the work free
| software developers do should not "help" proprietary
| software.
|
| So yes, he basically said that if you want to develop a
| proprietary compiler, it should cost you, and not take GCC as
| a base to freeload. Intel basing their new compilers on LLVM
| clearly saved them effort.
| bayindirh wrote:
| While clang is getting this support because it's not GPL,
| it's also providing a well deserved competition for GCC, and
| clang's presence woke the GCC devs to build a better
| compiler.
|
| All in all I avoid non-GPL compilers for my code, but I'm
| happy that clang acted as a big (hard) foam cluebat for GCC.
|
| In my opinion, we need a well polished GNU/GPL toolchain both
| to show it's possible, and provide a good benchmark to
| compete with. This competition is what drives us forward.
| Beltalowda wrote:
| > Besides, LLVM wouldn't have gotten investment to begin with
| if it were GPL licensed. The entire reason for Apple's
| investment in LLVM in the first place is that it wasn't GPL.
|
| I don't think that's the case; Apple/LLVM actually offered to
| sign over the copyright to the FSF, under the GPL; from
| https://gcc.gnu.org/legacy-ml/gcc/2005-11/msg00888.html
|
| > The patch I'm working on is GPL licensed and copyright will
| be assigned to the FSF under the standard Apple copyright
| assignment. Initially, I intend to link the LLVM libraries in
| from the existing LLVM distribution, mainly to simplify my
| work. This code is licensed under a BSD-like license [8], and
| LLVM itself will not initially be assigned to the FSF. If
| people are seriously in favor of LLVM being a long-term part
| of GCC, I personally believe that the LLVM community would
| agree to assign the copyright of LLVM itself to the FSF and
| we can work through these details.
|
| The reason people worked on LLVM/clang is that GCC was (and
| to some degree, is) not very good in various areas, and had a
| difficult community making fixing those issues hard. There's
| a reason a lot of these newer languages like Swift, Rust, and
| Zig are based on LLVM and not GCC. See e.g.
| https://undeadly.org/cgi?action=article&sid=20070915195203#p...
| for a run-down
| (from 2007, I'm not sure how many of these issues persist
| today; gcc has not stood still either of course, error
| messages are much better than they were in 2007 for example).
|
| GPL3 changed things a bit; I'm not sure Lattner would have
| made the same offer with GPL3 around, but that was from 2005
| when GPL3 didn't exist yet. But the idea that LLVM was
| _primarily_ motivated by license issues doesn't seem to be
| the case, although it was probably seen as an additional
| benefit.
| mistrial9 wrote:
| > x68-64 targets
|
| That's a typo. Are you showing a case of AVX instructions not being
| generated by GCC? Where are the details here? Is SSE2 from
| twenty years ago?
|
| https://en.wikipedia.org/wiki/SSE2
| Kon-Peki wrote:
| This has been discussed on HN before.
|
| I don't condone Intel behavior, but let's be honest here: AMD
| underinvests in software and expects others to pick up the slack.
| That isn't acceptable.
| hpcjoe wrote:
| AMD[1], NVidia[2] do "make" their own compilers. AMD is
| notorious for a "build it and they will come" mentality,
| despite the fact that this hasn't worked. AMD needs to make it
| easy to adopt their hardware, and the way this is done is with
| software.
|
| When they finally get to the point that their drivers/libs are
| as easy to install as Nvidia's, it might be too late. I've
| argued this with AMD folks before.
|
| The barriers to adoption need to be low. Friction needs to be
| low. They need to target ubiquity[3].
|
| [1] https://developer.amd.com/amd-aocc/
|
| [2] https://developer.nvidia.com/nvidia-hpc-sdk-downloads
|
| [3] https://blog.scalability.org/2008/02/target-ubiquity-a-
| busin...
| Cloudef wrote:
| AMD is the best on Linux right now, but that's mostly thanks
| to them opening up their hardware to driver developers.
| hpcjoe wrote:
| I was getting better performance out of the NVidia HPC SDK
| compilers, but then again, the old PGI compilers it is
| based upon (with an LLVM backend now), have always been my
| go-to for higher performance code.
|
| I've got some Epycs and Zen2s at home here, and I have both
| compilers. Haven't done testing in recent months, but
| they've been updating them, so maybe I should look into
| that again. Thanks for the nudge!
| ReleaseCandidat wrote:
| > NVidia[2] do "make" their own compilers.
|
| Actually, Nvidia bought the Portland Group (PGI) compilers. And
| Intel's Fortran compiler is (or rather was, now that its backend is
| LLVM) MS's compiler by way of DEC, Compaq, and HP: MS Visual
| Fortran 4 -> DEC Visual Fortran 5 -> Compaq Visual Fortran 6 ->
| Intel Visual Fortran ;).
| hpcjoe wrote:
| I know about the PGI purchase ... was unaware of the Intel
| link to MSFT. Huh.
| salawat wrote:
| You call Nvidia driver installation easy? Every bit of "ease"
| about that is hardly Nvidia's doing.
| hpcjoe wrote:
| I'm not sure of what issue you have with my statement. For
| me, it is a painless download + sh NVIDIA-....run. I have
| mostly newer GPUs, though the 3 systems (1 laptop and 2
| desktops) with older GTX 750ti and GT 560m run the nouveau
| driver (as Nvidia dropped support for those).
|
| It's a 13-year-old laptop, and still running strong (on Linux,
| though). The desktops are Sandy Bridge based. The RTX2060 and
| RTX3060 are doing fine with the current drivers. I usually
| only update when CUDA changes.
|
| But yeah, it's pretty simple. I can't speak to non-Linux
| OSes generally, though my experiences with Windows driver
| updates have always been fraught with danger.
|
| My Zen 2 laptop has a built-in Renoir iGPU, and I use it
| alongside the NVidia dGPU that is also built in (GTX 1660ti). I
| leverage Linux Mint's packaging system there for the GPU
| switcher. I run the AMD on the laptop panel and the NVidia
| on the external display. Outside of weirdness with kernel
| 5.13, I've not had any problems with this setup.
| MereInterest wrote:
| There's a variety of options that are available here, and I
| don't buy the argument that AMD's behavior is automatically
| unethical.
|
| A. Company makes and sells hardware, and offers no software.
|
| B. Company makes and sells uniquely featured hardware, and
| offers software that uses those unique features.
|
| C. Company makes and sells hardware that adheres to an industry
| standard, and offers software that targets hardware adhering to
| that standard.
|
| D. Company makes and sells hardware that adheres to an industry
| standard, then uses their position in related markets to give
| themselves an unfair advantage in the hardware market.
|
| Of these, options A, B, and C are all acceptable options. AMD
| has traditionally chosen option A, which is a perfectly
| reasonable option. There's no reason that a company is
| obligated to participate in a complementary market. Option D is
| the only clearly unethical option.
| ReleaseCandidat wrote:
| > AMD has traditionally chosen option A, which is a perfectly
| reasonable option
|
| AMD has optimized libraries https://developer.amd.com/amd-
| aocl/ and their own compilers: https://developer.amd.com/amd-
| aocc/
| ncmncm wrote:
| Intel's legitimate course is to make their CPUs run _actually
| faster_ than the competition, instead of tricking people into
| running slower code on the competition.
| bri3d wrote:
| I think it's more nuanced than that:
|
| In the past, AMD just straight up had horrible software.
|
| More recently, AMD have been investing more in open software,
| probably with the goal that a community forms and they
| get "leverage" / ROI for their investment.
|
| On the flip side, Intel invest heavily in high-quality but
| jealously guarded and closed source software.
|
| With this nuance, I'm not so sure it's clear cut which one is
| "acceptable," and it's an interesting ethical question about
| Open Source and open-ness in general.
| midjji wrote:
| AMD still has horrible software; compare CUDA to whatever
| crap AMD thinks you should use. Truth is, it's even hard to say
| what their alternative is, not to mention how poorly
| they support what is, or at least should be, their second if
| not most important/lucrative target.
| post_break wrote:
| And Intel has sandbagged us with 4 cpu cores for ages, leading
| to software that isn't being optimized for more cores. Suddenly
| AMD starts pushing many cores with high single core performance
| and Intel magically turns hyperthreading on for lower tier cpus
| and starts putting out way more cores.
| CalChris wrote:
| AMD pays substantial royalties to Intel for x86.
|
| https://jolt.law.harvard.edu/digest/intel-and-the-x86-archit...
|
| However, this will become moot as even Intel is shifting
| towards LLVM.
|
| https://www.intel.com/content/www/us/en/developer/articles/t...
| stavros wrote:
| What should they have done instead? Built a compiler with a
| "cripple Intel" function? So people would have to download the
| executable that's fastest on their CPU, even though they use
| the same instruction set?
|
| The issue here is that they used a slower code path even on
| CPUs that could run the faster one, just because they were made
| by a competitor.
|
| You say "AMD should have made their own compiler", but why?
| What else should they have made? An OS? An office suite? Why?
| benbenolson wrote:
| Very likely, this was not done intentionally.
|
| I think we can simply imagine a common scenario: some
| employee working for Company X, developing a compiler suite,
| and adding necessary optimizations for Company X's
| processors. Meanwhile, Company Y's processors don't get as
| much focus (perhaps due to the employee not knowing about
| Company Y's CPUIDs, supported optimizations for different
| models, etc.). Thus, Company Y's processors don't run as
| quickly with this particular library.
|
| Why does this have to be malicious intent? Surely it's not
| surprising to you that Company X's software executes quicker
| on Company X's processors: I should hope that it does! The
| same would hold true if Company Y were to develop a compiler;
| unique features of their processors (and perhaps not Company
| X's) should be used to their fullest extent.
| magila wrote:
| No, this was definitely intentional. Intel is doing extra
| work to gate features on the manufacturer ID when there are
| feature bits which exist specifically to signal support for
| those features (and these bits were defined by Intel
| themselves!).
|
| If they had fixed the issue shortly after it was publicly
| disclosed it might have been unintentional, but this issue
| has been notorious for over a decade and they still refuse
| to remove the unnecessary checks. They know what they're
| doing.
| DiabloD3 wrote:
| That's not how these CPUs work.
|
| The CPUID instruction allows software to query the CPU about
| whether an instruction set is supported. Code emitted by Intel's
| compiler would only check which instruction sets exist if the CPU
| is from Intel, instead of always checking.
|
| AMD can choose to implement (or not) any instruction set
| that Intel specifies, and Intel can choose to implement (or
| not) any instruction set AMD specifies. However, it would
| in 100% of cases be wrong to check who made the CPU instead
| of checking the implemented instruction set. AMD implements
| MMX, SSE1-4, AVX1 and 2. Any software compatible with these
| _must_ work on AMD CPUs that also implement these
| instructions.
|
| If AMD ever chooses to sue Intel over this (likely as a
| Sherman Act violation, same as the 2005 case), a court
| would likely side with AMD due to the aforementioned
| previous case: Intel has an established history of
| violating the law to further its own business interests.
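|
| The difference between the two dispatch styles discussed above can
| be modeled in a few lines. This is only a sketch in Python with
| hypothetical path names and function names; it is not Intel's actual
| dispatcher, just the general shape of "check the vendor first"
| versus "check the feature bits".

```python
# Code paths ordered from fastest to slowest, each with the features it needs.
CODE_PATHS = [
    ("avx2", {"avx2"}),
    ("avx", {"avx"}),
    ("sse2", {"sse2"}),
    ("generic", set()),
]


def dispatch_by_features(features):
    """Pick the fastest path whose required features are all present."""
    for name, required in CODE_PATHS:
        if required <= features:
            return name
    return "generic"


def dispatch_vendor_gated(vendor, features):
    """The criticized pattern: feature bits only count on one vendor."""
    if vendor != "GenuineIntel":
        return "generic"
    return dispatch_by_features(features)
```

| With the same reported features, the vendor-gated version sends an
| "AuthenticAMD" part to the baseline path while the feature-based
| version picks the AVX2 path.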
| mtklein wrote:
| I'm with you generally, but having written some code
| targeting these instructions from a disinterested third-
| party perspective, there are big enough differences in
| some instructions in performance or even behavior that
| can sincerely drive you to inspect the particular CPU
| model and not just the cpuid bits offered.
|
| Off the top of my head, SSSE3 has a very flexible
| instruction to permute the 16 bytes of one xmm register
| at byte granularity using each byte of another xmm
| register to control the permutation. On many chips this
| is extremely cheap (eg 1 cycle) and its flexibility
| suggests certain algorithms that completely tank
| performance on other machines, eg old mobile x86 chips
| where it runs in microcode and takes dozens or maybe even
| hundreds of cycles to retire. There the best solution is
| to use a sequence of instructions instead of that single
| permute instruction, often only two or three depending on
| what you're up to. And you could certainly just use that
| replacement sequence everywhere, but if you want the best
| performance _everywhere_, you need to not only look for
| that SSSE3 bit but also somehow decide if that permute is
| fast so you can use it when it is.
|
| Much more seriously, Intel and AMD's instructions
| sometimes behave differently, within specification. The
| approximate reciprocal and reciprocal square root
| instructions are specified loosely enough that they can
| deliver significantly different results, to the point
| where an algorithm tuned on Intel to function perfectly
| might have some intermediate value from one of these
| approximate instructions end up with a slightly different
| value on AMD, and before you know it you end up with a
| number slightly less than zero where you expect zero, a
| NaN, square root of a negative number, etc. And this sort
| of slight variation can easily lead to a user-visible
| bug, a crash, or even an exploitable bug, like a buffer
| under/overflow. Even exhaustively tested code can fail if
| it runs on a chip that's not what you exhaustively tested
| on. Again, you might just decide to not use these
| loosely-specified instructions (which I entirely support)
| but if you're shooting for the absolute maximum
| performance, you'll find yourself tuning the constants of
| your algorithms up or down a few ulps depending on the
| particular CPU manufacturer or model.
|
| I've even discovered problems when using the high-level C
| intrinsics that correspond to these instructions across
| CPUs from the same manufacturer (Intel). AVX512 provided
| new versions of these approximations with increased
| precision, the instruction variants with a "14" in their
| mnemonic. If using intrinsics, instruction selection is
| up to your compiler, and you might find compiling a piece
| of code targeting AVX2 picks the old low precision
| version, while the compiler helpfully picks the new
| increased-precision instructions when targeting AVX-512.
| This leads to the same sorts of problems described in the
| previous paragraph.
|
| I really wish you could just read cpuid, and for the most
| part you're right that it's the best practice, but for
| absolutely maximum performance from this sort of code,
| sometimes you need more information, both for speed and
| safety. I know this was long-winded, and again, I
| entirely understand your argument and almost totally
| agree, but it's not 100%, more like 100-epsilon%, where
| that epsilon itself is sadly manufacturer-dependent.
|
| (I have never worked for Intel or AMD. I have been both
| delighted and disappointed by chips from both of them.)
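|
| The failure mode with loosely specified approximations can be shown
| with a toy model. The sketch below is Python, not SIMD code: the
| function names are hypothetical, and the error bound of 1.5 * 2^-12
| for the legacy approximate reciprocal is an assumption of this
| sketch. Two "chips" that both stay within that bound but err in
| opposite directions produce a residual on either side of zero. Real
| hardware would hand back a NaN from the square root; Python's
| math.sqrt raises ValueError instead.

```python
import math

# Assumed relative error bound for the legacy approximate reciprocal.
MAX_REL_ERR = 1.5 * 2.0 ** -12


def approx_recip(x, rel_err):
    """Model an approximate reciprocal whose result is off by rel_err."""
    assert abs(rel_err) <= MAX_REL_ERR
    return (1.0 / x) * (1.0 + rel_err)


def residual_sqrt(x, rel_err):
    """sqrt(1 - x * approx(1/x)): mathematically sqrt(0), fragile in practice."""
    return math.sqrt(1.0 - x * approx_recip(x, rel_err))


def residual_sqrt_clamped(x, rel_err):
    """Same computation, clamped so a slightly negative residual becomes 0."""
    return math.sqrt(max(0.0, 1.0 - x * approx_recip(x, rel_err)))
```

| Calling residual_sqrt(3.0, -2.0 ** -12) works, while
| residual_sqrt(3.0, 2.0 ** -12) fails, even though both inputs are
| "within spec". Clamping is one defensive option; it trades a little
| accuracy for not depending on the sign of the error.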
| djmips wrote:
| I don't think you read the article. Go read it first before
| you make your hypothesis. If it was as easy to fix as using
| an environment variable (which no longer works) then it was
| done intentionally.
| bally0241 wrote:
| I don't think the fact that it can be enabled/disabled by an
| environment variable indicates malicious intent. It
| could be as simple as Intel not caring to test
| their compiler optimizations on competitors' CPUs. If I
| had to distribute two types of binaries (one which was
| optimized but could break, vs. one unoptimized and unlikely
| to break), I would default to distributing the
| unoptimized version. Slow is better than broken.
|
| I understand some end users may not be able to re-compile
| the application for their machines, but I wouldn't say
| it's Intel's fault, but rather that of the distributors of that
| particular application. For example, if AMD users want
| SolidWorks to run faster on their system, they should ask
| Dassault Systemes for AMD-optimized binaries, not the
| upstream compiler developers!
|
| Anyways, for those compiling their own code, why would
| anyone expect an Intel compiler to produce equally
| optimized code for an AMD cpu? Just use gcc/clang or
| whatever AMD recommends.
| brasic wrote:
| https://news.ycombinator.com/newsguidelines.html
|
| > Please don't comment on whether someone read an
| article.
| colejohnson66 wrote:
| The thing is: the bits to check for SSE, SSE2, ..., AVX,
| AVX2, AVX-512? They're in the same spot on Intel and AMD
| CPUs. So you don't need to switch based on manufacturer.
| The fact that they force a `GenuineIntel` check makes it
| seem malicious to many.
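|
| To make the "same spot" point concrete, here is a small Python
| sketch that decodes feature names from raw CPUID register values.
| The function name and the register labels are hypothetical; the bit
| positions are the ones documented for CPUID leaf 1 and leaf 7
| (subleaf 0). Nothing in the decoding depends on the vendor string. A
| real check for AVX and later would also need to confirm that the OS
| saves the wider register state, which this sketch leaves out.

```python
# (register, bit) positions in CPUID leaf 1 and leaf 7 (subleaf 0).
FEATURE_BITS = {
    "sse":     ("leaf1_edx", 25),
    "sse2":    ("leaf1_edx", 26),
    "sse3":    ("leaf1_ecx", 0),
    "ssse3":   ("leaf1_ecx", 9),
    "sse4.1":  ("leaf1_ecx", 19),
    "sse4.2":  ("leaf1_ecx", 20),
    "avx":     ("leaf1_ecx", 28),
    "avx2":    ("leaf7_ebx", 5),
    "avx512f": ("leaf7_ebx", 16),
}


def decode_features(leaf1_ecx, leaf1_edx, leaf7_ebx):
    """Return the set of feature names whose bits are set."""
    regs = {
        "leaf1_ecx": leaf1_ecx,
        "leaf1_edx": leaf1_edx,
        "leaf7_ebx": leaf7_ebx,
    }
    return {name for name, (reg, bit) in FEATURE_BITS.items()
            if (regs[reg] >> bit) & 1}
```

| The register values would come from executing CPUID in native code;
| here they are plain integers so the decoding can be tested anywhere.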
| vlovich123 wrote:
| All browsers pretend to be MSIE (and all compilers
| pretend to be GCC). You'd think AMD would make it trivial
| to change the vendor ID string to GenuineIntel for
| "compatibility".
| not2b wrote:
| AMD should concentrate on making LLVM and GCC work great on
| AMD processors, by contributing the needed code. They are
| already making some contributions but could be doing more,
| and they could be funding experts to work on that and giving
| those experts the information they need.
| ReleaseCandidat wrote:
| They do. Actually their own (LLVM based) compilers are
| about as fast as GCC and LLVM
|
| https://www.phoronix.com/scan.php?page=article&item=aocc32-
| c...
| Jap2-0 wrote:
| I don't know if it necessarily says much that their LLVM-
| based compiler is about as fast as LLVM.
| DiabloD3 wrote:
| But they already do this. AMD is one of the largest
| corporate contributors to LLVM and GCC.
|
| It's Intel that tends to phone this in and make everyone
| else pick up the slack.
| jcranmer wrote:
| Per
| https://www.phoronix.com/scan.php?page=news_item&px=LLVM-
| Rec..., Intel actually contributes (slightly) more to
| LLVM than AMD does.
| pedrocr wrote:
| To fix this problem AMD would have to work on making LLVM
| and GCC work great on _Intel_ processors. That would be the
| only way to stop people from using the Intel compiler for
| extra performance and ending up with binaries that are
| crippled for AMD. Clearly that's not a solution for this
| problem.
| mhh__ wrote:
| AMD's software offerings (e.g. look at uProf vs vTune) are
| functional at best. Intel's are much easier to use, have a
| lot more documentation, and actually make your life easier
| versus having basically just a firehose of data.
| amelius wrote:
| I think it's great if a hardware company leaves the software
| for others. This leads to open specifications.
| ethbr0 wrote:
| At the firmware / driver level, fully open specifications for
| high performance hardware are an impossible dream.
|
| At best, detailed documentation is a lower priority item
| below "make it work" and "increase performance".
|
| At worst, it requires exposing trade secrets.
|
| _Edit_ : It'd probably be more productive for everyone if we
| set incentives and work such that the goal we want (compilers
| that produce code that runs optimally on Intel, AMD, and
| other architectures) isn't contingent on Intel writing them
| for non-Intel architectures. (Said somewhat curmudgeonly,
| because everyone complains about things like this, but also
| doesn't realize how insanely hard and frustratingly edge-case-ridden
| compiler work is)
| sedatk wrote:
| No, just don't falsely market your product as fair or
| neutral.
| dodobirdlord wrote:
| It's the Intel MKL, I don't think Intel has ever even
| endorsed using it on other vendors' CPUs, much less claimed
| that it is "fair" or "neutral".
| ReleaseCandidat wrote:
| Well:
|
| > On November 12, 2009 AMD and Intel Corporation announced a
| comprehensive settlement agreement to end all outstanding legal
| disputes between the companies, including antitrust and patent
| cross license disputes. In addition to a payment of $1.25B that
| Intel made to AMD, Intel agreed to abide by an important set of
| ground rules that continue in effect until November 11, 2019.
|
| > Customers and Partners
|
| > With respect to customers and partners, Intel must not:* [...]
|
| > Intentionally include design/engineering elements in its
| products that artificially impair the performance of any AMD
| microprocessor.
|
| https://www.amd.com/en/corporate/antitrust-ruling
|
| I like that 'in effect until November 11, 2019.' part :D
| mhh__ wrote:
| If Intel did that there probably wouldn't be a software suite
| at all for their processors.
|
| Compared to vTune, just about all open source profilers are
| either a bad joke or like programming in Basic in a C++ age.
| IntelThrowaway1 wrote:
| The thing that gets me about Intel's culture, as someone who
| worked there, was that Intel as an organisation was completely
| unable to actually accept they'd done anything wrong. Ever.
|
| There are lots of cases where Intel has either screwed up or
| done things that were unarguably anti-competitive. It happens
| at every company. I don't like Uber, but I'm not going to blame
| Uber today for the fuckery that Kalanick got up to.
|
| In each case you could ask Intel HR or Intel senior
| management what they thought about it, and it was never Intel's
| fault. The answers to any questions about this sort of stuff
| would be full of pettifogging, passive voice, and legalese.
| The result was that the internal culture was an extremely low-trust
| environment, since you _knew_ people were willing to be
| transparently intellectually dishonest to further their
| careers. I haven't been there since Gelsinger arrived, but I
| hope that changes. I wonder how much it can change in the legal
| environment we're in.
| kmeisthax wrote:
| I don't think this is dishonesty - it's auteur mentality. In
| Intel's view, AMD was a second-source vendor that went rogue,
| and gets to free-ride on their patents because Intel couldn't
| be arsed to extend x86 to 64-bit. If they had their way,
| they'd own the x86 ISA interface and all their competition
| would be incompatible architectures that you have to
| recompile for. Crippling AMD processors with their C compiler
| wasn't dishonest, it was DRM to protect their """intellectual
| property"""[0].
|
| Gelsinger was the head designer on the 486, so he was around
| during the time when Intel was obsessed with keeping
| competition out of their ISA and probably has a case of
| auteur mentality, too.
|
| [0] In case you couldn't tell, I _really hate_ this word. The
| underlying concepts are, at best, necessary evils.
| holdenk wrote:
| Huh, I had wondered why I saw so many Python packages blacklist
| MKL. Now I know why.
| dbcurtis wrote:
| The philosophy behind MKL is that each CPU vendor provides an
| MKL for their CPU. If you expect to mix and match MKLs and
| CPUs, you don't understand the goals of MKL.
| monocasa wrote:
| Are there any implementations of MKL other than Intel's?
| ReleaseCandidat wrote:
| No. There are AMD's AOCL and Apple's 'Accelerate', but they
| cover only subsets of MKL, AFAIK.
|
| https://developer.amd.com/amd-aocl/
| https://developer.apple.com/documentation/accelerate
| stephencanon wrote:
| Accelerate and MKL have some overlap (notably BLAS,
| LAPACK, signal processing libraries and basic vectorized
| math operations), but each also contains a whole bunch of
| API that the other lacks. Neither is a subset of the
| other.
|
| They both contain a sparse matrix library, but exactly
| what operations are offered is somewhat different between
| the two. They both have image processing operations, but
| fairly different ones. Accelerate has BNNS, MKL has its
| own set of deep learning interfaces...
| wyldfire wrote:
| Each CPU vendor or each CPU architecture? (genuinely asking,
| I don't know how it's intended)
| wmf wrote:
| Each vendor. Intel BLAS (MKL) has Intel-specific
| optimizations and AMD BLAS has AMD-specific optimizations.
|
| Intel is still acting in bad faith by allowing MKL to run
| in crippled mode on AMD. They should either let it use all
| available instructions or make it refuse to run.
| danieldk wrote:
| The latest oneMKL versions have sgemm/dgemm kernels for
| Zen CPUs that are almost as fast as the AVX2 kernels
| (that require disabling Intel CPU detection on Zen).
| bee_rider wrote:
| The expectation in the HPC community is that an interested
| vendor will provide their own BLAS/LAPACK implementation
| (MKL is a BLAS/LAPACK implementation, along with a bunch of
| other stuff), which is well-tuned for their hardware. These
| sort of libraries aren't just tuned for an architecture,
| they might be tuned for a given generation or even
| particular SKUs.
| hallway_monitor wrote:
| I learned about this recently when trying to optimize ML
| test architecture running on Azure. It turns out having
| access to Ice Lake chips would allow optimizations that
| should decrease compute time and therefore cost by
| 20-30%.
| bee_rider wrote:
| Some AVX-512 stuff I guess?
|
| AVX-512 had a rough rollout, but it seems like it is
| finally turning into something nice.
| ReleaseCandidat wrote:
| That would be 'each CPU vendor provides an optimized BLAS
| library for their CPU'. The problem is that Intel's MKL is
| more than just BLAS.
|
| But AMD does have its own optimized libraries:
|
| https://developer.amd.com/amd-aocl/
| marginalia_nu wrote:
| I'm getting flashbacks to the AARD code and Microsoft's attempts
| to sabotage DR-DOS.
___________________________________________________________________
(page generated 2022-04-06 23:01 UTC)