https://www.tomshardware.com/tech-industry/artificial-intelligence/ai-engineers-build-new-algorithm-for-ai-processing-replace-complex-floating-point-multiplication-with-integer-addition

AI engineers claim new algorithm reduces AI power consumption by 95% -- replaces complex floating-point multiplication with integer addition

By Jowi Morales, published 17 October 2024

Addition is simpler than multiplication, after all.

Engineers from BitEnergy AI, a firm specializing in AI inference technology, have developed a method of artificial intelligence processing that replaces floating-point multiplication (FPM) with integer addition. The new method, called Linear-Complexity Multiplication (L-Mul), comes close to the results of FPM using a far simpler algorithm, yet still maintains the high accuracy and precision FPM is known for. As TechXplore reports, the method could cut the power consumption of AI systems by up to 95%, which would make it a crucial development for our AI future.

Because this is a new process, popular and readily available hardware on the market, such as Nvidia's upcoming Blackwell GPUs, isn't designed to handle this algorithm. So even if BitEnergy AI's algorithm is confirmed to perform at the same level as FPM, we would still need systems that can run it. That might give a few AI companies pause, especially ones that have just invested millions, or even billions, of dollars in AI hardware. Nevertheless, a 95% reduction in power consumption would likely tempt even the biggest tech companies to jump ship, especially if AI chip makers build application-specific integrated circuits (ASICs) that take advantage of the algorithm.

Power is now the primary constraint on AI development: the data center GPUs sold last year alone consume more power in a year than one million homes. Even Google has put its climate targets in the back seat because of AI's power demands, with its greenhouse gas emissions rising 48% from 2019 instead of declining year-on-year as planned. The company's former CEO has even suggested dropping climate goals, opening the floodgates for power production, and using more advanced AI to solve the global warming problem. But if AI processing can be made far more power efficient, we may still get advanced AI technologies without sacrificing the planet. Beyond that, a 95% drop in energy use would also lighten the burden these massive data centers place on the national grid, reducing the pressure to rush new power plants into service.
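To make the core trick concrete, here is a minimal Python sketch of L-Mul as described in the team's arXiv paper (2410.00907, linked in the comments below). An exact floating-point multiply of x = (1 + x_m) * 2^(x_e) and y = (1 + y_m) * 2^(y_e) yields (1 + x_m + y_m + x_m * y_m) * 2^(x_e + y_e); L-Mul replaces the mantissa product x_m * y_m with a constant offset 2^(-l(m)), leaving only additions. The function name, the float-level emulation, and the minimal handling of signs and special values are illustrative assumptions, not BitEnergy AI's implementation, which would operate directly on integer bit patterns in hardware.

```python
import math

def l_mul(x: float, y: float, mantissa_bits: int = 3) -> float:
    """Approximate x * y per the L-Mul scheme sketched in arXiv:2410.00907:
    quantize the mantissas, then replace the mantissa product with a
    constant offset 2**-l(m), so no mantissa multiplication occurs.
    Float-level emulation for illustration; zeros handled, inf/NaN not."""
    if x == 0.0 or y == 0.0:
        return 0.0
    sign = math.copysign(1.0, x) * math.copysign(1.0, y)

    # Decompose |v| = (1 + m) * 2**e with m in [0, 1).
    # math.frexp returns |v| = f * 2**e with f in [0.5, 1).
    xf, xe = math.frexp(abs(x))
    xm, xe = xf * 2.0 - 1.0, xe - 1
    yf, ye = math.frexp(abs(y))
    ym, ye = yf * 2.0 - 1.0, ye - 1

    # Quantize both mantissas to the chosen bit width.
    scale = 2.0 ** mantissa_bits
    xm = math.floor(xm * scale) / scale
    ym = math.floor(ym * scale) / scale

    # Offset l(m) from the paper: m if m <= 3, 3 if m == 4, 4 if m > 4.
    m = mantissa_bits
    l = m if m <= 3 else (3 if m == 4 else 4)

    # Exact product: (1 + xm + ym + xm*ym) * 2**(xe + ye).
    # L-Mul swaps the xm*ym term for the constant 2**-l -- additions only.
    return sign * (1.0 + xm + ym + 2.0 ** -l) * 2.0 ** (xe + ye)

# The approximation tracks the true product closely:
print(l_mul(1.75, 2.5), 1.75 * 2.5)  # 4.25 vs 4.375
```

In hardware, the mantissa and exponent sums above would become integer additions over the operands' bit patterns, which is where the claimed energy saving comes from: an integer add costs a small fraction of the energy of a full floating-point multiply.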
While most of us are amazed by the extra power each new generation of AI chips brings, true advancement comes only when those processors grow more powerful and more efficient at the same time. So, if L-Mul works as advertised, humanity could have its AI cake and eat it, too.

Comments (10)

* yahrightthere: Seeing is believing -- where's the white paper on this? I could not find it. As for the load on the grid, there have been many reports of data centers inking deals to bring various nuclear sites back up and running, as well as adding new nuclear sites, including small modular reactors, which would add to the grid's infrastructure and reduce the load. It's understood that all this will take time, money, and effort from all facets to accomplish.

* ekio: If that can apply to ClosedAI, Meta, Google, and co., it would be a game changer -- but without proof, no beliefs.

* nitrium: "Potentially up to 95%" -- that's corporate speak for anywhere from 0% to 95%. The "up to" number is not something anyone cares about. What's the average saving for typical AI workloads?

* Mama Changa: They don't say at what level of precision. Is it fp4, fp16, etc.? Also, have they never heard of fixed-point math?

* Li Ken-un: "Work smarter, not harder." The operating cost of feeding the power-hungry algorithms should convince them, if the 95% reduction is true. What's the relatively fixed cost of investing in the hardware and nuclear power plants compared to the ongoing cost of feeding the less efficient algorithms?

* JTWrenn: Not sure if this is promising or just a flare fired in the hope of capital investment. The hedged wording and the apparent lack of a fully working product scream "please invest in us" to me.

* AkroZ: Here is the paper: https://arxiv.org/html/2410.00907v2 -- I have read it, and it's interesting, but it lists only the advantages and not the drawbacks; basically, this is a paper asking for investment. They demonstrate higher precision than FP8 with theoretically less costly operations, but their implementation is FP32, meaning it uses four times more memory, and they do not calculate the potential energy drain of those memory operations. This is not considered for inference but only for the execution of models (as memory is the main limiting factor), notably for AI processing units.
* bit_user:

  AkroZ said: "Here is the paper: https://arxiv.org/html/2410.00907v2"

  Thanks for this! @yahrightthere, take note!

  AkroZ said: "it lists only the advantages and not the drawbacks; basically, this is a paper asking for investment."

  They do list its limitations.

  AkroZ said: "their implementation is FP32, meaning it uses four times more memory, and they do not calculate the potential energy drain of those memory operations."

  They merely prototyped it on existing hardware -- Nvidia GPUs, to be precise -- and Nvidia doesn't support general arithmetic on data types of lower precision than that. From briefly skimming the paper, I think they're actually proposing to implement it at 16 bits, but they also work out the implementation cost at 8 bits.

  AkroZ said: "This is not considered for inference but only for the execution of models"

  "Inference" is the term for what I think you mean by "execution of models". Here's what the abstract says: "We further show that replacing all floating point multiplications with 3-bit mantissa L-Mul in a transformer model achieves equivalent precision as using float8_e4m3 as accumulation precision in both fine-tuning and inference." So they claim it's applicable to both inference and a subset of training work (i.e., fine-tuning).

* bit_user:

  Mama Changa said: "They don't say at what level of precision."

  The paper mostly focuses on comparing it against different fp8 number formats.

  Mama Changa said: "Also, have they never heard of fixed-point math?"

  What good would that do? The expensive part of fp multiplication is the mantissa product, and the mantissa is actually cheaper to multiply than a full fixed-point value, since it has fewer bits.

* ex_bubblehead:

  yahrightthere said: "Seeing is believing -- where's the white paper on this? [...]"

  Billions of $$ and decades to implement. I'm not holding my breath.
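A back-of-the-envelope illustration of bit_user's adder-versus-multiplier point above: an n-bit ripple-carry adder needs on the order of n full-adder cells, while an n-by-n array multiplier needs roughly n*(n-1) of them plus n^2 AND gates. These are generic textbook circuit estimates, not figures from the paper, and the bit widths below are just representative significand sizes.

```python
# Textbook circuit-size estimates (illustrative assumptions, not from
# the L-Mul paper): adders scale linearly in bit width, array
# multipliers roughly quadratically.

def adder_cells(n: int) -> int:
    return n                  # ripple-carry: one full adder per bit

def multiplier_cells(n: int) -> int:
    return n * (n - 1)        # ~n-1 rows of n-bit partial-product adds

for n in (4, 11, 24):         # fp8 e4m3, fp16, fp32 significand widths
    print(f"{n:2d}-bit: adder ~{adder_cells(n):3d} cells, "
          f"multiplier ~{multiplier_cells(n):4d} cells")
```

That quadratic growth in multiplier area (and energy) with mantissa width is what L-Mul aims to sidestep by replacing the mantissa product with additions.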