China planning 1,600-core chips that use an entire wafer -- similar to American company Cerebras 'wafer-scale' designs

News By Anton Shilov published 4 January 2024

Bigger is better.

(Image credit: GlobalWafers)

Scientists from the Institute of Computing Technology at the Chinese Academy of Sciences have introduced an advanced 256-core multi-chiplet processor and plan to scale the design up to 1,600-core chips that use an entire wafer as a single compute device.

It is getting harder to increase transistor density with every new generation of chips, so chipmakers are looking for other ways to boost processor performance, including architectural innovations, larger die sizes, multi-chiplet designs, and even wafer-scale chips. The latter has so far been achieved only by Cerebras, but Chinese developers appear to be heading in the same direction: they have already built a 256-core multi-chiplet design and are exploring ways to go wafer-scale, using an entire wafer to build one large chip.
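The jump from large dies to whole wafers can be made concrete with rough arithmetic. A minimal sketch, assuming a standard 300 mm wafer and the roughly 858 mm² single-exposure reticle limit -- both common industry reference points, not figures from the paper:

```python
# Rough arithmetic: how much silicon a whole wafer offers versus one
# reticle-limited die. The 300 mm wafer and ~858 mm^2 reticle limit are
# common industry reference points, not figures from the CAS paper.
import math

WAFER_DIAMETER_MM = 300
RETICLE_LIMIT_MM2 = 858  # approximate max area of a single lithographic exposure

wafer_area_mm2 = math.pi * (WAFER_DIAMETER_MM / 2) ** 2
dies_per_wafer = int(wafer_area_mm2 // RETICLE_LIMIT_MM2)

print(f"Wafer area: {wafer_area_mm2:,.0f} mm^2")
print(f"Reticle-limited dies per wafer (ideal packing): {dies_per_wafer}")
```

Real wafer-scale parts use somewhat less than the full circle (Cerebras, for instance, stitches a square array of reticle-sized dies), but this order-of-magnitude gap is why a wafer-scale device can host far more cores than any single die.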
Scientists from the Institute of Computing Technology at the Chinese Academy of Sciences introduced an advanced 256-core multi-chiplet compute complex called Zhejiang Big Chip in a recent publication in the journal Fundamental Research, as reported by The Next Platform. The design consists of 16 chiplets containing 16 RISC-V cores each, connected to each other in a conventional symmetric multiprocessor (SMP) manner using a network-on-chip so that the chiplets can share memory. Each chiplet has multiple die-to-die interfaces to connect to neighboring chiplets over a 2.5D interposer, and the CAS researchers say the design is scalable to 100 chiplets, or 1,600 cores.

Zhejiang Big Chip (Image credit: Science Direct)

Zhejiang chiplets are reportedly made on a 22nm-class process technology, presumably by Semiconductor Manufacturing International Corp. (SMIC). We are not sure how much power a 1,600-core assembly interconnected using an interposer and made on a 22nm node would consume. However, as The Next Platform points out, there is nothing stopping CAS from producing a 1,600-core wafer-scale chip, which would greatly improve power consumption and performance thanks to reduced latencies.

The paper explores the limits of lithography and chiplet technology and discusses the potential of this new architecture for future computing needs. Multi-chiplet designs could be used to build processors for exascale supercomputers, the researchers note -- something that AMD and Intel do today.

"For the current and future exascale computing, we predict a hierarchical chiplet architecture as a powerful and flexible solution," the researchers wrote. "The hierarchical-chiplet architecture is designed as many cores and many chiplets with hierarchical interconnect.
Inside the chiplet, cores are communicated using ultra-low-latency interconnect while inter-chiplet are interconnected with low latency beneficial from the advanced packaging technology, such that the on-chiplet latency and the NUMA effect in such high-scalability system can be minimized."

Meanwhile, the CAS researchers propose using a multi-level memory hierarchy for such assemblies, which could complicate the programming of these devices.

"The memory hierarchy contains core memory [caches], on-chiplet memory and off-chiplet memory," the description reads. "The memory from these three levels vary in terms of memory bandwidth, latency, power consumption and cost. In the overview of hierarchical-chiplet architecture, multiple cores are connected through cross switch and they share a cache. This forms a pod structure and the pod is interconnected through the intra-chiplet network. Multiple pods form a chiplet and the chiplet is interconnect through the inter-chiplet network and then connects to the off-chip(let) memory. Careful design is needed to make full use of such hierarchy. Reasonably utilizing the memory bandwidth to balance the workload of different computing hierarchy can significantly improve the chiplet system efficiency. Properly designing the communication network resource can ensure the chiplet collaboratively performing the shared-memory task."

The Big Chip design could also take advantage of technologies such as optical-electronic computing, near-memory computing, and 3D-stacked memory. However, the paper stops short of providing specific details on how these would be implemented, or of addressing the challenges they might pose in the design and construction of such complex systems. Meanwhile, The Next Platform assumes that CAS has already built its 256-core Zhejiang Big Chip multi-chiplet compute complex.
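The cost of the three-level hierarchy the researchers describe can be illustrated with a standard average-memory-access-time (AMAT) calculation. The hit rates and latencies below are illustrative assumptions; the paper does not publish such numbers for Big Chip:

```python
# Average memory access time across a three-level hierarchy
# (core caches -> on-chiplet memory -> off-chiplet memory), following
# the structure described in the paper. All hit rates and latencies
# here are illustrative assumptions, not published Big Chip figures.
def amat(levels):
    """levels: (hit_rate, latency_ns) pairs, nearest level first.
    hit_rate is the fraction of accesses *reaching* that level which it
    satisfies; the last level should use 1.0 to catch everything."""
    total_ns, reaching = 0.0, 1.0
    for hit_rate, latency_ns in levels:
        total_ns += reaching * hit_rate * latency_ns
        reaching *= 1.0 - hit_rate
    return total_ns

hierarchy = [
    (0.90, 2.0),    # core cache: assumed 90% hit rate, 2 ns
    (0.80, 25.0),   # on-chiplet memory over the intra-chiplet network
    (1.00, 120.0),  # off-chiplet memory reached over the interposer
]
print(f"Average access time: {amat(hierarchy):.1f} ns")  # 6.2 ns
```

The model shows why the researchers stress "reasonably utilizing the memory bandwidth": with these assumed numbers, raising the on-chiplet hit rate from 80% to 95% cuts the average from 6.2 ns to roughly 4.8 ns, so keeping working sets within a chiplet is a major lever.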
From here, the institute can explore the performance of its chiplet design and then make decisions regarding system-in-packages with higher core counts, different classes of memory, and wafer-scale integration.

Comments (23)

* bit_user: I think this doesn't have too much potential as a general-purpose architecture. The main problem is how to connect up enough RAM to support all of those cores running general-purpose workloads. Even if you could connect the RAM and have enough bandwidth, maintaining cache coherency over 1600 cores would seem to be quite taxing.
Right now, the best way to use such dense compute is in dataflow computing, like what Cerebras does.

* TCA_ChinChin (replying to bit_user): I think this is more of a proof of concept and research design than something pushed into a premature product. I imagine this would lead to domestic capability in the same realm as what Cerebras does currently, once it actually matures.

* toffty: Two main issues with this approach are: 1. Keeping lanes the same length from each core to memory. 2. Cooling such a behemoth. Let alone imperfections -- depending on the transistor size, they'll never get a fully working chip.

* bit_user (replying to toffty): Why would the lanes need to be the same length? As for cooling, they can run it at a low enough clock speed to make the heat manageable. Here are some specs on Cerebras' CS-1: https://www.eetimes.com/powering-and-cooling-a-wafer-scale-die/ And this shows exploded views plus info about the CS-2: https://www.cerebras.net/cs2virtualtour On imperfections: Cerebras reported full yields of their WSE-1. They built enough redundancy into each die that they didn't even have to disable any of them.

* ThomasKinsley: Let me get this straight. China is preparing to produce a wafer chip, the likes of which only Cerebras has made.
And this is happening amid a CIA investigation into a potential leak of Cerebras technology into China's hands from a United Arab Emirates company headed by an ethnic Chinese CEO who renounced his American citizenship for UAE citizenship?

* George3 (replying to ThomasKinsley): You apparently failed to understand that they already have a 256-core model that they hope to scale further. If they had copied, they would already be eating whole silicon wafers, with no intermediate stages.

* ThomasKinsley (replying to George3): The 256-core model is not at wafer scale as the new 1,600-core chip is. The article indicates Cerebras only figured out how to do it after overcoming significant manufacturing complexity. The timing is peculiar given that there is an international investigation into whether G42 gave China Cerebras IP. It's not proof, but it's indicative that there may have been a technology transfer.

* eryenakgun (replying to bit_user): Current GPUs have 10k-20k cores, all of them connected to a common RAM bus. It is not rocket science. Just connect CPUs together and link them to RAM.

* Notton: To me, this looks like an experiment to test how good domestic chip production is. Even if it doesn't work well, they will gain experience from it.