[HN Gopher] Will Supercapacitors Come to AI's Rescue?
___________________________________________________________________
Will Supercapacitors Come to AI's Rescue?
Author : mfiguiere
Score : 32 points
Date : 2025-05-06 19:30 UTC (3 hours ago)
(HTM) web link (spectrum.ieee.org)
(TXT) w3m dump (spectrum.ieee.org)
| Animats wrote:
| Is that kind of load variation from large data centers really a
| problem to the power grid? There are much worse intermittent
| loads, such as an electric furnace or a rolling mill.
| paulkrush wrote:
| Edit: It's interesting that the GPUs are causing issues on the
| grid before they cause issues with the data center's power.
| mystified5016 wrote:
| Read the article.
| toast0 wrote:
| I suspect it's more of a problem for the data center's energy
| bill. My understanding is that large electric customers pay a
| demand charge in addition to the volumetric charge for the kWh
| used, at whatever time-of-use or wholesale rates apply. The
| demand charge is based on the maximum kW used (or sometimes
| just the connection size) and may also carry a penalty rate if
| the power factor is poor. Smoothing over short-duration surges
| probably makes a lot of things nicer for the rate payer,
| including helping manage fluctuations from the utility.
|
| There's probably something that could be done on the individual
| systems so that they don't modulate power use quite so fast,
| too; at some latency cost, of course. If you go all the way to
| the extremes, you might add a zero crossing detector and use it
| to time clock speed increases.
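The slew-rate idea above can be sketched in a few lines: cap how fast a node's clock (and hence its power draw) may change per control tick, so a step in demand becomes a gradual ramp at some latency cost. All names and figures here are illustrative assumptions, not any real driver API.

```python
# Hedged sketch: limit clock-speed changes per control tick so a node's
# power draw ramps gradually instead of stepping instantly.

def ramp_limited(target_mhz, current_mhz, max_step_mhz):
    """Move toward target_mhz, but by at most max_step_mhz per tick."""
    delta = target_mhz - current_mhz
    if delta > max_step_mhz:
        delta = max_step_mhz
    elif delta < -max_step_mhz:
        delta = -max_step_mhz
    return current_mhz + delta

# Stepping from an assumed idle clock (300 MHz) to boost (1980 MHz) at
# 100 MHz per tick takes 17 ticks instead of one instantaneous jump.
clock, ticks = 300, 0
while clock != 1980:
    clock = ramp_limited(1980, clock, 100)
    ticks += 1
```

Spread across thousands of nodes with staggered tick phases, the aggregate ramp the grid sees gets smoother still.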
| timewizard wrote:
| If you have a working thermometer you can predict when
| furnaces are going to run.
|
| If you want to smooth out data centers then you need hourly
| pricing to force them to manage their demand into periods
| where excess grid capacity is not being used to serve
| residential loads.
| hinkley wrote:
| Large customers pay not by wattage but by... I'm spacing on
| the word but essentially how much their power draw fucks up
| the sine waves for voltage and current in the power grid.
|
| I imagine common power rail systems in hyperscaler equipment
| help a bit with this, but for sure switching PSUs chop up the
| input voltage and smooth it out. And that leads to very
| strange power draws.
| murderfs wrote:
| You're probably thinking of power factor, which is usually
| not a big deal for datacenters. All of your power supplies
| are going to have active PFC, and anything behind a double
| conversion UPS is going to get PFC from the UPS. The
| biggest contributors are probably going to be the fans in
| the air conditioning units.
| Animats wrote:
| This isn't about power factor. That's a current vs.
| voltage thing within one cycle. It's about demand side
| ramp rate - how fast load can go up and down.
|
| Ramp rate has been a generation side thing for a century.
| Every morning, load increases from the pre-dawn low, and
| which generators can ramp up output at what speed
| matters. Ramp rate is usually measured in
| megawatts/minute. Big thermal plants, atomic and coal,
| have the lowest ramp rates, a few percent per minute.
|
| Ramp rate demand side, though, is a new thing. There are
| discussions about it [1] but it's not currently something
| that's a parameter in electric energy bills.
|
| [1] https://www.aceee.org/files/proceedings/2012/data/papers/019...
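A back-of-envelope comparison makes the mismatch above concrete: a large thermal plant ramps a few percent of nameplate per minute, while a synchronized GPU cluster can step its load in well under a second. The plant size, GPU count, and per-GPU power below are illustrative assumptions, not figures from the article.

```python
# Generation-side ramp vs. demand-side step, rough numbers only.

plant_mw = 1000          # assumed 1 GW thermal plant
ramp_pct_per_min = 3     # "a few percent per minute"
plant_ramp_mw_per_min = plant_mw * ramp_pct_per_min / 100  # 30 MW/min

cluster_gpus = 10_000    # assumed training cluster size
gpu_kw = 1.0             # assumed ~1 kW per GPU incl. overhead
step_mw = cluster_gpus * gpu_kw / 1000  # 10 MW demand step

# The cluster can swing a third of the plant's per-minute ramp
# capability in milliseconds, every training step.
```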
| oakwhiz wrote:
| There is often a demand flux surcharge as well. Not just
| demand but delta in demand over some time period.
| changoplatanero wrote:
| Yes, it's a problem for the grid, and the power companies
| don't allow large clusters to oscillate their power like
| this. The workaround AI labs use during big training runs is
| to fill in the idle time on the GPUs with dummy operations to
| keep the power load constant. Capacitors would make it
| possible to save that wasted power instead.
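A minimal sketch of the dummy-work trick described above: when the real job stalls (say, waiting on an all-reduce), spin on throwaway arithmetic so the node's power draw stays roughly flat. This is a pure-Python stand-in; a real cluster would keep the idle GPUs busy with matrix multiplies instead.

```python
import time

def filler_flops(duration_s):
    """Burn cycles for duration_s seconds to hold load steady."""
    end = time.monotonic() + duration_s
    x = 1.0
    while time.monotonic() < end:
        x = (x * 1.0000001) % 1e6   # meaningless but constant work
    return x

def step_with_filler(do_compute, stall_s):
    result = do_compute()      # real work: high power draw
    filler_flops(stall_s)      # network wait: fill it rather than idle
    return result
```

This is exactly the energy waste a supercapacitor bank would avoid: the capacitor rides through the dip instead of the GPUs faking load.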
| nancyminusone wrote:
| Inb4 a startup is created to sell power load idle cycle
| compute time in AI training data centers.
| mystified5016 wrote:
| Those loads aren't nearly as intermittent. Your furnace likely
| runs for tens of minutes at a time. These datacenters are
| looking at second-to-second loads.
|
| Drawing high intermittent loads at high frequency likely makes
| the utility upset and leads to over-building supply to the
| customer to cope with peak load. If you can shave down those
| peaks, you can use a smaller(cheaper) supply connection. A
| smoother load will also make the utility happy.
|
| Remember that electricity generation cannot ramp up and down
| quickly. Big transient loads can cause a lot of problems
| through the whole network.
| paulkrush wrote:
| "Thousands of GPUs all linked together turning on and off at the
| same time." So supercapacitors allow for simpler software?,
| reduced latency? at a low cost?
| mjevans wrote:
| They provide 'spot demand moderation' as an extension of UPS
| and power smoothing. In this case it's flattening out spikes
| into smooth slopes.
| sonium wrote:
| Or you simply use the pytorch.powerplant_no_blow_up operator [1]
|
| [1] https://www.youtube.com/watch?v=vXsT6lBf0X4
| janalsncm wrote:
| Pretty much. From the article:
|
| > Another solution is dummy calculations, which run while there
| are no spikes, to smooth out demand.
| 0cf8612b2e1e wrote:
| > One solution is to rely on backup power supplies and
| batteries to charge and discharge, providing extra power
| quickly. However, much like a phone battery degrades after
| multiple recharge cycles, lithium-ion batteries degrade
| quickly when charging and discharging at this high rate.
|
| Is this really a problem for an industrial installation? I would
| imagine that a properly sized facility would have adequate
| cooling + capacity to only run the batteries within optimal spec.
| Solar plants are already charging/discharging their batteries
| daily.
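Rough arithmetic behind the degradation worry: a solar battery sees about one charge/discharge cycle per day, while a battery smoothing per-training-step spikes could see one shallow micro-cycle per second. Shallow cycles are far less damaging than full ones, but the sheer count gap is enormous. The figures below are illustrative assumptions.

```python
# Cycle-count comparison: daily solar cycling vs. per-second smoothing.

solar_cycles_per_year = 365                       # ~1 full cycle/day
spikes_per_second = 1                             # assumed: 1 per step
smoothing_cycles_per_year = spikes_per_second * 60 * 60 * 24 * 365
ratio = smoothing_cycles_per_year / solar_cycles_per_year

# ~31.5 million micro-cycles/year, ~86,400x the solar battery's count.
```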
| jeffbee wrote:
| In addition to what you said, nothing is forcing or even
| encouraging anyone to use lithium-ion batteries in fixed
| service, such as a rack full of computers.
| pixl97 wrote:
| Eh, I think part of the problem here is the speed of load
| switching. From the article it looks like the loads could
| generate dozens to hundreds of demand spikes per minute. With
| most battery operated loads that I've ever messed with we're
| not switching loads like that. It's typically 'oh a fault,
| switch to battery' then some time later you check the power
| circuit to see if it's up and switch back.
|
| This looks a whole lot more like high-frequency load
| smoothing. Really it seems to me like a continuation of what
| a motherboard already does: even if you have a battery backup
| on your PC, you still have capacitors on the board to handle
| voltage fluctuations.
| lstodd wrote:
| in a properly designed install you can actually use the
| compressors and fans for smoothing load spikes. won't be much,
| but why not.
|
| edit: otherwise I'm not getting what the entire article is
| about. it's as contrary to what I know about datacenter design
| as it can get.
|
| it's.. just wrong.
| touisteur wrote:
| I'm thinking of sequences of 'put the sharded dataset through
| the ten thousand 2 kW GPUs, then wait on the network
| (all-reduce), then spike again' - a mostly-synchronous
| all-on/all-off loop. Watching how quickly they get to boost
| frequency, I can see where the worries come from.
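The loop described above can be sketched as the standard synchronous data-parallel pattern: every worker computes gradients (all GPUs at full power), then blocks in an all-reduce (GPUs near idle), then steps again, so the aggregate draw square-waves between the two states. Structure only; a real implementation would use torch.distributed or NCCL collectives.

```python
# Skeleton of a synchronous data-parallel training loop; the power
# oscillation comes from the compute/communicate alternation.

def train_loop(shards, compute_grads, all_reduce, apply_update):
    for batch in shards:
        grads = compute_grads(batch)   # power spike: all GPUs busy
        grads = all_reduce(grads)      # power dip: waiting on network
        apply_update(grads)            # then the next spike begins
```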
| lstodd wrote:
| does anyone actually do those kind of loads over entire
| dcs?
|
| because if so, I have some nice east-european guys to teach
| them proper load-balancing.
| 0cf8612b2e1e wrote:
| Wouldn't that situation arise when a company is training
| their top end model? Facebook/Google/DeepSeek probably
| trained on thousands of collocated GPUs. The bigger the
| cluster, the bigger the sync delays between batches as
| the model data gets shunted back and forth.
| lawlessone wrote:
| No.
| eikenberry wrote:
| https://en.wikipedia.org/wiki/Betteridge%27s_law_of_headline...
| amelius wrote:
| Maybe a superconducting superinductor would be a better fit.
| lstodd wrote:
| that would be a blackhole bomb.
| janalsncm wrote:
| I am curious about what the load curves look like in these
| clusters. If the "networking gap" is long enough you might just
| be able to have a secondary workload that trains intermittently.
|
| Slightly related, you can actually hear this effect depending on
| your GPU. It's called coil whine. When your GPU is doing
| calculations, it draws more power and whines. Depending on your
| training setup, you can hear when it's working. In other words,
| you want it whining all the time.
| touisteur wrote:
| You might need more memory for this secondary training
| workload. But yeah, donating/selling the 'network' time for
| high-intensity, low memory footprint workloads (thinking number
| crunching, monte-carlo stuff, maybe brute-force through a
| series of problems...) might end up making sense.
| hulitu wrote:
| > Will Supercapacitors Come to AI's Rescue?
|
| Yes, just like the octopussies. /s
| blt wrote:
| What is causing demand bursts in AI workloads? I would have
| expected that AI training is almost the exact opposite. Load a
| minibatch, take a gradient step, repeat forever. But the article
| claims that "each step of the computation corresponds to a
| massive energy spike."
| wmf wrote:
| If the cores go idle (or just much less loaded) in between
| steps because they're waiting for network communication that
| would cause the problem.
| sdenton4 wrote:
| Bad input pipelines are a big cause of spikiness - you might
| have to wait a non-trivial fraction of a second for the next
| batch of inputs to arrive. If you can run 20+ training steps
| per second on a decent batch size, it can take some real
| engineering to get enough data lined up and ready to go fast
| enough. (I work on audio models, where data is apparently quite
| heavy compared to images or text...)
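One common fix for the input-pipeline stalls described above is to prefetch batches on a background thread so the accelerator rarely waits on data. This is a generic sketch; real pipelines would use framework loaders (DataLoader workers, tf.data prefetch, and the like).

```python
import queue
import threading

def prefetching_loader(make_batches, depth=4):
    """Yield batches while a worker keeps up to `depth` queued ahead."""
    q = queue.Queue(maxsize=depth)
    done = object()                     # sentinel marking end of data

    def producer():
        for b in make_batches():
            q.put(b)                    # blocks if consumer falls behind
        q.put(done)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        b = q.get()
        if b is done:
            return
        yield b
```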
| tzs wrote:
| > Another solution is dummy calculations, which run while there
| are no spikes, to smooth out demand. This makes the grid see a
| consistent load, but it also wastes energy doing unnecessary
| work.
|
| Oh god... I can see it now. Someone will try to capitalize on
| the hype of LLMs and the hype of cryptocurrency by building a
| combined LLM training and cryptocurrency mining facility that
| runs the mining between training spikes.
| ludicity wrote:
| Oh man, I really, really wish that you hadn't said this and
| also that you were wrong.
| ijustlovemath wrote:
| YCW27
| candiddevmike wrote:
| From the same founders who brought you (or didn't, actually)
| maritime fusion
| FridgeSeal wrote:
| It's ok though because YC invests in the _team_, better
| just give them another chance!!
| Merrill wrote:
| Wouldn't it be better to arrange the network and software to run
| the GPUs continuously at optimal usage?
|
| Otherwise a lot of expensive GPU capital is idle between bursts
| of computation.
|
| Didn't DeepSeek do something like this to get more system level
| performance out of less capable GPUs?
___________________________________________________________________
(page generated 2025-05-06 23:00 UTC)