[HN Gopher] Will Supercapacitors Come to AI's Rescue?
___________________________________________________________________
Will Supercapacitors Come to AI's Rescue?
Author : mfiguiere
Score : 32 points
Date : 2025-05-06 19:30 UTC (3 hours ago)
(HTM) web link (spectrum.ieee.org)
(TXT) w3m dump (spectrum.ieee.org)
| Animats wrote:
| Is that kind of load variation from large data centers really a
| problem to the power grid? There are much worse intermittent
| loads, such as an electric furnace or a rolling mill.
| paulkrush wrote:
| Edit: It's interesting that the GPUs are causing issues on the
| grid before they cause issues with the data center's power.
| mystified5016 wrote:
| Read the article.
| toast0 wrote:
| I suspect it's more of a problem for the data center's energy
| bill. My understanding is that large electric customers pay a
| demand charge in addition to the volumetric charge for the kWh
| used, at whatever time-of-use or wholesale rates apply. The
| demand charge is based on the maximum kW used (or sometimes
| just the connection size) and may also carry a penalty rate if
| the power factor is poor. Smoothing over short-duration surges
| probably makes a lot of things nicer for the rate payer,
| including helping manage fluctuations from the utility.
|
| There's probably something that could be done on the individual
| systems so that they don't modulate power use quite so fast,
| too; at some latency cost, of course. If you go all the way to
| the extremes, you might add a zero crossing detector and use it
| to time clock speed increases.
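The slew-rate idea above can be sketched in a few lines: cap how fast a node's clock (and hence its power draw) may change per control tick, so a step in demand becomes a gradual ramp at some latency cost. All names and figures here are illustrative assumptions, not any real driver API.

```python
# Hedged sketch: limit clock-speed changes per control tick so a node's
# power draw ramps gradually instead of stepping instantly.

def ramp_limited(target_mhz, current_mhz, max_step_mhz):
    """Move toward target_mhz, but by at most max_step_mhz per tick."""
    delta = target_mhz - current_mhz
    if delta > max_step_mhz:
        delta = max_step_mhz
    elif delta < -max_step_mhz:
        delta = -max_step_mhz
    return current_mhz + delta

# Stepping from an assumed idle clock (300 MHz) to boost (1980 MHz) at
# 100 MHz per tick takes 17 ticks instead of one instantaneous jump.
clock, ticks = 300, 0
while clock != 1980:
    clock = ramp_limited(1980, clock, 100)
    ticks += 1
```

Spread across thousands of nodes with staggered tick phases, the aggregate ramp the grid sees gets smoother still.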
| timewizard wrote:
| If you have a working thermometer you can predict when
| furnaces are going to run.
|
| If you want to smooth out data centers then you need hourly
| pricing to force them to manage their demand into periods
| where excess grid capacity is not being used to serve
| residential loads.
| hinkley wrote:
| Large customers pay not by wattage but by... I'm spacing on
| the word but essentially how much their power draw fucks up
| the sine waves for voltage and current in the power grid.
|
| I imagine common power rail systems in hyperscaler equipment
| help a bit with this, but for sure switching PSUs chop up the
| input voltage and smooth it out. And that leads to very
| strange power draws.
| murderfs wrote:
| You're probably thinking of power factor, which is usually
| not a big deal for datacenters. All of your power supplies
| are going to have active PFC, and anything behind a double
| conversion UPS is going to get PFC from the UPS. The
| biggest contributors are probably going to be the fans in
| the air conditioning units.
| Animats wrote:
| This isn't about power factor. That's a current vs.
| voltage thing within one cycle. It's about demand side
| ramp rate - how fast load can go up and down.
|
| Ramp rate has been a generation side thing for a century.
| Every morning, load increases from the pre-dawn low, and
| which generators can ramp up output at what speed
| matters. Ramp rate is usually measured in
| megawatts/minute. Big thermal plants, atomic and coal,
| have the lowest ramp rates, a few percent per minute.
|
| Ramp rate demand side, though, is a new thing. There are
| discussions about it [1] but it's not currently something
| that's a parameter in electric energy bills.
|
| [1] https://www.aceee.org/files/proceedings/2012/data/papers/019...
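A back-of-envelope comparison makes the mismatch above concrete: a large thermal plant ramps a few percent of nameplate per minute, while a synchronized GPU cluster can step its load in well under a second. The plant size, GPU count, and per-GPU power below are illustrative assumptions, not figures from the article.

```python
# Generation-side ramp vs. demand-side step, rough numbers only.

plant_mw = 1000          # assumed 1 GW thermal plant
ramp_pct_per_min = 3     # "a few percent per minute"
plant_ramp_mw_per_min = plant_mw * ramp_pct_per_min / 100  # 30 MW/min

cluster_gpus = 10_000    # assumed training cluster size
gpu_kw = 1.0             # assumed ~1 kW per GPU incl. overhead
step_mw = cluster_gpus * gpu_kw / 1000  # 10 MW demand step

# The cluster can swing a third of the plant's per-minute ramp
# capability in milliseconds, every training step.
```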
| oakwhiz wrote:
| There is often a demand flux surcharge as well. Not just
| demand but delta in demand over some time period.
| changoplatanero wrote:
| Yes, it's a problem for the grid, and the power companies
| don't allow large clusters to oscillate their power like
| this. The workaround AI labs use during big training runs is
| to fill in the idle time on the GPUs with dummy operations to
| keep the power load constant. Capacitors would make it
| possible to save that wasted power instead.
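A minimal sketch of the dummy-work trick described above: when the real job stalls (say, waiting on an all-reduce), spin on throwaway arithmetic so the node's power draw stays roughly flat. This is a pure-Python stand-in; a real cluster would keep the idle GPUs busy with matrix multiplies instead.

```python
import time

def filler_flops(duration_s):
    """Burn cycles for duration_s seconds to hold load steady."""
    end = time.monotonic() + duration_s
    x = 1.0
    while time.monotonic() < end:
        x = (x * 1.0000001) % 1e6   # meaningless but constant work
    return x

def step_with_filler(do_compute, stall_s):
    result = do_compute()      # real work: high power draw
    filler_flops(stall_s)      # network wait: fill it rather than idle
    return result
```

This is exactly the energy waste a supercapacitor bank would avoid: the capacitor rides through the dip instead of the GPUs faking load.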
| nancyminusone wrote:
| Inb4 a startup is created to sell power load idle cycle
| compute time in AI training data centers.
| mystified5016 wrote:
| Those loads aren't nearly as intermittent. Your furnace likely
| runs for tens of minutes at a time. These datacenters are
| looking at second-to-second loads.
|
| Drawing high intermittent loads at high frequency likely makes
| the utility upset and leads to over-building supply to the
| customer to cope with peak load. If you can shave down those
| peaks, you can use a smaller(cheaper) supply connection. A
| smoother load will also make the utility happy.
|
| Remember that electricity generation cannot ramp up and down
| quickly. Big transient loads can cause a lot of problems
| through the whole network.
| paulkrush wrote:
| "Thousands of GPUs all linked together turning on and off at the
| same time." So supercapacitors allow for simpler software?,
| reduced latency? at a low cost?
| mjevans wrote:
| They provide 'spot demand moderation' as an extension of UPS
| and power smoothing. In this case it's flattening out spikes
| into smooth slopes.
| sonium wrote:
| Or you simply use the pytorch.powerplant_no_blow_up operator [1]
|
| [1] https://www.youtube.com/watch?v=vXsT6lBf0X4
| janalsncm wrote:
| Pretty much. From the article:
|
| > Another solution is dummy calculations, which run while there
| are no spikes, to smooth out demand.
| 0cf8612b2e1e wrote:
| > One solution is to rely on backup power supplies and
| batteries to charge and discharge, providing extra power
| quickly. However, much like a phone battery degrades after
| multiple recharge cycles, lithium-ion batteries degrade
| quickly when charging and discharging at this high rate.
|
| Is this really a problem for an industrial installation? I would
| imagine that a properly sized facility would have adequate
| cooling + capacity to only run the batteries within optimal spec.
| Solar plants are already charging/discharging their batteries
| daily.
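Rough arithmetic behind the degradation worry: a solar battery sees about one charge/discharge cycle per day, while a battery smoothing per-training-step spikes could see one shallow micro-cycle per second. Shallow cycles are far less damaging than full ones, but the sheer count gap is enormous. The figures below are illustrative assumptions.

```python
# Cycle-count comparison: daily solar cycling vs. per-second smoothing.

solar_cycles_per_year = 365                       # ~1 full cycle/day
spikes_per_second = 1                             # assumed: 1 per step
smoothing_cycles_per_year = spikes_per_second * 60 * 60 * 24 * 365
ratio = smoothing_cycles_per_year / solar_cycles_per_year

# ~31.5 million micro-cycles/year, ~86,400x the solar battery's count.
```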
| jeffbee wrote:
| In addition to what you said, nothing is forcing or even
| encouraging anyone to use lithium-ion batteries in fixed
| service, such as a rack full of computers.
| pixl97 wrote:
| Eh, I think part of the problem here is the speed of load
| switching. From the article it looks like the loads could
| generate dozens to hundreds of demand spikes per minute. With
| most battery operated loads that I've ever messed with we're
| not switching loads like that. It's typically 'oh a fault,
| switch to battery' then some time later you check the power
| circuit to see if it's up and switch back.
|
| This looks a whole lot more like high-frequency load
| smoothing. Really it seems to me like a continuation of what
| a motherboard already does: even if you have a battery backup
| on your PC, you still have capacitors on the board to handle
| voltage fluctuations.
| lstodd wrote:
| in a properly designed install you can actually use the
| compressors and fans for smoothing load spikes. won't be much,
| but why not.
|
| edit: otherwise I'm not getting what the entire article is
| about. it's as contrary to what I know about datacenter design
| as it can get.
|
| it's.. just wrong.
| touisteur wrote:
| I'm thinking of sequences of 'put the sharded dataset through
| the ten thousand 2 kW GPUs, then wait on the network
| (all-reduce), then spike again' - a mostly-synchronous
| all-on/all-off loop. Watching how quickly they get to boost
| frequency, I can see where the worries come from.
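The loop described above can be sketched as the standard synchronous data-parallel pattern: every worker computes gradients (all GPUs at full power), then blocks in an all-reduce (GPUs near idle), then steps again, so the aggregate draw square-waves between the two states. Structure only; a real implementation would use torch.distributed or NCCL collectives.

```python
# Skeleton of a synchronous data-parallel training loop; the power
# oscillation comes from the compute/communicate alternation.

def train_loop(shards, compute_grads, all_reduce, apply_update):
    for batch in shards:
        grads = compute_grads(batch)   # power spike: all GPUs busy
        grads = all_reduce(grads)      # power dip: waiting on network
        apply_update(grads)            # then the next spike begins
```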
| lstodd wrote:
| does anyone actually do those kind of loads over entire
| dcs?
|
| because if so, I have some nice east-european guys to teach
| them proper load-balancing.
| 0cf8612b2e1e wrote:
| Wouldn't that situation arise when a company is training
| their top end model? Facebook/Google/DeepSeek probably
| trained on thousands of collocated GPUs. The bigger the
| cluster, the bigger the sync delays between batches as
| the model data gets shunted back and forth.
| lawlessone wrote:
| No.
| eikenberry wrote:
| https://en.wikipedia.org/wiki/Betteridge%27s_law_of_headline...
| amelius wrote:
| Maybe a superconducting superinductor would be a better fit.
| lstodd wrote:
| that would be a blackhole bomb.
| janalsncm wrote:
| I am curious about what the load curves look like in these
| clusters. If the "networking gap" is long enough you might just
| be able to have a secondary workload that trains intermittently.
|
| Slightly related, you can actually hear this effect depending on
| your GPU. It's called coil whine. When your GPU is doing
| calculations, it draws more power and whines. Depending on your
| training setup, you can hear when it's working. In other words,
| you want it whining all the time.
| touisteur wrote:
| You might need more memory for this secondary training
| workload. But yeah, donating/selling the 'network' time for
| high-intensity, low memory footprint workloads (thinking number
| crunching, monte-carlo stuff, maybe brute-force through a
| series of problems...) might end up making sense.
| hulitu wrote:
| > Will Supercapacitors Come to AI's Rescue?
|
| Yes, just like the octopussies. /s
| blt wrote:
| What is causing demand bursts in AI workloads? I would have
| expected that AI training is almost the exact opposite. Load a
| minibatch, take a gradient step, repeat forever. But the article
| claims that "each step of the computation corresponds to a
| massive energy spike."
| wmf wrote:
| If the cores go idle (or just much less loaded) in between
| steps because they're waiting for network communication that
| would cause the problem.
| sdenton4 wrote:
| Bad input pipelines are a big cause of spikiness - you might
| have to wait a non-trivial fraction of a second for the next
| batch of inputs to arrive. If you can run 20+ training steps
| per second on a decent batch size, it can take some real
| engineering to get enough data lined up and ready to go fast
| enough. (I work on audio models, where data is apparently quite
| heavy compared to images or text...)
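One common fix for the input-pipeline stalls described above is to prefetch batches on a background thread so the accelerator rarely waits on data. This is a generic sketch; real pipelines would use framework loaders (DataLoader workers, tf.data prefetch, and the like).

```python
import queue
import threading

def prefetching_loader(make_batches, depth=4):
    """Yield batches while a worker keeps up to `depth` queued ahead."""
    q = queue.Queue(maxsize=depth)
    done = object()                     # sentinel marking end of data

    def producer():
        for b in make_batches():
            q.put(b)                    # blocks if consumer falls behind
        q.put(done)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        b = q.get()
        if b is done:
            return
        yield b
```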
| tzs wrote:
| > Another solution is dummy calculations, which run while there
| are no spikes, to smooth out demand. This makes the grid see a
| consistent load, but it also wastes energy doing unnecessary
| work.
|
| Oh god... I can see it now. Someone will try to capitalize on
| the hype of LLMs and the hype of cryptocurrency by building a
| combined LLM training and cryptocurrency mining facility that
| runs the mining between training spikes.
| ludicity wrote:
| Oh man, I really, really wish that you hadn't said this and
| also that you were wrong.
| ijustlovemath wrote:
| YCW27
| candiddevmike wrote:
| From the same founders who brought you (or didn't, actually)
| maritime fusion
| FridgeSeal wrote:
| It's ok though because YC invests in the _team_, better
| just give them another chance!!
| Merrill wrote:
| Wouldn't it be better to arrange the network and software to run
| the GPUs continuously at optimal usage?
|
| Otherwise a lot of expensive GPU capital is idle between bursts
| of computation.
|
| Didn't DeepSeek do something like this to get more system level
| performance out of less capable GPUs?
___________________________________________________________________
(page generated 2025-05-06 23:00 UTC)