[HN Gopher] Better PC cooling with Python and Grafana
       ___________________________________________________________________
        
       Better PC cooling with Python and Grafana
        
       Author : naggie
       Score  : 122 points
       Date   : 2024-03-03 17:04 UTC (5 hours ago)
        
 (HTM) web link (calbryant.uk)
 (TXT) w3m dump (calbryant.uk)
        
       | naggie wrote:
       | I got annoyed at my fan speeds so decided to experiment with
       | controlling my fans with Python and measuring the results.
        
         | jtriangle wrote:
         | You likely have a setting for fan ramp time in your bios,
         | usually in seconds. So setting your pump to always run at 100%
         | and your fans to ramp slowly, say 10 seconds or longer, and
         | using a minimum fan speed that is as high as tolerable would
         | likely work as a no-additional-code solution.
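          | 
          | For comparison, a minimal Python sketch of what such a ramp
          | limit does. The names `ramp_toward` and `apply_floor`, the
          | 10-second ramp and the 40% floor are illustrative assumptions,
          | not any BIOS's actual behaviour; "ramp time" is taken here to
          | mean the time for a full 0-100% swing.

```python
def ramp_toward(current_pct: float, target_pct: float,
                dt_s: float, ramp_time_s: float = 10.0) -> float:
    """Move the duty cycle toward the target, rate-limited so that a
    full 0-100% swing takes ramp_time_s seconds (assumed meaning)."""
    if ramp_time_s <= 0:
        return target_pct
    max_step = 100.0 * dt_s / ramp_time_s  # largest change allowed this tick
    delta = target_pct - current_pct
    if abs(delta) <= max_step:
        return target_pct
    return current_pct + max_step if delta > 0 else current_pct - max_step


def apply_floor(duty_pct: float, min_pct: float = 40.0) -> float:
    """Clamp the duty cycle between a tolerable minimum and 100%."""
    return max(min_pct, min(100.0, duty_pct))
```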
        
           | jeffbee wrote:
           | An AIO pump running flat out will draw 25W all by itself. An
           | idle CPU can be below 100mW (Intel ... AMD idle powers are
           | slightly higher).
           | 
           | Setting your AIO pump to static full power will greatly
           | increase the idle power consumption of the machine.
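            | 
            | A back-of-the-envelope sketch of that arithmetic in Python,
            | taking the 25W and 100mW figures above at face value and
            | assuming, purely for illustration, a machine that sits idle
            | around the clock for a year:

```python
HOURS_PER_YEAR = 24 * 365  # assumption: always on, always idle


def annual_kwh(watts: float, hours: float = HOURS_PER_YEAR) -> float:
    """Energy in kWh for a constant draw over the given hours."""
    return watts * hours / 1000.0


pump_w = 25.0        # pump figure quoted in the comment above
cpu_idle_mw = 100.0  # idle CPU figure quoted in the comment above

print(annual_kwh(pump_w))            # 219.0 kWh per year for the pump alone
print(pump_w * 1000.0 / cpu_idle_mw)  # 250.0, the pump-to-idle-CPU power ratio
```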
        
         | fwip wrote:
         | Looks pretty cool, the self-calibration routine is very nice
         | too.
         | 
         | My only worry is that rapid changes in pump speed might cause
         | extra mechanical stress or wear on the pump, but I have no data
         | to back that up. I've just heard that water pumps sometimes
          | behave in counter-intuitive ways - e.g. sometimes running at a
         | higher speed is better for longevity than a lower one.
        
       | hnuser123456 wrote:
       | Please do some measuring of core temperature response to load
       | before and after re-pasting / upgrading the thermal compound.
       | Something between a large grain of rice and a pea, and I like to
       | clean the CPU and cooler cold plate completely, paste the CPU,
       | then press and smear around the cooler onto the CPU before
       | mounting it, to ensure full surface area coverage with a thin
       | layer of compound.
        
         | majesticmerc wrote:
         | Genuine question: I've known the law of "pea sized amount" for
         | thermal paste for 20 years or so. Does it still hold true for
         | modern (and larger) CPU dies? I haven't upgraded in a long time
         | so genuinely don't know, but also wouldn't want to use outdated
         | knowledge!
        
           | jtriangle wrote:
           | You want to use as little paste as possible while fully
           | covering the heat spreader.
           | 
           | Most modern coolers will provide sufficient pressure to
           | spread a pea sized amount of thermal paste to cover the whole
           | cooler.
           | 
           | There are also thermal pads, both reusable and single use,
            | that perform well and don't require any guesswork.
           | 
            | If you want to paste, Noctua has the best paste in terms of
            | thermal resistance, but MX-4 or MX-5 both perform well, as do
            | Cryorig and a bunch of others.
        
           | dist-epoch wrote:
           | Video with spread patterns under transparent plate:
           | 
           | https://www.youtube.com/watch?v=wn2ln04dquM
        
           | sorenjan wrote:
           | I've seen reports and debates that modern heat spreaders are
           | too large for the pea method, and that an X pattern is more
           | common now.
           | 
           | Here's one of the test videos you can find on Youtube about
           | the subject: https://youtu.be/aaxBYrZFJZM?t=199
        
             | chfalck wrote:
             | Thanks for posting an actual test rather than perpetuating
             | tradition without verification
        
               | foofie wrote:
               | Science!
        
             | parineum wrote:
             | Giving a single moment of thought to what happens when the
             | paste is squished to this should lead everyone to the
             | conclusion that a square that is equidistant from the edge
              | to the center is the right pattern with maybe a dot in the
              | middle for peace of mind.
             | 
             | That wasn't tested there.
        
               | feitingen wrote:
               | A square is closed and will probably result in a bubble
               | of air being trapped.
        
               | parineum wrote:
               | You don't have to close it nor do you have to put it on
               | straight down.
               | 
               | It's pretty easy to get this right by using your head
               | rather than following some ethos.
        
             | ComputerGuru wrote:
             | I clicked eagerly and appreciated the approach but not the
              | analysis of the results. The only judging criterion was
             | "coverage" (enough but not too much) but no thought was
             | given to thickness, and some application patterns lend
             | themselves to thicker results (just compare the five dots
             | pic to the full coverage or three lines pic). You can see
             | from the results that the "winners" have a fairly thick
             | layer of substrate. You want the absolute thinnest layer of
             | thermal paste that will achieve full coverage.
        
           | justsomehnguy wrote:
            | You need to understand what thermal paste is used for, and
            | then you will understand why a pea size is okay most of the
            | time.
           | 
            | It's just a compound that _helps_ with heat transfer. Both
           | CPU IHS and the radiator are not ideal surfaces, so if you
           | install one on one without TP then there would be bubbles of
           | air trapped in imperfections of the surfaces. They are not a
           | problem per se but they do make the heat transfer worse, so
           | IHS is hotter than it could be and a bit less effective to
           | transfer the heat to the radiator.
           | 
           | You have two options to improve _the heat transfer_ :
           | 
            | a) polish both the IHS and the radiator to a perfect surface,
            | i.e. they should be mirror-like; and then you should use a
            | torque screwdriver to ensure that the cooler is tightened and
            | leveled exactly right;
           | 
           | b) or you can just slap some thermal paste between them and
           | call it done.
           | 
           | And in the second case you need 'just enough' of TP to smooth
            | out the imperfections, but if you slap on too much then again
           | you are making _the heat transfer_ between the IHS and the
           | radiator worse.
           | 
            | So if you ever find yourself wondering how much TP you should
            | use, just start with a pea size, place the radiator (you can
            | even screw it down if that's your jam), then remove it and
            | look at how the TP spread around the IHS. If it covers 90%
            | with a thin film (and your radiator isn't the lowest shit
            | tier, polished with an old rusty rasp) then you are fine.
           | 
            | The only difference between 20 years ago and now is that IHSes
            | are much larger, so now you may find a need to use... a larger
            | pea as a reference.
        
         | naggie wrote:
         | As it happens, I did this morning!
         | 
          | I switched from years-old Arctic Silver 5 to Noctua NT-H1. It
         | resulted in a dramatic difference. 64c loaded vs 84c -- I now
         | suspect I had an air bubble which may invalidate the initial
         | motivation for the work in the first place :-)
        
           | KennyBlanken wrote:
           | The first thing I noticed was that your idle temperatures are
           | _exceedingly_ high, especially for an AIO. My air-cooled
           | ryzen 5 idles at around 35c with nearly zero fan speed on a
           | $30 air cooler.
           | 
           | Most AIOs need servicing after a few years- find instructions
           | on how to disassemble yours, clean the water block, flush the
           | radiator, and refill with a deionized-water/glycol mix.
        
       | mckirk wrote:
       | For people that want to do something similar on Windows, I can
       | wholeheartedly recommend FanControl [1]. It's sadly not open-
       | source, but it works great, and is quite pleasant to interact
       | with.
       | 
       | [1]: https://getfancontrol.com/
        
         | haunter wrote:
         | It's open source
        
           | giobox wrote:
           | No it isn't. I use and like the app also, but the GitHub repo
           | link on their page takes you to a separate release artifact
           | management repo, _not_ the source code itself.
           | 
           | > https://getfancontrol.com/
           | 
           | Their linked GitHub repo, which only has a compiled zip
           | uploaded for release artifact creation, _not the code_ :
           | 
           | > https://github.com/Rem0o/FanControl.Releases
           | 
           | From the repo:
           | 
           | "Sources for this software are closed."
           | 
           | I wonder why the author hasn't open sourced it - there
           | doesn't appear to be any commercial aspect to the tool, and
           | the author makes a point of it being "free" on the site.
        
             | soultrees wrote:
              | AI ownership most likely. Preventing LLMs from using its
             | code in an output and not putting it in the 'safe to spew'
             | bucket.
        
               | kingkongjaffa wrote:
                | This tool has been like this for years, even before LLMs
                | were popular.
               | 
               | Not everything is about LLMs.
        
         | pryelluw wrote:
          | Curious as to why you think it's sad it is not open source.
        
           | LastTrain wrote:
           | Why some people prefer open source software is well
           | established by now - so if you have a bone to pick with this
           | position you don't need to wait for an answer, go nuts.
        
             | pryelluw wrote:
             | No bone to pick. Genuinely curious as to why this
             | individual feels sad about it not being open source.
        
           | foofie wrote:
            | > Curious as to why you think it's sad it is not open source.
           | 
           | You're commenting on a discussion on how someone leveraged
           | this sort of open source software to improve cooling.
        
             | pryelluw wrote:
             | Well thank you for pointing that out. I missed that
             | context.
        
           | mckirk wrote:
           | Apart from what foofie already pointed out, I think it is
           | 'sad' in the sense that a project like this is
           | 
           | - one you might expect to be open source, since there is no
           | monetary interest behind it besides an option to donate, and
           | 
           | - one that would potentially lend itself quite well to being
           | open source, since it would offer a great base for other
           | people to tinker around with their own cooling setups.
        
       | jerrygenser wrote:
       | Another thing I've had great success with on my AMD 7700X is to
       | use AMD Ryzen Master to reduce the TDP.
       | 
       | The chip comes to consumers overclocked by default with a TDP of
       | 105W. I suspect this is the case so that it can beat Intel on
       | benchmarks on "default" settings.
       | 
       | You can set it to "eco mode" and have it run at 65W or 45W TDP.
       | Under load, this only results in like a 5% reduction in
       | performance for a dramatic reduction in electricity consumption,
       | fan speed, heat, etc.
       | 
       | Not sure if the 5500x series chips are overclocked but using eco
       | mode could be a good approach.
        
         | jtriangle wrote:
         | You can also undervolt most modern AMD parts and actually gain
         | performance
        
           | yjftsjthsd-h wrote:
           | How does that work? Is it just that it can run at high speed
           | sustained if it doesn't have to dissipate as much heat?
        
             | parineum wrote:
             | The chips overclock themselves under load but it's
             | unsustainable. Limiting their TDP can allow them to sustain
             | their overclocks longer/indefinitely.
        
             | jtriangle wrote:
             | Yes, exactly. Modern chips thermal throttle by default,
             | their turbo clocks/voltages aren't sustainable with most
             | cooling solutions, so, reducing the voltage the chip runs
             | at lowers the power consumed and allows the chip to
             | maintain a higher clock for longer durations.
             | 
             | There is, of course, a risk that the lower voltage won't be
             | stable, but, as long as you stress test your system as you
             | would while overclocking, you'll be able to find a lower
             | voltage that is workable.
             | 
             | The 7000 series ryzen chips have this built into most
             | motherboard bioses by the name 'pbo offset' (or similar)
             | which modifies the stock voltage curve lower, and can set
             | new temperature targets if desired. I'm running my 7950x at
             | PBO level 3 with an 85c target, works well with no more
             | fuss than a couple reboots really.
        
               | asmor wrote:
               | > their turbo clocks/voltages aren't sustainable with
               | most cooling solutions
               | 
               | And also, starting with 7nm it becomes a lot harder to
               | transfer that heat out of the chip, even if you have
               | thermal mass. The IHS itself becomes a bottleneck.
               | 
               | On X3D chips it gets even worse, as the cache acts as a
               | heat shield too - these chips actually are pre-binned
                | running at relatively low voltages, which is why they
                | usually don't see gains as large as other chips do in Curve
                | Optimizer.
        
         | stanac wrote:
          | The AMD 7700X is probably a factory-overclocked 7700; it would
          | be cheaper to just buy the 7700. Probably the only difference is
          | that some of the 7700 chips cannot guarantee the same clocks and
          | stability when overclocked to 7700X specs.
        
           | toast0 wrote:
           | The current price difference on Amazon is $5. But the 7700X
           | doesn't include a cooler. If you're not planning to use the
           | AMD cooler, may as well pay an extra $5 and get a processor
           | that binned better (probably), not have the extra cooler, and
           | you can fiddle with voltage and TDP/power limits from there.
           | 
            | Depending on your cooling, having a 100 MHz higher max clock
           | could be worth $5.
        
             | Scoundreller wrote:
             | I know someone that bought an AMD CPU but it was a factory-
             | sealed empty box. Turned out sending out a cooler-free
             | product in the same packaging allowed for the CPU to be
             | taken out of the box's window without breaking the seal.
             | Hopefully they've updated the packaging.
        
           | MenhirMike wrote:
           | I don't know about the 7700, but I have a 7900 and it makes
           | the 7900X look utterly ridiculous in comparison, especially
           | because it comes with a decent cooler (incl. RGB if people
           | care for that). So yeah, I'd not waste extra money on the X
           | variant. (Though for games, the 7800X3D is probably the best
           | of the bunch)
        
         | this_user wrote:
         | I had a similar problem to what the article describes with a
          | Ryzen 9. The reason was that the CPU would constantly
         | automatically spike the clock frequency to overclock the CPU.
         | The solution was just to disable this specific feature in the
         | BIOS.
        
         | LegitShady wrote:
         | Just an FYI AMD Ryzen Master installer contains a dark pattern.
         | 
         | When you first launch it you have to scroll down the
         | disclaimer/license to check off the "I agree to terms and
         | conditions" box (which is obviously unchecked by default).
         | 
         | When you do, it creates the "install button" you can click but
         | the checked box now sits beside text about sending AMD
          | information and if you aren't looking you may assume it's still
         | the same text as when you checked the box.
         | 
         | The end effect is to get the user to agree to send data without
         | them noticing.
        
         | pja wrote:
         | Yes, you can do the same with the 5000 series. The 5600X can be
         | restricted to 45W max tdp & the 5800X to 65W.
        
       | asmor wrote:
       | Didn't pick the right fans for going extra slow. These run at
       | 1700 RPM by default, whereas Noctua has a version - even in the
        | redux line - that runs at 1200 RPM. Though the non-redux line
       | gets even slower - so much that Noctua includes a "low noise
       | adapter" (presumably a resistor).
        
       | anticodon wrote:
        | On AMD it helps to have a fairly recent kernel and the
        | amd_pstate=active string in the kernel boot params. I haven't
       | checked the temperatures but I think I've started to hear the fan
       | noise less after enabling it. This option was finally implemented
       | in kernel 6.1 or 6.2. I don't remember the exact version, it
       | happened only 1-1.5 years ago.
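        | 
        | A small Python sketch for checking which frequency driver is
        | actually in use after setting that parameter. The two sysfs
        | paths are what recent kernels expose as far as I know; treat
        | them as assumptions and expect "unknown" on kernels or CPUs
        | that lack them. `pstate_summary` is a made-up helper name.

```python
from pathlib import Path
from typing import Optional

# Assumed locations; the second one only exists on newer kernels.
DRIVER_PATH = "/sys/devices/system/cpu/cpu0/cpufreq/scaling_driver"
STATUS_PATH = "/sys/devices/system/cpu/amd_pstate/status"


def read_sysfs(path: str) -> Optional[str]:
    """Return the stripped file contents, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def pstate_summary(driver_path: str = DRIVER_PATH,
                   status_path: str = STATUS_PATH) -> str:
    """One-line summary of the cpufreq driver and amd_pstate mode."""
    driver = read_sysfs(driver_path) or "unknown"
    status = read_sysfs(status_path) or "unknown"
    return f"scaling_driver={driver} amd_pstate_status={status}"


if __name__ == "__main__":
    print(pstate_summary())
```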
        
       | js2 wrote:
       | I geeked out on something like this for my TrueNAS server setting
       | fan speed based on drive temperature with a PID controller. e.g.
       | 
       | https://github.com/dak180/TrueNAS-Scripts/blob/master/FanCon...
       | 
       | (That's not mine. I think I wrote a variation in Python.)
       | 
       | Then I realized: my server is in the unfinished part of my
       | basement where I can't hear it anyway. Let's just run the fans at
       | 80% speed all the time since that's sufficient to keep the drives
       | cool.
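        | 
        | For anyone curious what the PID version looks like, here is a
        | minimal Python sketch. It is not the linked script or my own
        | variation; the class name, gains, setpoint and 20-100% output
        | range are all placeholder assumptions you would have to tune.

```python
class PID:
    """Textbook PID loop: output rises as the measurement exceeds the setpoint."""

    def __init__(self, kp, ki, kd, setpoint, out_min=20.0, out_max=100.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self.out_min, self.out_max = out_min, out_max
        self.integral = 0.0
        self.prev_error = None

    def update(self, measured, dt):
        """Return a fan duty in percent; dt is seconds since the last call (> 0)."""
        error = measured - self.setpoint  # positive when the drives are too hot
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        self.integral += error * dt
        raw = self.kp * error + self.ki * self.integral + self.kd * derivative
        # Anti-windup: drop this step's integration if it pushes further into saturation.
        if (raw > self.out_max and error > 0) or (raw < self.out_min and error < 0):
            self.integral -= error * dt
        return min(max(raw, self.out_min), self.out_max)
```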
        
       | ltbarcly3 wrote:
       | Just buy a large AIO water cooler. Replace the included fans with
       | Noctua. That is the full story of how I made my computer silent
       | despite having a CPU with a TDP of over 200W.
        
         | asmor wrote:
         | Didn't read the article, did you?
        
       | AceJohnny2 wrote:
       | In conclusion, this is why Systems (Thermal) Control is a
       | profession.
       | 
       | Not dissing on the author's efforts, quite the contrary! But they
       | demonstrate the rabbit hole that is second order effects (like
       | multi-fan beat frequency) and number of parameters to take into
       | account (like "... _A solution to [when to enable Passive Mode]
       | may be to detect if the computer is in use (mouse movements)_ ")
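        | 
        | To give a feel for the beat-frequency effect, a rough Python
        | sketch: two tones at f1 and f2 beat at |f1 - f2|. The helper
        | names and the 7-blade default are assumptions for
        | illustration, not figures from the article.

```python
def blade_pass_hz(rpm: float, blades: int = 7) -> float:
    """Frequency at which blades pass a fixed point, in Hz."""
    return rpm / 60.0 * blades


def beat_hz(rpm_a: float, rpm_b: float, blades: int = 7) -> float:
    """Beat frequency between the blade-pass tones of two fans."""
    return abs(blade_pass_hz(rpm_a, blades) - blade_pass_hz(rpm_b, blades))


# Two nominally identical fans 30 RPM apart produce a slow throb:
print(beat_hz(1200, 1170))  # 3.5 (Hz)
```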
        
       | Arch-TK wrote:
       | With a Noctua NH-D15 and about 30 minutes spent tweaking the
       | motherboard fan curves I was able to get my 5950x to not thermal
       | throttle without producing any noticeable fan noise.
       | 
       | This seems extremely over-engineered and sounds like it could
       | have been solved by using Noctua or similar quiet fans.
        
         | ComputerGuru wrote:
         | The author specifically mentions (multiple times) that they are
         | using Noctua fans...
        
           | asmor wrote:
           | Not exactly the optimal Noctua fan, considering it's a high
           | RPM redux.
        
       | cinntaile wrote:
       | It says 5959x in the article but it should be 5950x.
        
       | pstrateman wrote:
       | This is pretty cool but honestly just setting the pump speed to a
       | constant that's not annoying and setting the fans to a constant
       | that's not annoying is likely to get the same result.
        
       | craftoman wrote:
       | Better cooling with Python, Grafana, Prometheus on top of
       | Kubernetes with enchanted AI. Who needs PID these days?
        
       | belter wrote:
       | Another way to help with cooling..."Energy Efficiency across
       | Programming Languages" - https://greenlab.di.uminho.pt/wp-
       | content/uploads/2017/10/sle...
        
         | moffkalast wrote:
          | Now we just need a follow up that takes the average dev time
          | for each language and shows which one has the optimal ratio
         | based on these two values. It should be possible to draw a
         | curve based on how long and how often the program will run to
         | plot exactly when it makes sense to switch to something more
         | efficient but also more dev intensive. Sort of in this fashion:
         | https://xkcd.com/1205/
         | 
         | E.g. assembly would have very low energy use by itself, but
         | would require an inordinate amount of human energy (~8.7
         | MJ/day) invested to get that end result, making it very
         | inefficient when the whole picture is considered. Unless that
         | code runs everywhere constantly for years of course.
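          | 
          | A toy Python sketch of that break-even point, using the ~8.7
          | MJ/day figure above. The function name and the example inputs
          | (10 extra dev days, 50 J saved per run) are made up purely to
          | show the shape of the calculation.

```python
import math

HUMAN_J_PER_DAY = 8_700_000  # ~8.7 MJ/day, the figure quoted above


def break_even_runs(extra_dev_days: int, joules_saved_per_run: float,
                    human_j_per_day: int = HUMAN_J_PER_DAY) -> int:
    """Runs needed before runtime savings repay the extra human energy."""
    if joules_saved_per_run <= 0:
        raise ValueError("the efficient version must save energy per run")
    extra_dev_joules = extra_dev_days * human_j_per_day
    return math.ceil(extra_dev_joules / joules_saved_per_run)


print(break_even_runs(10, 50.0))  # 1740000
```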
        
         | austinjp wrote:
         | Their 2021 update here:
         | 
         | https://www.sciencedirect.com/science/article/abs/pii/S01676...
         | 
         | Link to full paper via sci-hub:
         | 
         | https://sci-hub.st/10.1016/j.scico.2021.102609
         | 
         | This appears to be a web page where the authors have posted
         | links to their research, data, updates, etc:
         | 
         | https://sites.google.com/view/energy-efficiency-languages/
         | 
         | For transparency, I've posted these here recently:
         | 
         | https://news.ycombinator.com/item?id=39286827
        
       | kelvie wrote:
       | I've been using an esp32-based fan controller with esphome and
       | inlined C++ for my waterloop for a while, with a custom (but
       | super simple) temp control algorithm as well:
       | 
       | https://github.com/kelvie/esphome-config/blob/master/pc-fan-...
       | 
       | The main reason for doing this was so that I didn't have to
       | connect the controller to my main PC via USB to program it (I can
       | change the target points via MQTT/wifi).
       | 
        | Playing around with this stuff on my laptop I've also noticed
        | that you have to be careful what calls you make when querying
        | system status in a loop. Some things (like, weirdly,
        | `powerprofilesctl get`) drain a surprising amount of battery
        | even when called only every 5 seconds, so in a sense your tool
        | may start to affect the "idle" power consumption somewhat, and
        | you need to test that.
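        | 
        | One rough way to test it, sketched in Python: measure the CPU
        | time each invocation of a polled command costs. This is
        | Unix-only, `child_cpu_seconds` is a made-up helper, and it
        | only counts the child process itself, not any daemon the
        | command wakes up, so treat the result as a lower bound.

```python
import resource
import subprocess
import sys
from typing import Sequence


def child_cpu_seconds(cmd: Sequence[str], runs: int = 5) -> float:
    """Average user+system CPU seconds consumed per invocation of cmd."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    for _ in range(runs):
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    used = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return used / runs


if __name__ == "__main__":
    # Placeholder command; substitute whatever your control loop polls.
    cost = child_cpu_seconds([sys.executable, "-c", "pass"])
    print(f"{cost:.4f} CPU-seconds per call")
```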
        
       | selimnairb wrote:
        | Who has time for all this rigamarole? Just use an ARM CPU.
        
       | dvdkon wrote:
       | It's surprising that there are no PC fan controllers that would
       | use some variant of PID control with a temperature target.
       | Traditional fan curves are simple, but the result isn't very
       | intuitive.
       | 
       | And many desktop motherboards manage to screw up even the basic
       | fan curve, offering users control of only two points within
       | strict bounds, no piecewise linear curves or hysteresis settings.
       | 
       | I started a fan controller project some 4 years ago and it's now
       | sadly in limbo, waiting for me to solve power filtering issues
       | for the makeshift power supply it grew into. Maybe I should just
       | limit myself to 4-pin fans...
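        | 
        | For reference, the piecewise linear curve plus hysteresis
        | that many boards lack fits in a few lines of Python. This is
        | only a sketch: the curve points, the 3 degree band and the
        | names `curve_duty` and `HysteresisCurve` are illustrative
        | assumptions, not taken from any real controller.

```python
from bisect import bisect_right

# Hypothetical (temperature in C, duty in %) points, sorted by temperature.
CURVE = [(30.0, 20.0), (50.0, 40.0), (70.0, 100.0)]


def curve_duty(temp_c, points=CURVE):
    """Linearly interpolate the fan duty between neighbouring curve points."""
    temps = [t for t, _ in points]
    if temp_c <= temps[0]:
        return points[0][1]
    if temp_c >= temps[-1]:
        return points[-1][1]
    i = bisect_right(temps, temp_c)
    (t0, d0), (t1, d1) = points[i - 1], points[i]
    return d0 + (d1 - d0) * (temp_c - t0) / (t1 - t0)


class HysteresisCurve:
    """Follow temperature rises at once, but ignore small drops."""

    def __init__(self, points=CURVE, hysteresis_c=3.0):
        self.points = points
        self.hysteresis_c = hysteresis_c
        self._held_temp = None

    def duty(self, temp_c):
        # Update the held temperature on any rise, or on a drop larger than the band.
        if (self._held_temp is None or temp_c > self._held_temp
                or temp_c < self._held_temp - self.hysteresis_c):
            self._held_temp = temp_c
        return curve_duty(self._held_temp, self.points)
```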
        
       ___________________________________________________________________
       (page generated 2024-03-03 23:00 UTC)