[HN Gopher] Since when did SSDs need water cooling?
___________________________________________________________________
Since when did SSDs need water cooling?
Author : LinuxBender
Score : 65 points
Date : 2023-05-27 12:41 UTC (10 hours ago)
(HTM) web link (www.theregister.com)
(TXT) w3m dump (www.theregister.com)
| jeffbee wrote:
| This author is confused, or the article is just badly written.
| The thing that draws all the power in these newer SSDs is the
| controller, not the memory. Shoveling 2 million IOPS to the host
| CPU is a difficult task ... your high-power host CPU can barely
| keep up. But the article goes on and on about the flash and
| hardly mentions the controller.
| justin66 wrote:
| > This author is confused, or the article is just badly
| written.
|
| I see you have discovered _The Register._ Welcome!
| zokier wrote:
| I'd question whether we couldn't make more use of the powerful,
| well-cooled CPU that we already have in our computers instead of
| pushing SSD controller complexity and power ever further. What if
| we used something like UBIFS or F2FS and removed/simplified the
| FTL?
| phire wrote:
| Once you get to this level of performance, the bulk of an SSD
| controller ASIC is essentially just a high-performance
| switching fabric, directing the flow of data between the 8-12
| channels of NAND flash, the PCIe bus, its internal buffers
| and the DRAM.
|
| If you have any experience with high-performance networking
| equipment, you know that pure switching-fabric ASICs
| generate a lot of heat on their own. Hell, even a dumb 5-port
| gigabit ethernet switch generates a surprising amount of
| heat; they are always warm to the touch.
|
| I really doubt that handling the FTL layer on the controller
| adds that much extra power draw. A dumb PCIe <-> NAND
| switching ASIC would also have cooling problems.
| trogdor wrote:
| I recently upgraded my home network to 10 gigabit and was
| surprised by the amount of heat generated by 10 gig
| switches. Why do pure switching fabric ASICs generate so
| much heat?
| nine_k wrote:
| Gates consume electricity when they switch: some energy
| is needed to flip a FET from "open" to "closed" or back.
| Then the gate stays in a particular state for some time,
| allowing the circuit to operate.
|
| The faster you switch a gate, the more often you have to
| pay the switching price, and that price cannot go too low,
| or thermal noise would overwhelm the signal. So you spend
| roughly 10x the energy switching a 10 Gbps stream as you do
| for a 1 Gbps stream. Newer, smaller gates consume less
| energy per switch, but not 10x less.
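|
| To put rough numbers on it (the capacitance and voltage below
| are made up for illustration, not taken from any real SerDes),
| dynamic power scales with toggle rate as P ~ alpha * C * V^2 * f:
|
|       # Back-of-the-envelope dynamic switching power:
|       # P = alpha * C * V^2 * f. All constants are assumptions.
|       def dynamic_power_watts(alpha, cap_farads, volts, freq_hz):
|           return alpha * cap_farads * volts ** 2 * freq_hz
|
|       C = 50e-12   # 50 pF of switched capacitance (assumed)
|       V = 0.9      # core voltage (assumed)
|       for bps in (1e9, 10e9):
|           watts = dynamic_power_watts(0.5, C, V, bps)
|           print(bps / 1e9, "Gbps ->", round(watts, 3), "W")
|       # Power scales linearly: ~0.02 W at 1 Gbps vs ~0.2 W at 10 Gbps.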
| Sakos wrote:
| SSD controllers are just ASICs. I'm not sure why you want to
| go from a chip tailored for a specific task to a general
| purpose one that already has way too much on its plate. Then
| there are things like latency and how the controller abstracts
| away the details of how an SSD works internally. All that
| complexity doesn't go away by putting it on your CPU. You're
| just moving it from one place to another for no benefit, and
| adding other complexities.
|
| You could ask the same thing about all the extra silicon in GPUs
| that adds hardware acceleration for video encoding/decoding.
| jeffbee wrote:
| The point of doing it in software on the host kernel would
| be to allow the flash layer and the filesystem to co-
| evolve, instead of being agnostic at best and antagonistic
| at worst.
| Sakos wrote:
| Who's going to write the code for it? Microsoft? Does
| every manufacturer write their own kernel-level driver?
| What happens to Linux/Unix? I don't want any of these
| manufacturers anywhere near the kernel, or even doing any
| more in software than they already do. Samsung isn't
| exactly known for code quality.
|
| This is a fantasy with questionable benefits at best that
| don't outweigh the downsides.
| jeffbee wrote:
| > Microsoft?
|
| Yes. That seems ideal to me. Microsoft, Apple, open
| source contributors. Today what you have is a closed-
| source translation layer written by the kinds of people
| who write PC BIOSes, i.e. the biggest idiots in the
| software industry. I would be _much_ happier with an OS
| vendor flash storage stack. For all I know, I am already
| using something like that from Apple. And I assure you
| that large-scale server builders like Amazon and Google
| are already doing it this way.
| mistrial9 wrote:
| Calling those layers of people idiots is doing them a
| favor; it excuses the de facto practices as being simply
| dumb. The truth includes a different layer: the business
| side of business, who pays, who gets to do what. Booting your
| own hardware is the subject, and the actors there are not
| the ones that come to mind when thinking of consumer
| advocacy.
|
| The largest companies have other alignments that are not
| often discussed openly.
| [deleted]
| [deleted]
| zokier wrote:
| > don't want any of these manufacturers anywhere near the
| kernel, or even doing any more in software than they
| already do. Samsung isn't exactly known for code quality.
|
| That seems like such a bizarre take. You think it's better
| that the crappy code is given to you as black-box firmware
| with no oversight, rather than in the open, written to
| kernel quality standards, where it can at least
| hypothetically be improved?
| numpad0 wrote:
| Speaking from memory and a coarse-grained understanding:
|
| 1) MLC/TLC/QLC work more like 4/8/16-tone grayscale e-paper
| than like simple flash: e.g. storing 0b1010 means putting the
| cell at one of 16 levels, which is what "4 bits per cell"
| means. And it's not a single pulse of the target voltage into
| the memory cell; it's more like repetitive pulses walking the
| cell from the erased state (0b1111) to just enough millivolts
| below the 0b1011 target (a toy model of that program-and-
| verify loop is sketched below). Readout is probably more
| complicated, let alone lifecycle management. That business
| might be more involved than it's worth a filesystem
| researcher's time.
|
| 2) It was often said, at least years ago, that a
| considerable fraction of the heat in NVMe SSDs comes from PCIe
| serialization/deserialization (SerDes), rather than payload
| data processing or NAND programming.
|
| If both of the above are true, maybe it's PCIe that should be
| replaced, with something more like the original PCI?
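|
| A toy sketch of that program-and-verify idea, purely conceptual;
| the step size, voltages and level map are invented, not taken
| from any datasheet:
|
|       # Toy model of incremental program-and-verify (ISPP) for a
|       # QLC cell. Step size and level voltages are invented.
|       LEVEL_VOLTS = {level: 0.3 * level for level in range(16)}
|
|       def program_cell(target_level, step_v=0.05, start_vt=0.0):
|           # Pulse, then verify, until the cell's threshold voltage
|           # reaches the verify voltage for the target level.
|           vt, pulses = start_vt, 0
|           verify_v = LEVEL_VOLTS[target_level]
|           while vt < verify_v:
|               vt += step_v      # each pulse nudges Vt up a little
|               pulses += 1
|           return pulses, vt
|
|       print(program_cell(0b1011))   # many small pulses, not one big one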
| nine_k wrote:
| The original PCI was parallel. You can't have an
| excessively fast parallel bus, because the tiniest
| differences between lanes make different pins receive the
| signal out of sync. This is why the RAM interface is so hard
| to get right, and why the traces there are kept as short as
| possible.
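|
| For a sense of scale (rough numbers; the propagation speed is a
| typical figure for FR-4 board material, not a measurement), here
| is how little length mismatch it takes to skew a parallel bus by
| a full bit period:
|
|       # How little trace-length mismatch costs a full bit period
|       # on a parallel bus. Assumes ~15 cm/ns propagation in FR-4.
|       PROP_CM_PER_NS = 15.0
|
|       def skew_budget_cm(bit_rate_gbps):
|           bit_period_ns = 1.0 / bit_rate_gbps
|           return bit_period_ns * PROP_CM_PER_NS
|
|       for rate in (0.1, 1.0, 10.0):    # Gbps per pin
|           print(rate, "Gbps ->", skew_budget_cm(rate), "cm")
|       # At 10 Gbps per pin, ~1.5 cm of mismatch is a whole bit.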
| KennyBlanken wrote:
| Because the controller and flash are on the same physical
| device in a very small amount of space, and are at least
| somewhat thermally coupled: not just by the PCB, but by the
| "heat spreaders" often sold with the drive or built into the
| motherboard. Google around and you'll see lots of thermal
| camera images of M.2 drives.
|
| As the article points out, these drives consume up to ~10 W
| under load. That's actually a lot of power for something with
| very, very little thermal mass: around 10 grams, with a
| specific heat capacity of around 400 J/(kg*C) being typical
| for PCBs and chips. At 0.4 J/(g*C), just one second under full
| load, if the heat is generated evenly across the entire
| device, heats it up by 2.5 degrees C. Assuming no cooling,
| that's about 24 seconds until it hits its thermal throttling
| point.
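|
| The arithmetic, spelled out (the mass and heat capacity are the
| rough figures above; the start and throttle temperatures are
| assumptions picked to match the ~24 second estimate):
|
|       # Lumped-mass temperature rise of an uncooled M.2 SSD.
|       POWER_W = 10.0       # sustained load (from above)
|       MASS_G = 10.0        # drive mass (rough guess from above)
|       C_J_PER_G_C = 0.4    # specific heat of PCB + chips (assumed)
|       T_START_C = 25.0     # ambient start (assumed)
|       T_THROTTLE_C = 85.0  # throttle threshold (assumed)
|
|       rise_per_s = POWER_W / (MASS_G * C_J_PER_G_C)       # 2.5 C/s
|       to_throttle = (T_THROTTLE_C - T_START_C) / rise_per_s
|       print(rise_per_s, "C/s,", to_throttle, "s to throttle")  # ~24 s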
|
| From the article:
|
| > The amount of activity taking place on the gumstick-sized M.2
| form factor means higher temps not only for the storage
| controller, but for the NAND flash itself.
|
| > NAND, Tanguy explains, is happiest within a relatively narrow
| temperature band. "NAND flash actually likes to be 'hot' in
| that 60deg to 70deg [Celsius] range in order to program a cell
| because when it's that hot, those electrons can move a little
| bit easier," he explained.
|
| > Go a little too hot -- say 80degC -- and things become
| problematic, however. At these temps, you risk the SSD's built-
| in safety mechanisms forcibly powering down the system to
| prevent damage. However, before this happens users are likely
| to see the performance of their drives plummet, as the SSD's
| controller throttles itself to prevent data loss.
|
| FYI, according to his LinkedIn, Tanguy is the principal product
| engineer at Micron.
| Out_of_Characte wrote:
| Still, with flash memory cells nearing their physical limits
| in lithography, pretty soon you'll need active cooling for
| bigger stacks.
| marcosdumay wrote:
| I doubt one is far from the other.
|
| Shoveling IOPS into a bus is an easily parallelizable problem,
| while NAND-flash memory has a very high theoretical floor on
| its capacitance. Any good engineer would optimize the CPU part
| up to the point where it's only a bit worse than the flash, and
| stop there, because there isn't much to gain from going further.
|
| If that's the case, you will see the CPU being the bottleneck
| on your device, but it's actually the memory that constrains
| the design.
|
| That is, unless the CPU comes from some off-the-shelf design
| that can't be changed due to volume constraints. But I don't
| think SSDs have that kind of low volume.
| zinekeller wrote:
| > That is, unless the CPU comes from some off the shelve
| design that can't be changed due to volume constraints.
|
| Most SSDs (with exceptions like Samsung's) simply use
| Silicon Motion's IP
| (https://www.siliconmotion.com/products/client/detail) for
| their controllers.
|
| > But I don't think SSDs have that kind of low volume.
|
| If a custom design adds a cent or two to the BOM then it
| doesn't matter, but when you need to verify that the changes
| work as intended _and_ that the data isn't corrupted
| (beyond specifications), that's a lot of cents to be saved.
| Plus, Silicon Motion can ask TSMC to fabricate it at a
| lower cost per unit (because there is only one pattern to
| manufacture) than it would cost to customise the controllers
| for each drive.
| wtallis wrote:
| You're vastly overestimating Silicon Motion's market share.
| Samsung, Micron, Western Digital, SK Hynix(+Intel), and
| Kioxia all use in-house SSD controller designs for at least
| some of their product line. Among second-tier SSD brands
| that don't have in-house chip design or fabrication, Phison
| is dominant for high-performance consumer SSDs.
|
| Speaking about SSD controllers in general: they _do_ use
| off-the-shelf ARM CPU core designs (eg. Cortex-R series),
| but those are usually the least important IP blocks in the
| chip. The ARM CPU cores are mostly handling the control
| plane rather than the data plane, and the latter is what is
| performance-critical and power-hungry when pushing many GB/s.
| frou_dh wrote:
| Design priorities probably get warped by the need to do well at
| the artificial benchmarks/torture-tests that come to the fore in
| reviews.
| wtallis wrote:
| Absolutely. CrystalDiskMark is bad for the consumer SSD market.
| tinus_hn wrote:
| Unfortunately the alternative is trusting the manufacturers
| data which leads to cheating.
| wtallis wrote:
| The choices aren't exactly between a bad benchmark and no
| benchmark at all. And the widespread use of CrystalDiskMark
| as a _de facto_ standard by both independent testers and
| drive vendors has done nothing to slow the rise of behavior
| that an informed consumer would consider to be cheating.
| ksec wrote:
| There were early PCI-E 5.0 SSD samples pushing close to the
| theoretical max of 16 GB/s, but they were consuming up to 25 W.
| The current PCI-E 5.0 drives only reach about ~11 GB/s, but they
| stay within a 12 W power envelope.
|
| I do wonder if we have hit the law of diminishing returns. With
| games optimised for the PS5's storage system and Xbox's
| DirectStorage, developers are already showing that 80-95% of
| load time is spent on the CPU.
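|
| For reference, the ~16 GB/s ceiling falls straight out of the
| link math for a PCIe 5.0 x4 device (a quick sanity check that
| ignores everything beyond line encoding):
|
|       # Theoretical PCIe 5.0 x4 bandwidth: 32 GT/s per lane,
|       # 128b/130b line encoding, before packet/protocol overhead.
|       GT_PER_S = 32e9
|       LANES = 4
|       ENCODING = 128 / 130
|
|       gbytes_per_s = GT_PER_S * ENCODING * LANES / 8 / 1e9
|       print(round(gbytes_per_s, 2), "GB/s")   # ~15.75 GB/s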
| swarnie wrote:
| What's the issue with a jump from 12w to 25w?
| ilyt wrote:
| The M.2 sockets on motherboards don't exactly have great
| cooling, and SSDs don't usually even come with a heatsink.
|
| The positioning can also be pretty iffy: mine has one next
| to the CPU, another just under the GPU (no chance of getting
| a fan in there), and those are the "fast" (directly connected
| to the CPU) ones!
|
| Another two slots are again under the GPU (one filled with a
| wifi/bt card), and only the last two are far away from other
| hot components and get their own heatsink, but those are not
| directly connected to the CPU.
| kdmytro wrote:
| I don't think this would be a factor. Some PCI-E 4.0 SSDs
| already come with metal heatsinks out of the box. If future
| SSDs need additional cooling, this will be communicated
| to the buyer.
|
| I think the bigger question is whether 25 W can be
| physically supplied to the drives by contemporary
| motherboards. What is the power limit for the M.2 ports?
| ilyt wrote:
| At least according to Wikipedia, each pin is rated up to
| 0.5 A, with (I think) nine 3.3 V pins, so technically just
| around ~15 W peak.
|
| Technically that's what U.2 (the 2.5-inch form factor for
| SSDs) would be for.
|
| They get 5 V/12 V and a thicker connector; I severely doubt
| M.2 could swing 25 W, as it only has 3.3 V on it.
| wtallis wrote:
| We're talking about an SSD form factor that's 22x80mm and is
| fed by a couple of card edge pins carrying 3.3V. 12W was
| already pushing it.
| formerly_proven wrote:
| According to the one-page datasheet of a Foxconn M.2 M-key
| socket, maximum current per pin is 0.5 A (they're tiny,
| after all). Since M.2 M-key has a total of nine pins
| carrying 3.3 V, this would limit power to 15 W before any
| heat dissipation considerations, plus connector derating
| because the toasty SSD is heating the connector up.
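|
| The same budget, worked out (the per-pin rating and pin count
| are the figures quoted above; the derating factor is an
| arbitrary illustration):
|
|       # M.2 M-key 3.3 V power budget from per-pin current limits.
|       PINS_3V3 = 9           # 3.3 V supply pins (from above)
|       AMPS_PER_PIN = 0.5     # connector rating (from above)
|       VOLTS = 3.3
|
|       raw_w = PINS_3V3 * AMPS_PER_PIN * VOLTS     # ~14.85 W
|       derated_w = raw_w * 0.8                     # assumed 20% derating
|       print(raw_w, "W raw,", round(derated_w, 2), "W derated")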
| [deleted]
| KennyBlanken wrote:
| ...on a device that weighs about 10 grams, with a specific
| heat capacity likely around 0.4 J/(g*C).
|
| 10 W into such a device, if I did the math right, is around a
| 2.5 C/sec rise in device temperature.
| londons_explore wrote:
| As soon as SSDs are faster, developers will find ways to waste
| more space and do more IO operations...
| j16sdiz wrote:
| For those who don't read the article:
|
| > "NAND flash actually likes to be 'hot' in that 60deg to 70deg
| [Celsius] range in order to program a cell because when it's that
| hot, those electrons can move a little bit easier," he explained.
| ... Go a little too hot -- say 80degC -- and things become
| problematic
| londons_explore wrote:
| Sure... erasure energy is lower when it's hot... But there are
| lots of other downsides, like a much reduced endurance, and
| more noise in sense amplifiers meaning there is a higher chance
| of needing to repeat read operations.
| rowanG077 wrote:
| I don't understand why pushing 16 GB/s requires so much power. A
| fully custom IC where the data path is in silicon should be able
| to handle that speed, no sweat.
| wtallis wrote:
| SSD controllers aren't just moving a lot of data. Between the
| PCIe PHY and the ONFI PHY there's a lot of other functionality.
| In particular, doing LDPC decoding at 16GB/s (128Gb/s) is not
| trivial.
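|
| To give a flavour of why: even the simplest hard-decision LDPC
| decoding is an iterative check-and-flip loop per codeword. The
| toy parity-check matrix and single-bit error below are invented
| for illustration; real controllers run far longer codes with
| soft-decision decoders in dedicated hardware, not Python:
|
|       import numpy as np
|
|       # Toy parity-check matrix: 3 checks over 7 bits.
|       # Real LDPC codes in SSDs are vastly larger.
|       H = np.array([[1, 1, 0, 1, 1, 0, 0],
|                     [1, 0, 1, 1, 0, 1, 0],
|                     [0, 1, 1, 1, 0, 0, 1]], dtype=np.uint8)
|
|       def bit_flip_decode(received, H, max_iters=20):
|           # Hard-decision bit flipping: flip the bit that sits in
|           # the most failed parity checks, until all checks pass.
|           word = received.copy()
|           for _ in range(max_iters):
|               syndrome = H.dot(word) % 2
|               if not syndrome.any():
|                   return word          # valid codeword
|               fails = H[syndrome == 1].sum(axis=0)
|               word[np.argmax(fails)] ^= 1
|           return word
|
|       # All-zero codeword with a single bit error at position 2.
|       noisy = np.array([0, 0, 1, 0, 0, 0, 0], dtype=np.uint8)
|       print(bit_flip_decode(noisy, H))  # -> all zeros again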
| andromeduck wrote:
| ECC/Crypto is pretty energy intensive - other bookkeeping like
| wear leveling and r/w disturb is also quite complicated.
| formerly_proven wrote:
| One of the issues with M.2 in desktop PCs is how buried from the
| airflow they are, and often they're literally on the exhaust side
| of the GPU (many GPUs exhaust on both long edges, and many
| motherboards just so happen to have an M.2 slot under the PEG) or
| in the air-flow dead-zone between PEG and CPU cooler.
|
| Overall the AT(X) form factor, with extension cards slotting in
| at a 90deg angle, just doesn't work all that well for efficient
| heat removal. DHE takes away I/O slot space and requires high
| static pressures (so high fan RPMs), it works for headless
| servers, but that's about it. The old-fashioned way of a
| backplane and orthogonal airflow does work much better for stuff
| like this; but it also requires a card cage and is not very
| flexible in terms of card dimensions. The one saving grace of ATX
| is that cards and their cooling solutions can grow in length and
| height, GPUs are much taller than a normal full-height card, and
| many are much longer than a full-length card is supposed to be as
| well.
| jackmott42 wrote:
| This isn't really an issue, because everyone's M.2 is working
| fine. You have to construct absurd scenarios to cause problems:
| use a case with bad airflow, a hot GPU, and a workload that
| pushes the GPU and M.2 and CPU to their limits indefinitely,
| which isn't a real-life thing.
|
| And if it IS a real-life thing because you have some special
| use case, you use a case with good airflow.
| zamadatix wrote:
| I think this depends on what the definition of fine and
| problems is. IMO most don't even notice when their drive
| throttles due to thermals so it's probably fine that the
| drives get hot. At the same time, as newer drives keep
| drawing more and more power, this is going to start to push
| the limits of "well why did I buy the fast drive in the first
| place" if they didn't come with these ever increasing cooling
| solutions as well.
| numpad0 wrote:
| People are buying gaming branded PEG propping sticks and
| sustainer wires because high end PEGs are sagging, and neither
| the case nor the card support that front slot for full length
| cards. It's well past the time for a card cage spec as far as I
| can see from the user perspective.
| alberth wrote:
| > _"Overall the AT(X) form factor, with extension cards
| slotting in at a 90deg angle, just doesn 't work all that well
| for efficient heat removal."_
|
| To give an example of this, here's a server from a huge cloud
| provider for a brand new AMD 7700 on an ATX board.
|
| Those 90deg angles make for horrible airflow.
|
| https://twitter.com/PetrCZE01/status/1637122488025923585
| ilyt wrote:
| That looks more like "you put the power supply on the wrong side".
|
| But yeah, most servers have risers that flip the cards to be
| parallel to the board.
| mordae wrote:
| Yeah, but X470D4U and similar boards are so overpriced one
| can somewhat relate to people using gaming boards for
| servers. Especially since a lot of them route ECC pins
| nowadays.
|
| I sure wasn't happy paying extra just to have a different
| board layout with mostly the same components.
|
| Well, there's IPMI at least. Still not worth the price tag.
| dx034 wrote:
| It should be clear that this was only shot for marketing
| purposes. I don't think they actually run cables like that,
| but it probably looked better to have cables visible in the
| picture.
| alberth wrote:
| Do you know this for a fact?
|
| Or are you speculating.
| hgsgm wrote:
| That looks like a ribbon cable blocking the fan?
| Asooka wrote:
| I'm water cooling both the CPU and GPU in my PC and have found
| out that leads to virtually no airflow over the m.2 slot. For
| now I've simply placed a fan aimed directly at the slot on top
| of the GPU, and that keeps the SSD at 50 to 60C. I am
| considering installing a water block on the SSD when I do
| maintenance next.
| CrimsonRain wrote:
| None of it matters, because SSDs like being hot.
| ilyt wrote:
| Sure, if you don't like your data
| coldtea wrote:
| In the same way that an ice cream is too cold and could use
| some heat
| jmclnx wrote:
| >While NAND flash tends to prefer higher temperatures there is
| nothing wrong with running it closer to ambient temperatures
|
| New one on me :) I did not know NAND liked to be hot; if true,
| that does not bode well for laptops or for over-clockers.
|
| To me, the end result seems to be: yes and no, up to you. But I
| still prefer HDDs anyway; I am very old school.
| jackmott42 wrote:
| You do not actually prefer HDD, and nothing bodes badly for
| overclockers or laptops. You are just looking for ways to be
| contrarian.
| detrites wrote:
| Preferring a storage medium for its reliability regardless of
| the number of writes it endures is utility - I could see how
| that might be preferable to an SSD in some specific cases.
| Maybe there are other upsides, e.g. it's often much cheaper.
|
| Regardless, some people drive an old, dangerous, slow, gas-
| guzzling car - and maintain it at great expense - just
| because they prefer it. Aesthetic and sentimental appeal is
| highly personal and knows no bounds.
| seized wrote:
| Except hard drives aren't celebrated for reliability. Or
| speed. Or low latency. Or durability (try knocking one). Or
| power and heat. Old cars you can at least make some
| arguments for... Hard drives as primary storage/boot, no.
|
| Really they're good for bulk storage. And that's it. For
| use in primary compute they're really great if you want to
| slow everything down.
| hulitu wrote:
| AFAIK compared with SSDs they are better (reliability).
|
| And running any electronic component hot is just asking
| for trouble.
| fuzzfactor wrote:
| >hard drives aren't celebrated for reliability.
|
| With no revelry whatsoever my 2006 early SATA Maxtor
| 100GB HDD is still going strong with Windows 11 on a Dell
| Vista PC.
|
| It boots no slower than the 2-year-old SSD W10 PCs our IT
| guys have at the office.
| justsomehnguy wrote:
| > AFAIK compared with SSDs they are better (reliability).
|
| Depends on the price point.
|
| Just days ago a PM1725 gave us trouble. Yet five WD10JUCTs
| I bought recently (in RAID 5) beat it on price and available
| capacity, even with their abysmal performance.
|
| >any electronic component hot is just asking for trouble
|
| I'd say running _too hot_.
| EscapeFromNY wrote:
| What's wrong with HDD? It's actually quite convenient having
| time for your morning jog and a shower while you wait for
| your computer to boot up.
| consp wrote:
| As a bonus, if it's really old, it sounds like an old
| gravity-fed drip coffee machine.
|
| On a heat note: a spinning rust disk also draws quite a few
| watts, every time, all the time. Power-wise, the high-wattage
| SSDs are still less power-hungry over time.
| pmontra wrote:
| A 1 TB 2.5" HDD I attached to an Odroid consumes little
| more than 1 W. A 3.5" 2 TB one consumes 10 W. I turn them
| off in software when I don't need them. They are backup
| storage.
| pixl97 wrote:
| Nothing at all, I like to be able to count my IOPS on my
| fingers.
| coldtea wrote:
| Please tell us more about the psychology of the parent
| commenter.
|
| You seem to have studied it quite well, or perhaps find that
| ad-hominems make for the best arguments!
| formerly_proven wrote:
| NAND is bad for cold storage, because writes cause more wear
| when it's not warm. Meanwhile data retention benefits from
| lower temperatures.
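|
| Roughly quantifying that last point with the usual Arrhenius
| rule of thumb (the activation energy is an assumption, not a
| vendor figure for any particular flash):
|
|       # Arrhenius rule of thumb for retention loss vs storage temp.
|       # The 1.1 eV activation energy is an assumed, commonly quoted
|       # figure, not a number for any particular flash part.
|       import math
|
|       K_EV = 8.617e-5   # Boltzmann constant, eV/K
|       EA_EV = 1.1       # assumed activation energy
|
|       def acceleration(t_cold_c, t_hot_c):
|           t1, t2 = t_cold_c + 273.15, t_hot_c + 273.15
|           return math.exp((EA_EV / K_EV) * (1 / t1 - 1 / t2))
|
|       print(round(acceleration(30, 40), 1))   # ~4x faster loss at 40 C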
| adgjlsfhk1 wrote:
| cold storage by its nature doesn't have a lot of writes...
| flaminHotSpeedo wrote:
| And if there was a hot (as in frequently used) drive it
| would still heat up (as in temperature).
|
| But I guess the other commenter's point might be valid if
| you run a datacenter in a blast chiller
| theknocker wrote:
| [dead]
| valine wrote:
| Does anyone know of a PCIE 5 SSD designed for sustained
| read/write? Most of these new drives are meant for short bursts
| of data transfer; I can't imagine water cooling would be
| necessary for most drives.
| matja wrote:
| Kioxia CD8: https://www.storagereview.com/news/kioxia-cd8-series-pcie-5-...
___________________________________________________________________
(page generated 2023-05-27 23:02 UTC)