[HN Gopher] Marvell Announces First PCIe 5.0 NVMe SSD Controller...
___________________________________________________________________
Marvell Announces First PCIe 5.0 NVMe SSD Controllers: Up to 14
GB/s
Author : ksec
Score : 198 points
Date : 2021-05-27 14:01 UTC (8 hours ago)
(HTM) web link (www.anandtech.com)
(TXT) w3m dump (www.anandtech.com)
| harveywi wrote:
| Marvell? I thought that DC had all the Flash controller rights.
| FBISurveillance wrote:
| I see what you did there.
| deadcore wrote:
| lol that made me chuckle
| louwrentius wrote:
| Try to imagine with all these developments how much performance
| you can get from vertical scaling.
|
  | I bet that many applications never need to care about horizontal
  | scaling, with all the burdens it involves, before they outgrow
  | the performance of a 'single' box (a la Stack Overflow).
| PedroBatista wrote:
| That ~10W tho ...
|
| I think right now the real money is in somebody who can implement
| PCI-E 5 efficiently, or soon we'll see every SSD with a mini fan
| on it.
|
  | (It's not all PCI-E 5's fault; these controllers have been
  | getting more and more power hungry.)
| wtallis wrote:
| You can always wish for more efficiency, but it's important to
| understand that SSDs _have_ actually been getting steadily more
  | efficient in the Joules per Byte transferred sense. We're just
| seeing a simultaneous concentration and consolidation of
| performance that is moving a bit faster, hence the need for
| enterprise SSDs to abandon the M.2 form factor in favor of the
| somewhat larger and easier to cool EDSFF family.
|
| One way of looking at things is to realize that until now, a
| storage controller handling 14 GB/s of traffic would have been
| a full-height RAID card demanding 300 LFM of airflow, and now
  | we're putting that much throughput into an SSD with a heatsink
| that's roughly 1U by 1in in cross section.
| bmcahren wrote:
  | With the location of modern NVMe slots, you can't even exceed
  | 500 MB/s without the controller rate-limiting due to heat.
| PinkPigeon wrote:
| I mean, I love the insane GB/sec figures, but does anyone else
| mostly care about IOPS? These state 1.8M read and 1M write, which
| sounds quite impressive.
| CoastalCoder wrote:
| Related: anyone know of a good video or diagram for helping CS
| students get an intuition regarding the interplay of bandwidth
| and latency? Including how saturating bandwidth increases
| latency by causing queueing bottlenecks?
|
| I'm looking for something a bit more visual and dynamic than
| the old "station wagon full of tapes going down the highway"
| imagery.
|
| [EDIT: Just for clarification, I feel like I already have a
| pretty good grasp on these concepts. I'm looking for good ways
| to help others at the ~ undergrad level.]
| bombcar wrote:
| Here's a post on relative latencies that may be useful:
| https://danluu.com/infinite-disk/
|
| There was another post I saw recently comparing the increase
| in disk SIZE over the last 30 years vs the increase in disk
| SPEED vs LATENCY (so size in GB, speed in GB/s, latency in
| IOPS) - and how size increases far outstripped speed which
| outstripped latency, though all had improved.
|
| Found it! The key is IOPS/GB as a metric.
|
| https://brooker.co.za/blog/2021/03/25/latency-bandwidth.html
| louwrentius wrote:
  | Maybe this doesn't answer your question exactly, but I
  | addressed this topic in two blog posts; maybe they help.
|
| https://louwrentius.com/understanding-storage-performance-
| io...
|
| https://louwrentius.com/understanding-iops-latency-and-
| stora...
| wtallis wrote:
| I do some rather coarse measurements of random read
| throughput vs latency as part of my SSD reviews. See eg. the
| bottom of https://www.anandtech.com/show/16636/the-inland-
| performance-...
|
| Those graphs cut off the essentially vertical latency spike
| that results from enqueuing requests faster than the
| sustained rate at which the drive can serve them. For a
| different view in terms of queue depth rather than
| throughput, there are some relevant graphs from an older
| review that predates io_uring:
| https://www.anandtech.com/show/11930/intel-optane-ssd-
| dc-p48...
|
| Generally speaking, latency starts increasing long before
| you've reached a drive's throughput limit. Some of this is
| inevitable, because you have a relatively small number of
| channels (eg. 8) and dies to access in parallel. Once you're
| up to the throughput range where you have dozens of requests
| in flight at a time, you'll have constant collisions where
| multiple requests want to read from the same
| plane/die/channel at once, and some of those requests have to
| be delayed. But that's mostly about contention and link
| utilization between the SSD controller and the NAND flash
| itself. The PCIe link is pretty good about handling
| transactions with consistently low latency even when on
| average it's mostly busy.
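  |
  | For teaching purposes, a rough M/M/1 sketch (purely made-up
  | numbers; a real drive serves many requests in parallel, so only
  | the shape of the curve is meaningful) captures that
  | hockey-stick behaviour:
  |
  |   # Hypothetical drive that sustains 1,000,000 IOPS; mean time
  |   # in system for an M/M/1 queue is 1 / (mu - lambda).
  |   SERVICE_RATE = 1_000_000  # IOPS (invented figure)
  |
  |   for load in (0.10, 0.50, 0.80, 0.90, 0.95, 0.99):
  |       arrival = load * SERVICE_RATE             # offered IOPS
  |       latency = 1.0 / (SERVICE_RATE - arrival)  # seconds in system
  |       in_flight = arrival * latency             # average requests in flight
  |       print(f"{load:4.0%} load: {latency * 1e6:7.2f} us mean latency, "
  |             f"{in_flight:6.2f} requests in flight")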
| uyt wrote:
| Are you referring to Little's Law?
| MrFoof wrote:
| >... _but does anyone else mostly care about IOPS_
|
| IOPS helps, but for the average user, hundreds of thousands is
| already functionally infinite. What matters at this point is
| latency. Where you really feel that is Queue Depth 1. Where you
| read a file that points you to other files, that point you to
| other files, etc. That is the exact case where the computer is
| still making you wait.
|
| This happens when you start your operating system, it starts
| services, you launch apps, etc. Driving that latency down is
| the biggest improvement you'll ever see past where we are today
| in terms of IOPS and throughput.
|
| This is where the latest Optane actually shines. Optane doesn't
| win on IOPS or throughput, but where it shines is its crazy
| latency _(delivered at relatively low power levels)_. Where
  | latencies are 10% of those of even the highest-end PCIe 4.0 NVMe
  | SSDs. Do something like launch 20 applications at once, and
  | it'll be done in a fraction of the time compared to even
  | something like a Samsung 980 Pro, because latency is more
  | around 10 us instead of 100 us.
|
  | PCIe 5.0 SSDs will cut latencies down to where Optane is today,
  | but driving latency under 1 us is where we'll get into a new
  | level of crazy.
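  |
  | To put rough, made-up numbers on that chain-of-dependent-reads
  | effect (latencies only in the ballpark of the figures above):
  |
  |   # Each read depends on the previous one's result, so queue
  |   # depth stays at 1 and the latencies simply add up.
  |   READS = 20_000  # dependent reads during a big app launch (invented)
  |
  |   for name, lat_us in (("NAND NVMe SSD", 100),
  |                        ("Optane", 10),
  |                        ("sub-microsecond drive", 0.8)):
  |       print(f"{name:>21}: {READS * lat_us / 1e6:6.2f} s waiting on reads")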
| jiggawatts wrote:
| I can't upvote this enough.
|
| Related: Notice how the public cloud marketing material tends
| to focus on scalability over other metrics? That's because
  | scaling horizontally for them is _easy_: They just plop down
| more "stamps" -- a set of clusters and controllers that is
| their unit of scale. Need 1,000 more servers in US East? Plop
| down 10 more stamps of 100 servers. Easy!
|
| Except of course this involves an awful lot of networking,
| with long cable runs and many hops and virtualisation layers.
| The end result is that you can't get _anywhere_ near the
| underlying storage latency.
|
| Azure's Premium SSD has "write flush" latencies of about 4
| milliseconds according to my measurements, which is easily
| 100x slower than what my laptop can do with a now very
| outdated Samsung NVMe SSD.
|
| Notice that if you go to their marketing page, they talk
| about "low latency" and "lowest latency", but they have _no
  | numbers?_ Meanwhile the MB/s and IOPS are stated with
| numbers: https://azure.microsoft.com/en-
| us/pricing/details/managed-di...
| Dylan16807 wrote:
| Isn't launching 20 applications _at once_ the realm where
| flash competes the best?
| WesolyKubeczek wrote:
| 1) is the actual NAND flash faster, or are we talking about going
| from "awesome" to "abysmal" as soon as you run out of DRAM/SLC
| caches or what have you? Which, given this kind of bandwidth, is
| going to be sooner rather than later.
|
| 2) this plus QLC cells, which, durability-wise, make TLC look
| good, makes me anticipate headlines like "This new PCIe 5.0 SSD
| ran out of its rated TBW in a week!"
| NullPrefix wrote:
| Chia is a godsend for storage prosumers. All you have to do is
| look at the speed and check if the warranty is voided by Chia.
| wtallis wrote:
| These controllers are for enterprise drives where SLC caching
| is almost unheard of and all the performance specs are for
| sustained performance on a full drive. But the best performance
| may only be attainable on a drive with higher capacity than you
| can afford.
| derefr wrote:
| > Which, given this kind of bandwidth, is going to be sooner
| rather than later.
|
| I don't see why -- like NAND up until now, it keeps up by
| adding more separately-addressable chips to the board and
| striping writes across them. A 512GB SSD with this chip would
| hit the wall pretty soon, but they wouldn't waste this
| controller on a 512GB SSD. They'd use it for 16TB+ SSDs.
| vbezhenar wrote:
| > "This new PCIe 5.0 SSD ran out of its rated TBW in a week!"
|
  | If this drive supports the claimed 9 GB/s write speed, you can
  | write 324 TB in 10 hours. Samsung's QLC warranty for a 1 TB
  | drive is 360 TBW.
| wtallis wrote:
| 1 TB of QLC can't get anywhere close to 9 GB/s write speed.
| The best is currently about 40 MB/s for a 1 Tbit die, so 320
| MB/s for 1 TByte. The slow write speed of QLC generally
| prevents you from burning out a drive in less than a few
| weeks.
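  |
  | Back-of-the-envelope, using only the figures quoted in the two
  | comments above (a 360 TBW rating, 9 GB/s claimed interface
  | speed vs ~320 MB/s sustained QLC writes):
  |
  |   TBW = 360e12  # bytes of rated write endurance
  |
  |   for label, rate in (("9 GB/s (interface limit)", 9e9),
  |                       ("320 MB/s (sustained QLC)", 320e6)):
  |       hours = TBW / rate / 3600
  |       print(f"{label:>24}: {hours:7.1f} h ({hours / 24:5.1f} days)")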
| NikolaeVarius wrote:
| Is there a theoretical "practical" limit for how fast these
  | things can physically get (in this particular form factor)?
|
  | My understanding is that the NVMe interface is pretty much as close
| as you can get to the CPU without being integrated. Is there a
| world where these things can operate as fast as RAM?
| dragontamer wrote:
| > My understanding is that NVMe interface is pretty much as
| close as you can get to the CPU without being integrated.
|
| The best for consumer technology, yes. But future I/O protocols
  | continue to improve.
|
| NVidia / IBM's collaboration on OpenCAPI (which is deployed in
  | Summit as a CPU/GPU interface) has 300 GB/s of I/O between the CPU
| and GPU, far faster than NVMe speeds (and even DDR4 RAM
| bandwidth).
|
| And future chips may go even faster. I/O is probably one of the
| fastest growing aspects of a modern computer. PCIe 5.0, CXL,
| OpenCAPI, etc. etc. Lots and lots of new technology coming into
| play here.
|
  | There are even some products that make flash-like storage work
  | on the DDR4 interface; non-volatile memory is what that's
  | called. Intel's Optane works pretty well in that role. It's not
  | very mainstream, but I hear reports that it's impressive
  | (slower than real RAM of course, but storage that has the
  | bandwidth of RAM is still cool).
|
| > Is there a world where these things can operate as fast as
| RAM?
|
| Well... yesish. Flash is a kind of RAM (random access memory).
|
| To answer your fundamental question though: No. Flash is
| fundamentally higher latency than DRAM (aka: DDR4). But with
| enough parallelism / big enough flash arrays (or DRAM arrays),
| you can continue to get higher and higher bandwidth.
|
  | At the top of the line is SRAM, the RAM used to make L1 caches
  | and registers inside CPUs. It is very expensive and only used
  | in small quantities.
|
| --------
|
| Then you've got various non-mainstream RAMs: FRAM, MRAM,
| Optane, etc. etc.
| baybal2 wrote:
  | Flash chips themselves can't operate as fast as RAM. You can
  | get as much bandwidth as RAM, but you cannot get it as
  | physically fast as RAM, i.e. with latency below ~200 ns.
| anarazel wrote:
| Why? DMA can do transfers to/from CPU caches.
| baybal2 wrote:
  | NAND flash cells themselves can't charge/discharge any faster.
| m4rtink wrote:
| What about those non volatile RAM technologies (IIRC called
| 3D XPoint) Intel is using for their Optane stuff ?
|
| It seems to be kinda in between RAM and flash spec wise.
| Der_Einzige wrote:
| I actually was one of the people who did performance
| benchmarking of 3D XPoint before it came out. In app direct
  | mode, you can maybe eke out 70% of the throughput, at about
  | 3x worse latency. Also, not all apps support app direct
| mode.
|
| Many customers try to use 3D XPoint as another part of the
  | cache hierarchy in between regular SSDs and RAM. It's
  | actually pretty neat for FaaS workloads, which want
| containers to be "warm" rather than "hot" or "cold"...
| wtallis wrote:
| To clarify for readers who aren't current on all the
| lingo: app direct mode refers to one of the modes for
| using the Optane persistent memory modules that connect
| to the CPU's DRAM controller. It doesn't apply to the
| Optane NVMe SSDs that use PCI Express and require the
| software overhead of a traditional IO stack. In a few
| years, something like CXL may allow for something close
| to the app direct mode to be usable on persistent memory
| devices with an SSD-like form factor, but we're not there
| yet.
| the8472 wrote:
| Optane comes in two flavors. As NVMe storage and as NVDIMM.
| The former sits a bit below flash in terms of latency. The
  | latter sits a bit above DRAM in terms of latency
| and is byte-addressable.
| baybal2 wrote:
  | Intel-Micron tried; we've seen how it went.
  |
  | It wasn't really fast or wear-resistant enough to replace
  | flash.
| LinAGKar wrote:
| Are you sure? That's not what I've heard. It's just
| really expensive, and not very dense.
| baybal2 wrote:
  | Rephrasing: the speed and write endurance weren't
  | jaw-dropping enough to beat flash's density and cost.
| gameswithgo wrote:
  | Just put a battery on RAM and you have an SSD as fast as RAM,
  | in principle.
|
| The big thing about SSDs is that while sequential reads and
| writes have steadily improved, many workloads have not. From a
| cheap SSD to the best, there is very minimal difference in
| things like "how fast does my computer boot", "how fast does my
| game or game level load" and "how fast does visual studio
| load", or "how fast does gcc compile"
| programmer_dude wrote:
| A battery is not a solid state device. It often has
| liquids/gels (electrolyte) in it.
| toast0 wrote:
| > From a cheap SSD to the best, there is very minimal
| difference.
|
  | I've got a cheap SSD that will change your mind; it has
  | amazingly poor performance. Though I agree with the general
  | concept that while quantitative differences can be measured,
  | there's not a qualitative difference between competent SSDs.
| kmonsen wrote:
  | Is your second paragraph really true? These exact workloads
  | are the reasons many have upgraded their SSDs; the PS5 will
  | only let you start new games from its built-in fast drive,
  | for example.
| topspin wrote:
| "Is your second paragraph really true?"
|
| Partially, on a PC. The difference, on a PC, between a high
| end NVMe SSD and a SATA SSD is minimal for most use cases;
| small enough that an average user won't perceive much
| difference. The workloads in question (booting, loading a
| program, compiling code, etc.) involve a lot of operations
| that aren't bound by IO performance (network exchanges,
| decompressing textures, etc.) and haven't been optimized to
| maximize the benefit of high performance storage so the
| throughput and IOPS difference of the storage device don't
| entirely dominate.
| wincy wrote:
  | The PS5's operating system and even how they package game
| files had to be developed from the ground up to realize
| these gains. There's a good technical talk about it from
| one of Sony's engineers.
|
| https://m.youtube.com/watch?v=ph8LyNIT9sg
| emkoemko wrote:
  | So Sony made their own OS for the PS5? They are not using
| FreeBSD anymore?
| wtallis wrote:
| They might still be using BSD stuff, but they developed
| new IO and compression offload functionality that doesn't
| match any off the shelf capabilities I'm aware of.
| touisteur wrote:
| Might be close to spdk + compress accelerators?
| wtallis wrote:
| The PS5 and Xbox Series X were designed to offer high-
| performance SSDs as the least common denominator that game
| developers could rely on, so that game devs could start
| doing things that aren't possible if you still have to
| allow for the possibility of running off a hard drive.
| That's still largely forward-looking; most games currently
| available still aren't designed to do that much IO--but the
| consoles will treat all new games as if they really rely on
| that expectation of high performance IO, and that means
| running them only from the NVMe storage.
| blackoil wrote:
  | Are these things blocked on disk I/O?
| vbezhenar wrote:
  | I second that. I have an M.2 SSD in my laptop and a SATA SSD
  | in my desktop. There's no perceived difference in disk
  | operations for me outside of corner cases like copying a huge
  | file. But there's a very noticeable difference in price.
| reader_mode wrote:
  | Build times should be better, no? Unless you have enough
  | RAM to keep it all in cache?
| staticassertion wrote:
| That may be due to the fact that your software (like your
| OS) was built for a world of slow spinning disks, or maybe
| semi-slow SSDs at best. Not a lot of code is written with
| the assumption that disks are actually fast, and it's not
| too common to see people organizing sequential read/writes
| in their programs (except for dbs).
| programmer_dude wrote:
  | +1. I installed an M.2 SSD in my desktop after looking at the
  | quoted 10x difference in speed, but I was taken aback by the
  | lack of any perceivable improvement. Money down the drain, I
  | guess.
| prutschman wrote:
| Do you have an NVMe M.2 device specifically? The M.2 form
  | factor supports both NVMe and SATA (though any given device
  | or slot may support only one of the two).
|
| I've got a workstation that has both SATA 3 and M.2 NVMe
| SSDs installed.
|
| The SATA 3 device can do sustained reads of about 550
| MB/sec, fairly close to the 600 MB/sec line rate of SATA 3.
|
| The NVMe device can do about 1.3 GB/sec, faster than
| physically possible for SATA 3.
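  |
  | If you want to reproduce that kind of sustained-read number
  | without the page cache getting in the way, a minimal Linux-only
  | sketch (the path and sizes are placeholders to adjust for your
  | system) looks something like this:
  |
  |   import mmap, os, time
  |
  |   PATH = "/path/to/large_test_file"  # placeholder; filesystem must support O_DIRECT
  |   CHUNK = 4 * 1024 * 1024            # 4 MiB per read
  |   TOTAL = 4 * 1024**3                # stop after 4 GiB
  |
  |   fd = os.open(PATH, os.O_RDONLY | os.O_DIRECT)  # bypass the page cache
  |   buf = mmap.mmap(-1, CHUNK)  # page-aligned buffer, as O_DIRECT requires
  |   done, start = 0, time.perf_counter()
  |   while done < TOTAL:
  |       n = os.preadv(fd, [buf], done)
  |       if n <= 0:
  |           break
  |       done += n
  |   elapsed = time.perf_counter() - start
  |   os.close(fd)
  |   print(f"{done / elapsed / 1e9:.2f} GB/s sequential read")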
| vbezhenar wrote:
| Yes, I have an NVMe M.2 device. Samsung 970 EVO Plus.
| branko_d wrote:
| Similar effect existed for the HDDs of old.
|
| As the platter density increased, the head could glide over
| more data at the same spindle speed and you would get
| increasingly higher sequential speed. But you would _not_ get
| much better latency - physically moving the head to a
| different track could not be done significantly faster. With
| each new generation, you would get only marginally better
| speed for many workloads that users actually care about. The
  | latest HDDs could saturate the SATA bus with sequential reads,
  | but would still dip into KB/s territory (not MB/s, let alone
  | GB/s) for sufficiently random-access workloads.
|
| SSDs are similar in a sense that they can be massively
| parallelized for the increased throughput, but the latency of
| an individual cell is much harder to improve. Benchmarks will
| saturate the I/O queue and reach astronomical numbers by
| reading many cells in parallel, but for most desktop users,
| the queue depth of 1 (and the individual cell latency) is
| probably more relevant. That's why a lowly SATA SSD will be
| only marginally slower than the newest NVMe SSD for booting
| an OS or loading a game.
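  |
  | The arithmetic behind that last point, with illustrative
  | per-read latencies (actual figures vary by drive):
  |
  |   BLOCK = 4096  # bytes per random read
  |
  |   for lat_us in (100, 80, 60):
  |       mb_s = BLOCK / (lat_us / 1e6) / 1e6
  |       print(f"QD1 at {lat_us} us/read -> {mb_s:.1f} MB/s, well below "
  |             f"even SATA's ~550 MB/s")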
| TwoBit wrote:
| Doesn't it depend a lot on the game? A well designed game
| could get a lot more out of a fast SSD.
| digikata wrote:
  | NVMe is built on PCIe, so latency-wise they will be limited by
  | the PCIe latency (roughly an order of magnitude slower than the
  | memory bus, though PCIe 5 may narrow that) plus the media
  | latency. Throughput-wise, they are limited by the number of
  | PCIe lanes that the controller supports and that the system
  | they plug into has been sized to allocate.
| vmception wrote:
  | Which of your applications, or your clients' applications, running
| at the same time are currently bottlenecked?
| Strom wrote:
| Even something as simple as grep is bottlenecked by disk speeds
| right now.
| api wrote:
| That's getting to be RAM speeds, but of course not RAM latencies.
| ChuckMcM wrote:
  | Pretty impressive. And if it can really do 4 GB/s[1] of random
  | writes, that is super helpful in database applications.
|
| I am wondering where the "Cloud SSD" stuff takes it relative to
| general purpose use. Does anyone have any insights on that?
|
  | [1] They quote 1 million IOPS; assuming a 4K block size, which
  | is a good compromise between storage efficiency and throughput,
  | that gives the 4 GB/s number.
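  |
  | The footnote arithmetic, spelled out (4096-byte blocks assumed):
  |
  |   iops = 1_000_000
  |   block = 4096  # bytes, assuming 4K random writes
  |   print(f"{iops * block / 1e9:.1f} GB/s")  # ~4 GB/s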
| wtallis wrote:
| Here's a version of the OCP NVMe Cloud SSD Spec from last year:
| https://www.opencompute.org/documents/nvme-cloud-ssd-specifi...
|
| It covers a lot of ground, but as far as I'm aware nothing in
| there really makes drives less suitable for general server use.
| It just tightens up requirements that aren't addressed
| elsewhere.
| ChuckMcM wrote:
  | Yeah, pretty much. Thanks, I've added it to my specifications
| archive.
| ksec wrote:
| 20x20mm Package. That is quite large.
|
  | ~10W controller. Doesn't matter on a desktop or server, but
  | we've sort of hit the limit of what can be done in a laptop.
|
  | Having said that, it doesn't mention what node it was fabbed on.
  | I assume more energy efficiency could be squeezed out.
|
  | <6us latency. It doesn't say what percentile that is or under
  | what sort of conditions, but Marvell claims this is 30% better
  | than the previous Marvell SSD controller.
|
  | I think the ServeTheHome article [1] is probably better. (Can't
  | change it now ><)
|
  | We also have PCI-E 6.0 [2], finalised by the end of this year,
  | which we can expect in products in 2023/2024: SSD controllers
  | with 28 GB/s.
|
  | I am also wondering if we are approaching the end of the S-curve.
|
| [1] https://www.servethehome.com/marvell-bravera-
| sc5-offers-2m-i...
|
| [2] https://www.anandtech.com/show/16704/pci-
| express-60-status-u...
| jagger27 wrote:
| 10W definitely matters on servers. You can easily have a dozen
| of these in 1U. 120W isn't nothing to dissipate.
|
| I wonder what node this controller is made on. If it's made on
| TSMC N7 then they could cut power consumption roughly in half
| by going to N5P. The package size makes me wonder if it's an
| even older node however.
| dogma1138 wrote:
  | It's not really an issue in servers. Look at how much power
  | high-end NICs consume; heck, 10GbE SFP+ modules can consume >5W
  | each, and you can easily have >48 of those in a switch...
| dylan604 wrote:
  | But a switch doesn't include a hairdryer, er, GPU
  | generating heat within the enclosure. Server cases have
  | potentially multiple GPUs, CPUs, plus now these 10W
  | controller chips.
| lostlogin wrote:
| You are correct.
|
  | Some switches could do with much better cooling, though. A PoE
  | switch, or one that is pushing 10Gb down a Cat 6 cable, gets
  | very toasty. I've got one that is borderline too hot to touch.
  | Thanks, Ubiquiti.
| touisteur wrote:
  | Oh, you don't want your hand near a Mellanox ConnectX-6
| then. 200GbE doesn't come cold...
| zamadatix wrote:
  | Most switches don't allow a 5W pull in every port; those
  | levels are usually only found in 10G copper SFPs, which
  | can't reach full distance due to power requirements and so
  | typically pull the maximum allowed (or higher). Typical 10G
| SFP+ SR or Twinax will consume about a watt per module. The
| ASIC may be a couple hundred watts under load.
|
| Servers typically have much higher power density unless
| you're talking 400G switches compared to low end servers.
| lmilcin wrote:
| > ~10W Controller. Doesn't matter on a Desktop or Server.
|
  | Of course it does. For one, that's heat you have to efficiently
  | remove, or face the SSD throttling on you or degrading over
  | time. It does not make sense to buy expensive hardware if it is
  | going to show impaired performance due to thermal throttling.
  |
  | This makes the business of putting together your PC that much
  | more complicated, because up until recently you only had to
  | take care of the CPU and GPU, and everything else was mostly an
  | afterthought.
  |
  | We already have motherboards with their own fans; now I suppose
  | it is time for SSDs.
| dragontamer wrote:
| Weren't 10,000 or 15,000 RPM hard drives like 15W or 20W or
| something?
|
| These 2U or 4U cases, or tower desktop cases, were designed
| to efficiently remove heat from storage devices, as well as
  | the 3000W+ that the dual-socket CPUs and 8-way GPUs will pull.
|
| 10W is tiny for desktop and server. Barely a factor in
| overall cooling plans.
| lmilcin wrote:
| They were, but 3,5" drives are large hunks of aluminium
| with many times the surface. Meaning that as long as you
| have some airflow around them and the air is not too hot
| they are fine.
|
| Also the heat was mostly generated by motor and an actuator
| and not the controller.
| zamadatix wrote:
  | High-performance M.2 drives come with removable finned
  | heatsinks for this reason. Without them, they rely on
  | throttling during sustained heavy workloads. Dissipating
  | 10 watts in a desktop isn't the concern.
| Matt3o12_ wrote:
  | The heatsink doesn't really work, though, and is
  | marketing for the most part [1].
  |
  | I have a Crucial P1 NVMe SSD and I can make it overheat
  | pretty reliably. Pretty much any synthetic workload makes
  | it overheat if the SSD is empty (it reaches 70C pretty
  | quickly and even starts throttling until it reaches 80C,
  | at which point the whole system starts stuttering because
  | of extreme throttling so it doesn't damage itself).
  | Although I have not properly tested it, it seems that not
  | using any heatsinks from my motherboard actually makes the
  | temps better, but it still overheats.
  |
  | The main reason it can overheat quickly is probably
  | because it's sitting in a really bad position where it
  | gets close to zero airflow despite being in an
  | airflow-focused case. Most motherboards place the NVMe
  | slot directly under the GPU. The main problem seems to be
  | that the controller is overheating when it's writing at
  | close to 2000 MB/s. It's also important to note that only
  | the controller (an actual, relatively powerful ARM
  | processor), not the flash memory, seems to overheat.
  |
  | Fortunately, this is mostly not an issue because it's a
  | QLC drive and the workload is unrealistic in the real
  | world. When writing to an empty drive at 2000 MB/s (queue
  | depth 4, 128K sequential writes), it takes 2 minutes until
  | the cache is full. The way it's currently used, it takes
  | 30 seconds for the cache to become full and for write
  | speeds to drop to 150 MB/s. The only time it has ever
  | overheated in the real world was during the loading screen
  | of a game, when it quickly reached 78C (and I only noticed
  | it in the hardware monitor). If the GPU hadn't heated up
  | the NVMe drive beforehand (it was sitting at 65C mostly
  | idle) and starved it for air, I doubt it would have hit
  | 60C.
  |
  | So until motherboards start placing NVMe slots where they
  | can get some actual cooling, or heatsinks that actually
  | work become the norm, their power usage can make a
  | difference.
|
| [1]: https://www.gamersnexus.net/guides/2781-msi-m2-heat-
| shield-i... but there are many more articles/forum posts
| with similar issues.
| zamadatix wrote:
| The Crucial P1 is a budget SSD that doesn't come with a
| proper heatsink, similar in quality to that god awful
| "heat shield" in that linked review. When I say "High
| performance m.2 drives come with removable finned
| heatsinks" I mean an actual high performance drives that
| come with a finned heatsinks like
| https://www.amazon.com/Corsair-Force-
| MP600-Gen4-PCIe/dp/B07S... not examples of budget drives
| paired with flat pieces of metal.
|
  | Also, your high-performance SSD should be going in the
| direct-to-CPU slot to the right of the GPU, not under it.
| baybal2 wrote:
  | The thing is, the flash chips themselves run relatively
  | cool.
  |
  | It's the controller that generates most of the heat when
  | running PCIe at top speed.
| matheusmoreira wrote:
| Would be cool to have a general liquid cooling solution for
| all components. We'd install a radiator outside our homes
| just like an air conditioning unit and then connect the
| computers to it.
| AtlasBarfed wrote:
  | Probably not a small node; remember, flash gets more fragile
  | the smaller the process.
  |
  | SSD makers are layering on larger nodes and focusing on more
  | bits per cell (they are basically at PLC/5-bit flash for
  | consumer or non-heavy-wear use, which is frankly a bit nuts).
| Dylan16807 wrote:
| This is a controller, not flash.
| bhouston wrote:
| 2023/2024 I suspect you meant to write.
|
| I do like the fact that SSD bandwidth may be approaching memory
| bandwidth.
| ksec wrote:
| >2023/2024 I suspect you meant to write.
|
| ROFL. Thanks. Keep thinking about the good old days.
| CyberDildonics wrote:
  | SSD bandwidth is not really approaching modern memory
  | bandwidth - a computer with one of these will probably have 4
  | to 8 channels of DDR4 at 25 GB/s each, with the DDR5 spec
  | released about a year ago. That is 100 GB/s to 200 GB/s
  | currently, and more by the time PCIe 5 becomes a reality.
|
| I am sure if you go back a few years you can find systems
| that have the same or less memory bandwidth than this has
| now, but they have both been moving forward enough that SSDs
| are still not close. That being said, between bandwidth and
| random access times, swapping out to disk backed virtual
| memory is very pragmatic and isn't the death of performance
| it used to be.
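  |
  | Putting the numbers in this sub-thread side by side (DDR4-3200
  | is ~25.6 GB/s per channel; the article's controller tops out
  | at 14 GB/s per drive, sequential, best case):
  |
  |   DDR4_CHANNEL = 25.6  # GB/s
  |   SSD = 14.0           # GB/s, sequential best case
  |
  |   for n in (4, 8):
  |       print(f"{n} DDR4 channels: {n * DDR4_CHANNEL:6.1f} GB/s vs "
  |             f"{n} such SSDs: {n * SSD:5.1f} GB/s")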
| dekhn wrote:
| I think a fairer comparison would be 1 DDR channel to one
| SSD, or comparing the 8 channels to a multi-SSD striped
  | RAID array. SSDs are ~7 GB/s, so I would say that SSD
  | bandwidth is within an order of magnitude or two of RAM's. I
  | certainly wouldn't want to replace an
| app optimized to use all the RAM on a machine with one
| running on a lower-RAM SSD machine, though.
| muxator wrote:
| Maybe I'm old. I never used to think that an application
  | that is able to use my whole RAM is "optimized". It evokes
| exactly the opposite impression, indeed.
|
| I think I understand what you mean, but the gut reaction
| is that one.
|
| Yeah, I am really old, after all.
|
| Edit: let me explain. Something that is able to use the
| whole space of a flat memory model is way less
| sophisticated than something that is able to deal with a
| complex memory hierarchy. Our machines are indeed a
| complex pyramid of different subsystems with varying
| bandwidth and latency characteristics. A program that is
| able to embrace the inherent hierarchical nature of our
| machines (or multi-node systems) is way more "optimized",
| according to my sensibility.
| dekhn wrote:
| I'm talking about situations like search, where you hold
| the entire index in ram. Total # of machines = size of
| index / indexserver ram. Usually the apps that run on
| these have, say, 96 cores and they're using about 80, and
| the idle time is mostly instructions waiting for memory
| fetches.
|
  | Typically that index fronts a disk repository which
  | wouldn't fit in RAM, although what fits in RAM, what lives
  | on disk, and what gets cached in RAM have all changed over
  | time.
|
| BTW, I'm probably of the same generation as you and the
| single most important lesson I ever learned for computing
| performance was "add more RAM"; in the days when I first
| started using Linux with 4MB of RAM, it wasn't enough to
| do X11, g++ and emacs all at the same time without
| swapping, so I spent my hard-earned money to max out the
| RAM, at which point it didn't swap and I could actually
| do software development quickly.
| terafo wrote:
| If you have 8 of these on a single system, you are getting
  | very close to RAM bandwidth. Still, there are the issues of
  | latency and TBW.
| jopsen wrote:
| Now if we could just make a Beowulf cluster of these...
| bhouston wrote:
| Beowulf, that is a name i haven't heard in a long time.
| TwoBit wrote:
| IMO disk tech of 25 GB/s vs 100 GB/s for memory counts as
| approaching.
| anarazel wrote:
  | It really depends on the type of memory usage, though. On
  | current Intel multi-socket systems, the bandwidth a single
  | process can get for core-mediated memory accesses, even to
  | node-local memory, is seriously disappointing - often <
  | 10GB/s. Yes, it scales reasonably nicely, but it's very
  | painful to tune for that, given that client and
  | low-core-count single-socket SKUs end up with > 35GB/s. And
  | this affects you even from within VMs that are on a single
  | socket.
  |
  | I heard that Ice Lake SP improves a bit in this area, but I
  | haven't gotten access to one yet.
| walrus01 wrote:
  | Given a theoretical either/or choice between a PCIe 5.0 SSD or
  | more PCIe lanes using the tech we have now, I would rather have
  | a greater number of PCIe 4.0 lanes in single-socket
  | consumer/workstation-grade motherboards.
  |
  | That leaves open the possibility of dual NVMe SSDs in a
  | workstation along with an x16 video card and 10Gbps NICs.
| Synaesthesia wrote:
| Still not good enough for PS5 hey Sony?
| dvfjsdhgfv wrote:
| It looks like we'll get fantastic speeds in a few years. Now it's
| time to take care of durability.
| dstaley wrote:
| Obviously there's always a customer for faster speeds, but have
| we even hit the upper threshold of PCIe 4 in the consumer market?
| wmf wrote:
| SN850 is getting close to the limit of PCIe 4.0.
| https://www.anandtech.com/show/16505/the-western-digital-wd-...
| wtallis wrote:
| The first wave of consumer gen4 SSDs that all used the Phison
| E16 controller were only good for about 5 GB/s out of the ~7
| GB/s possible (and 3.5 GB/s on PCIe gen3). But the newer gen4
| consumer drives that started hitting the market last fall come
| a lot closer, and this summer a lot of those are getting
| refreshed with newer, faster NAND that will have PCIe 4.0 as
| thoroughly saturated as PCIe 3.0 has been for the past few
| years. Phison has already clearly stated that their next high-
| end controller after the recently-launched E18 will be a PCIe
| gen5 chip, and the E18 is fast enough to finish out the gen4
| era.
| dstaley wrote:
| Absolutely wild that PCIe 4.0 was saturated just two years
| after introduction in the consumer market.
___________________________________________________________________
(page generated 2021-05-27 23:00 UTC)