[HN Gopher] Transforming a QLC SSD into an SLC SSD
       ___________________________________________________________________
        
       Transforming a QLC SSD into an SLC SSD
        
       Author : userbinator
       Score  : 207 points
       Date   : 2024-05-19 09:30 UTC (13 hours ago)
        
 (HTM) web link (theoverclockingpage.com)
 (TXT) w3m dump (theoverclockingpage.com)
        
       | mofosyne wrote:
        | It'd be nice if manufacturers provided a way to downgrade an
        | SSD to SLC via some driver setting.
        
         | NL807 wrote:
         | but how would they make more money?
        
           | Lex-2008 wrote:
            | I'd buy such a device. Currently I'm holding on to my last
            | pair of SSDs from the pre-QLC era, refusing to buy anything
            | new.
        
             | LoganDark wrote:
             | There are still new SSDs that use TLC, such as the Silicon
             | Power UD90 (I have one in my system). Not only that, some
             | of them will run in SLC mode when writing new data and then
             | move the data to TLC later - advertised as SLC Caching -
             | which could be even better than always-TLC drives (even
             | ones with a DRAM cache).
        
               | drtgh wrote:
                | Your comment, along with those of other users, suggests
                | that TLC is a positive attribute for consumers. However,
                | the transition from SLC and MLC NAND to TLC and QLC
                | 3D-NAND actually marked a decline in the longevity of
                | SSDs.
                | 
                | Using a mode other than SLC with current SSDs is insane,
                | given how it differs from planar NAND: current 3D-NAND
                | consumes writes for everything.
               | 
                | 3D-NAND consumes writes in order to read data [0]:
               | " Figure 1a plots the average SSD lifetime consumed by
               | the read-only workloads across 200 days on three SSDs
               | (the detailed parameters of these SSDs can be found from
               | SSD-A/-B/-C in Table 1). As shown in the figure, the
               | lifetime consumed by the read (disturbance) induced
               | writes increases significantly as the SSD density
               | increases. In addition, increasing the read throughput
               | (from 17MBps to 56/68MBps) can greatly accelerate the
               | lifetime consumption. Even more problematically, as the
               | density increases, the SSD lifetime (plotted in Figure
               | 1b) decreases. In addition, SSD-aware write-reduction-
               | oriented system software is no longer sufficient for
               | high-density 3D SSDs, to reduce lifetime consumption.
               | This is because the SSDs entered an era where one can
               | wear out an SSD by simply reading it."
               | 
                | 3D-NAND consumes writes to retain data [1]:
               | " 3D NAND flash memory exhibits three new error sources
               | that were not previously observed in planar NAND flash
               | memory:              (1) layer-to-layer process
               | variation,          a new phenomenon specific to the 3D
               | nature of the device, where the average error rate of
               | each 3D-stacked layer in a chip is significantly
               | different;              (2) early retention loss,
               | a new phenomenon where the number of errors due to charge
               | leakage increases quickly within several hours after
               | programming; and              (3) retention interference,
               | a new phenomenon where the rate at which charge leaks
               | from a flash cell is dependent on the data value stored
               | in the neighboring cell. "
               | 
               | [0] https://dl.acm.org/doi/10.1145/3445814.3446733
               | 
                | [1] https://ghose.cs.illinois.edu/papers/18sigmetrics_3dflash.pd...
        
               | jsheard wrote:
                | Even datacenter-grade drives scarcely use SLC or MLC
                | anymore, since TLC has matured to the point of being more
                | than good enough even in most server workloads. What
                | possible need would 99% of consumers have for SLC/MLC
                | nowadays?
               | 
               | If you really want a modern SLC drive there's the Kioxia
               | FL6, which has a whopping 350,400 TB of write endurance
               | in the 3TB variant, but it'll cost you $4320.
               | Alternatively you can get 4TB of TLC for $300 and take
               | your chances with "only" 2400 TB endurance.
        
               | drtgh wrote:
                | TLC cannot mature as long as it continues to use 3D-NAND
                | without more advanced material science. Reading data and
                | preserving data consume writes, which degrades the
                | memory, because the traces in the vertical stack of the
                | circuit create interference.
               | 
                | Perhaps there are techniques available to separate the
                | traces, but that would ultimately increase the surface
                | area, which seems to be something they are trying to
                | avoid.
               | 
                | You should not use datacenter SSD disks as a reference,
                | as they typically do not last more than two and a half
                | years. It appears to be a profitable opportunity for the
                | SSD manufacturers, and increasing longevity does not
                | seem to be a priority.
               | 
               | To be more specific, we are talking about planned
               | obsolescence for consumer and enterprise SSD disks.
               | 
               | > If you really want a modern SLC drive there's the
               | Kioxia FL6, which has a whopping 350,400 TB of write
               | endurance in the 3TB variant, but it'll cost you $4320.
               | 
               | Did you read the OP article?
        
               | LoganDark wrote:
               | I got 4TB of TLC for $230 (Silicon Power UD90). It even
               | has SLC caching (can use parts of the flash in SLC mode
               | for short periods of time).
        
               | jsheard wrote:
               | True, I was looking at the prices for higher end drives
               | with on-board DRAM, but DRAM-less drives like that UD90
               | are also fine in the age of NVMe. Going DRAM-less was a
               | significant compromise on SATA SSDs, but NVMe allows the
               | drive to borrow a small chunk of system RAM over PCIe
               | DMA, and in practice that works well enough.
               | 
               | (Caveat: that DMA trick doesn't work if you put the drive
               | in a USB enclosure, so if that's your use-case you should
               | ideally still look for a drive with its own DRAM)
        
               | LoganDark wrote:
                | > Your comment, along with those of other users, suggests
                | that TLC is a positive attribute for consumers
               | 
               | TLC is better than QLC, which is specifically what my
               | comment was addressing; I never implied that it's better
               | than SLC though, so just don't, please.
               | 
               | It's interesting to see that 3D-NAND has other issues
               | even when run in SLC mode, though.
        
               | drtgh wrote:
               | > I never implied that it's better than SLC though, so
               | just don't, please.
               | 
               | My apologies.
               | 
               | > It's interesting to see that 3D-NAND has other issues
               | even when run in SLC mode, though.
               | 
                | Basically the SSD manufacturers are increasing capacity
                | by adding more layers (3D-NAND). When one cell in the
                | vertical stack is read, the interference produced by the
                | traces in the area increases the number of cells that
                | need to be rewritten, which consumes the life of the
                | device, by design.
        
               | wtallis wrote:
                | > When one cell in the vertical stack is read, the
                | interference produced by the traces in the area
                | increases the number of cells that need to be rewritten,
                | which consumes the life of the device, by design.
                | 
                | You should try being honest about the magnitude of this
                | effect. It takes _thousands_ of read operations at a
                | minimum to cause a read disturb that can be fixed with
                | _one_ write. What you're complaining about is the NAND
                | equivalent of DRAM rowhammer. It's not a serious problem
                | in practice.
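                | 
                | Back-of-the-envelope, with assumed round numbers (say
                | ~10k reads per disturb-triggered rewrite and ~3,000 P/E
                | cycles, purely for illustration):
                | 
                |     # full-drive read passes before read-induced rewrites
                |     # alone could wear the drive out
                |     echo "10000 * 3000" | bc   # 30,000,000 passes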
        
               | drtgh wrote:
                | It is not equivalent: the larger the stack, the more
                | writes to the surrounding cells, not just a rewrite of a
                | single cell.
               | 
                | Here, the dishonest ones are the SSD manufacturers of the
                | last decade, and they are feeling comfortable enough to
                | introduce QLC into the market.
               | 
               | > It's not a serious problem in practice.
               | 
                | It's as serious as this: reading data consumes the disk,
                | and the faster it's read, the faster it's consumed [0].
                | You should have noticed that SSD disks no longer come
                | with a 10-year warranty.
                | 
                | "under low throughput read-only workloads,
                | SSD-A/-B/-C/-D/-E/-F extensively rewrite the
                | potentially-disturbed data in the background, to
                | mitigate the read (disturbance) induced latency problem
                | and sustain a good read performance. Such rewrites
                | significantly consume the already-reduced SSD lifetime. "
               | 
               | Under low throughput read-only workloads.
               | 
                | It is a paper from 2021, which means Sci-Hub can be used
                | to read it.
               | 
               | [0] https://dl.acm.org/doi/10.1145/3445814.3446733
        
               | wtallis wrote:
                | > It's as serious as this: reading data consumes the
                | disk, and the faster it's read, the faster it's consumed
               | 
               | Numbers, please. Quantify that or GTFO. You keep quoting
               | stuff that implies SSDs are horrifically unreliable and
               | burning through their write endurance alarmingly fast.
               | But the reality is that even consumer PCs with cheap SSDs
               | are not experiencing an epidemic of premature SSD
               | failures.
               | 
               | EDIT:
               | 
               | > You should have noticed that SSD disks no longer come
               | with a 10-year warranty.
               | 
               | 10-year warranties were never common for SSDs. There was
               | a brief span of time where the flagship consumer SSDs
               | from Samsung and SanDisk had 10-year warranties because
               | they were trying to one-up each other and couldn't
               | improve performance any further because they had
               | saturated what SATA was capable of. The fact that those
               | 10-year warranties existed for a while and then went away
               | says _nothing_ about trends in the true reliability of
               | the storage. SSD warranties and write endurance ratings
               | are dictated primarily by _marketing_ requirements.
        
               | drtgh wrote:
                | In a 2-minute search:
                | 
                | https://www.reddit.com/r/DataHoarder/comments/150orlb/enterp...
                | 
                | "So, on page 8's graphs, they show that 800GB-3800GB
                | 3D-TLC SSDs had a very low "total drive failure" rate.
                | But as soon as you got to 8000GB and 15000GB, the drives
                | had a MASSIVE increase in risk that the entire drive has
                | hardware errors and dies, becomes non-responsive, etc."
               | 
               | Study:
               | https://www.usenix.org/system/files/fast20-maneas.pdf
               | 
                | (with video):
                | https://www.usenix.org/conference/fast20/presentation/maneas
        
               | wtallis wrote:
               | Would you care to explain how any of that supports the
               | points you're actually making here?
               | 
               | Some of what you're spamming seems to directly undermine
                | your claims, e.g.:
               | 
               | > Another finding is that SLC (single level cell), the
               | most costly drives, are NOT more reliable than MLC
               | drives. And while the newest high density 3D-TLC (triple
               | level cell) drives have the highest overall replacement
               | rate, the difference is likely not caused by the 3D-TLC
               | technology
        
               | drtgh wrote:
               | "likely" not caused by. Any case I delete such spamming?
               | link.
               | 
               | > Would you care to explain how any of that supports the
               | points you're actually making here?
               | 
                | Another day, if you don't mind.
        
               | Dylan16807 wrote:
                | The massive increase is still a 1/500 chance per year.
        
               | wtallis wrote:
                | > Your comment, along with those of other users, suggests
                | that TLC is a positive attribute for consumers. However,
                | the transition from SLC and MLC NAND to TLC and QLC
                | 3D-NAND actually marked a decline in the longevity of
                | SSDs.
               | 
               | The bit that you're pointedly ignoring and that none of
               | your quotes address is the fact that SLC SSDs had _far_
               | more longevity than anyone really needed. Sacrificing
               | longevity to get higher capacity for the same price was
               | the _right tradeoff_ for consumers and almost all server
               | use cases.
               | 
               | The fact that 3D NAND has some new mechanisms for data to
               | be corrupted is _pointless trivia_ on its own, bordering
                | on fearmongering the way you're presenting it. The real
               | impact these issues have on overall drive lifetime,
               | compared to realistic estimates of how much lifespan
               | people actually need from their drives, is not at all
               | alarming.
               | 
               | Not using SLC is not insane. Insisting on using SLC
               | everywhere is what's insane.
        
             | NL807 wrote:
              | The point I was making is that there is no profit to be
              | made by extending the life of drives. And a sample size of
              | one (i.e. you) is not representative of the market. There
             | is always a demand for storage and people will keep buying
             | worse products because there is no other choice.
        
               | AnthonyMouse wrote:
               | I don't understand this logic. Consider the two
               | possibilities here.
               | 
               | The first is that only weird tech people are interested
               | in doing this. Then they might as well allow it because
               | it's a negligible proportion of the market but it makes
               | those customers favor rather than dislike your brand, and
               | makes them more likely to recommend your devices to
               | others, which makes you some money.
               | 
               | The second is that it would be widely popular and large
               | numbers of customers would want to do it, and thereby
               | choose the drives that allow it. Then if Samsung does it
               | and SanDisk doesn't, or vice versa, they take more of the
                | other's customers. Allowing it is the thing that makes
                | them _more_ money.
               | 
               | Meanwhile the thing that trashes most SSDs isn't wear,
               | it's obsolescence. There are millions of ten year old QLC
               | SSDs that are perfectly operational because they lived in
               | a desktop and saw five drive writes over their entire
               | existence. They're worthless not because they don't work,
               | but because a drive which is newer and bigger and faster
               | is $20. It costs the manufacturer nothing to let them be
               | more reliable because they're going in the bin one way or
               | the other.
               | 
               | The status quo seems like MBAs cargo culting some
               | heuristic where a company makes money in proportion to
               | how evil they are. Companies actually make money in
               | proportion to how much money they can get customers to
               | spend with them. Which often has something to do with how
               | much customers like them.
        
               | userbinator wrote:
               | _There are millions of ten year old QLC SSDs_
               | 
               | In 2014 QLC was nothing but a research curiosity. The
               | first QLC SSD was introduced in 2018:
               | 
                | https://www.top500.org/news/intel-micron-ship-first-qlc-flas...
               | 
               | You have to also remember that people buy storage
               | expecting it to last. I have decades-old magnetic media
               | which is tiny but still readable.
        
             | bayindirh wrote:
             | Samsung 870EVO (SSD), 980 Pro/990 Pro (NVMe) are all TLC
             | drives. Kingston KC3000 is faster than 980 Pro, hence it's
             | probably TLC, too.
        
               | jsheard wrote:
                | A decent rule of thumb is that if a drive uses TLC, it
                | will probably say so in the spec sheet.
               | 
               | If it's left ambiguous then it's either QLC, or a lottery
               | where the "same" model may be TLC or QLC.
        
               | bayindirh wrote:
               | Kingston NV2 is in that "what you get may differ"
               | category, and Kingston explicitly says that what you get
               | may change. I have two NV2s with differing die count, for
               | example. Their controller might be different too. They're
               | external, short-use drives, so I don't care.
               | 
                | So, returning to the previously mentioned ones, from
                | their respective datasheets:
                | 
                |     - 870 EVO: Samsung V-NAND 3bit MLC
                |     - 980 Pro: Samsung V-NAND 3bit MLC
                |     - 990 Pro: Samsung V-NAND TLC
                |     - KC3000: NAND: 3D TLC
                |     - NV2: NAND: 3D // Explicit lottery.
        
               | ngcc_hk wrote:
                | A bit confused... the article is about an SSD drive
                | doing 500-ish MB/s. Does what it said, and the details
                | discussed here, apply the same to NVMe drives doing
                | 1000+ MB/s?
        
         | numpad0 wrote:
          | The great thing about disks is that they don't require drivers
          | at all. The driver-settings Windows app would not be
          | open-sourced if such a thing were to exist.
        
         | surajrmal wrote:
          | While SSDs do not, all flash chips do. So if you were ever
          | going to try building your own SSD, or simply connect some
          | flash directly up to your SoC via some extra pins, you would
          | be able to program them this way. I imagine extending NVMe to
          | offer this would be possible if there were enough popular
          | demand.
        
           | namibj wrote:
           | NVMe already supports low level reformatting.
        
       | Havoc wrote:
        | Wild! I had assumed this was a hardware-level distinction.
        
         | LoganDark wrote:
         | How many bits a particular NAND chip _can_ store per cell is
          | presumably hardware-level, but I believe it's possible to
         | achieve SLC on all of them anyway, even if they support TLC or
         | QLC.
         | 
         | Hell, the Silicon Power NVMe SSD I have in my machine right now
         | will use SLC for writes, then (presumably) move that data later
         | to TLC during periods of inactivity. Running the NAND in SLC
         | mode is a feature of these drives, it's called "SLC caching".
        
         | magicalhippo wrote:
          | Of course it is trivial to just write 000 for zero and 111 for
          | one in the cells of a TLC SSD to turn it into effectively an
          | SLC SSD, but that in itself doesn't explain why it's so much
          | faster to read and write compared to TLC.
         | 
         | For example, if it had been DRAM where the data is stored as
         | charge on a capacitor, then one could imagine using a R-2R
         | ladder DAC to write the values and a flash ADC to read the
         | values. In that case there would be no speed difference between
         | how many effective levels was stored per cell (ignoring noise
         | and such).
         | 
         | From what I can gather, the reason the pseudo-SLC mode is
         | faster is down to how flash is programmed and read, and relies
         | on the analog nature of flash memory.
         | 
          | Like DRAM there's still a charge that's being used to store the
          | value; however, it's not just in a plain capacitor but in a
          | double MOSFET gate [1].
         | 
         | The amount of charge changes the effective threshold voltage of
         | the transistor. Thus to read, one needs to apply different
         | voltages to see when the transistor starts to conduct[2].
         | 
         | To program a cell, one has to inject some amount of charge that
         | puts the threshold voltage to a given value depending on which
         | bit pattern you want to program. Since one can only inject
          | charge, one must be careful not to overshoot. Thus one uses a
          | series of brief pulses, each followed by a read cycle to see if
          | the required level has been reached[3], repeating as needed.
          | Thus the more levels per cell, the shorter the pulses needed
          | and the more read cycles required to ensure the right amount of
          | charge is reached.
         | 
         | When programming the multi-level cell in single-level mode, you
         | can get away with just a single, larger charge injection[4].
         | And when reading the value back, you just need to determine if
         | the transistor conducts at a single level or not.
         | 
         | So to sum up, pseudo-SLC does not require changes to the multi-
         | level cells as such, but it does require changes to how those
         | cells are programmed and read. So most likely it requires
         | changing those circuits somewhat, meaning you can't implement
         | this just in firmware.
         | 
          | [1]: https://en.wikipedia.org/wiki/Flash_memory#Floating-gate_MOS...
         | 
         | [2]:
         | https://dr.ntu.edu.sg/bitstream/10356/80559/1/Read%20and%20w...
         | 
         | [3]: https://people.engr.tamu.edu/ajiang/CellProgram.pdf
         | 
         | [4]: http://nyx.skku.ac.kr/publications/papers/ComboFTL.pdf
        
       | hwbunny wrote:
       | Silicon Motion controllers are trash.
        
         | mardifoufs wrote:
          | I thought they were the best in class. What are the
          | alternatives?
        
       | nickcw wrote:
        | This hack seems to take a 480GB SSD and transform it into a
        | 120GB SSD.
       | 
        | However the write endurance (the amount of data you can write to
        | the SSD before expecting failures) increases from 120TB to
        | 4000TB, which could be a very useful tradeoff, for example if you
        | were using the disk to store logs.
       | 
        | I've never seen this offered by the manufacturers though (maybe
        | I haven't looked in the right place). I wonder why not?
        
         | dist-epoch wrote:
          | Manufacturers offer that, in the form of TLC drives, which are
          | supported, unlike this hack, which might cause data loss.
         | 
         | This gives you 120GB with 4000TB write endurance, but you can
         | buy a 4TB TLC drive with 3000TB write endurance for $200.
        
           | greggsy wrote:
           | Then you could use this technique to achieve something like a
           | 1.2TB disk with 40PB TBW?
           | 
           | I'd be fascinated to hear any potential use cases for that
           | level of endurance in modern data storage.
        
             | justinclift wrote:
             | > use cases for that level of endurance in modern data
             | storage.
             | 
              | All-flash arrays. Saying that as I have a bunch of smaller
              | (400GB) 2.5" SAS SSDs combined into larger all-flash
              | arrays, with each one of those SSDs rated for about 30PB
              | of endurance.
             | 
             | I'm expecting the servers to be dead by the time that
             | endurance is exhausted though. ;)
        
               | greggsy wrote:
               | Exactly, I've done similar maths on my disks, and
               | realised that it would be 20 years before they approach
               | their end of life.
               | 
                | By which point, they will have been replaced by some new
                | tech that is cheaper, faster, more reliable and more
                | power efficient.
        
           | causality0 wrote:
           | Which drive would that be? The ones I'm seeing cost a lot
           | more than $200.
        
             | hippich wrote:
             | I'll second that question!
        
             | BertoldVdb wrote:
             | SSD prices fluctuate a lot. I recently bought 4TB SSDs for
             | 209eu but they are more expensive now (SNV2S/4000G, QLC
             | though)
        
             | 5e92cb50239222b wrote:
             | My friend picked up a 3.84 TB Kingston SEDC600M with 7 PB
             | of write endurance on sale for $180 a couple of months ago.
             | That same place now sells them for around $360. Definitely
             | an original drive. Maybe you just have to be on the lookout
             | for one for when they go on sale.
        
         | greggsy wrote:
         | I wonder if it would be useful as cache disks for ZFS or
         | Synology (with further tinkering)?
        
           | dgacmu wrote:
           | To dive slightly into that: You don't necessarily want to
           | sacrifice space for a read cache disk: having more space can
           | reduce writes as you do less replacement.
           | 
           | But where you want endurance is for a ZIL SLOG (the write
           | cache, effectively). Optane was great for this because of
           | really high endurance and very low latency persistent writes,
           | but, ... Farewell, dear optane, we barely knew you.
           | 
           | The 400GB optane card had an endurance of 73 PB written.
           | Pretty impressive, though at almost $3/GB it was really
           | expensive.
           | 
           | This would likely work but as a sibling commenter noted,
           | you're probably better off with a purpose-built, high
           | endurance drive. Since it's a write cache, just replace it a
           | little early.
        
             | sneak wrote:
             | AIUI the slog is only for synchronous writes; most people
             | using zfs at home don't do any of those (unless you set
             | sync=always which is not the default).
             | 
             | https://jrs-s.net/2019/05/02/zfs-sync-async-zil-slog/
        
           | gdevenyi wrote:
            | Under-provisioning has been the standard recommendation for
            | ZFS SSD cache/log/L2ARC drives since those special types
            | became a thing.
        
           | liuliu wrote:
            | Optane 905P goes for $500 apiece (1TB), I believe.
        
             | nine_k wrote:
             | For how long?
             | 
             | Terrific for a hobby project, build farm, or even a
             | business in a prototype stage (buy 3-4 then).
             | 
             | Hardly acceptable in a larger setting where continuity in
             | 10 years is important. Of course, not the exact same part
             | available in 10 years (which is not unheard of, though),
             | but something compatible or at least comparable.
        
               | wtallis wrote:
               | If you have a scenario where Optane makes sense today, in
               | 10 years it'll be cost effective to use at least that
               | much DRAM, backed by whatever storage is mainstream then
               | and whatever capacitors or batteries you need to safely
               | flush that DRAM to storage.
               | 
               | A dead-end product on clearance sale isn't the right
               | choice for projects where you need to keep a specific
               | mission-critical machine running for a decade straight.
               | But for a lot of projects, all that really matters is
               | that in a few years you can set up a new system with
               | equal or better performance characteristics and not need
               | to re-write your application to work well on the new
               | hardware. I think all of the (vanishingly few) scenarios
               | where Optane NVMe SSDs make sense fall into the latter
               | category. (I feel sorry for anyone who invested
               | significant effort into writing software to use Optane
               | DIMMs.)
        
               | greggsy wrote:
                | I've often wondered when the DRAM-backed storage
                | revolution was going to arrive.
                | 
                | Not long ago, 64GB SSDs were the bare minimum you could
                | get away with, and only the most expensive setups had
                | 64GB RAM. Now we're seeing 64GB modules for consumer
                | laptops priced reasonably cheap.
               | 
                | I wonder: if RAM prices head towards the $0.05/GB
                | (around $50 for the cheapest 1TB) that we're currently
                | seeing for SSDs, would that allow the dream of a
                | legitimately useful RAM disk to become a reality?
        
         | BertoldVdb wrote:
         | There are companies selling SLC SSDs (often using TLC or QLC
         | flash but not using that mode) for industrial applications, for
         | example Swissbit.
        
           | userbinator wrote:
           | But they cost far more than what SLC should be expected to
           | cost (4x the price of QLC or 3x the price of TLC.) The clear
           | answer to the parent's question is planned obsolescence.
        
         | BearOso wrote:
          | I don't understand how the author goes from 3.8 WAF (Write
          | Amplification Factor) to 2.0 WAF and gets a 30x increase in
          | endurance. I'd expect about 2x from that.
         | 
         | From what I can see, he seems to be taking the 120TBW that the
         | OEM warranties on the drive for the initial result, but then
         | using the NAND's P/E cycles spec for the final result, which
         | seems suspicious.
         | 
            | The only thing that I could be missing is that the NAND going
            | to pSLC mode somehow increases the P/E cycles drastically, like
         | requiring massively lower voltage to program the cells. But I
         | think that would be included in the WAF measure.
         | 
         | What am I missing?
        
           | wtallis wrote:
           | QLC memory cells need to store and read back the voltage much
           | more precisely than SLC memory cells. You get far more P/E
           | cycles out of SLC because answering "is this a zero or a
           | one?" remains fairly easy long after the cells are too worn
           | to reliably distinguish between sixteen different voltage
           | levels.
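            | 
            | To make the arithmetic concrete (the cycle counts below are
            | assumptions for illustration, not the article's exact
            | figures):
            | 
            |     # TBW ~= usable_capacity_GB * P/E_cycles / WAF / 1000
            |     echo "480 * 1000  / 3.8 / 1000" | bc -l   # QLC:  ~126 TBW
            |     echo "120 * 60000 / 2.0 / 1000" | bc -l   # pSLC: ~3600 TBW
            | 
            | Most of the ~30x comes from the much higher P/E rating in
            | SLC mode; the WAF improvement only contributes about 2x.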
        
             | 1oooqooq wrote:
              | The author is wrong. What you mention is only true for an
              | actual SLC chip+firmware. QLC drives probably don't even
              | have the hardware to use the entire cell as SLC, and they
              | adopt one of N methods to save time/writes/power by
              | underutilizing the resolution of the cell. None of them
              | gives you all the benefits; each increases some downsides
              | to improve one upside.
              | 
              | And you can't choose.
        
               | wtallis wrote:
               | Respectfully: to the extent that I can understand what
               | you're trying to say, you don't seem to know what you're
               | talking about. Stop trying so hard to bash the whole
               | industry for making tradeoffs you don't agree with, and
               | put a little more effort into understanding how these
               | things actually work.
        
               | 1oooqooq wrote:
                | We are all here reading a machine-translated article
                | from a pt_br warez forum on using the wrong firmware on
                | an SSD controller to talk to the firmware on a "smart"
                | NAND flash, to mimic a semblance of control of your own
                | device.
                | 
                | But yeah, I'm the delusional one, and the industry is
                | very sane and caring for the wishes of the consumer.
                | Carry on.
        
               | wtallis wrote:
               | See, you're _still_ taking every opportunity to rant
               | about  "control", while continuing to get the technical
               | details wrong (this time: the bit about firmware on a
               | smart NAND chip, which is a real thing but not what the
               | article is about). You're not even bothering to make a
               | cogent case for _why_ retail SSDs should expose this low-
               | level control, just repetitively complaining. You could
               | have actually made the conversation more interesting by
               | taking a different approach.
        
               | 1oooqooq wrote:
                | I could complain all day about how it's impossible to
                | write a decent driver for an SSD here; even HDD drivers
                | seem decent by comparison, which is a far cry. But
                | besides amusing you, where do you think this will go?
        
               | Dylan16807 wrote:
               | Even if they do it in a slapdash way, it's still going to
               | be 0 versus "pretty high" and that's a lot easier than
               | gradients of 16ths. Dropping the endurance to match QLC
               | mode would require intentional effort.
        
         | 1oooqooq wrote:
          | Data longevity depends on the implementation in the firmware,
          | into which you have zero visibility. Most consumer drives will
          | lower longevity.
        
       | kozak wrote:
       | Some Kingston SSDs allow you to manage over-provisioning (i.e. to
       | choose the capacity-endurance tradeoff) by using a manufacturer-
       | provided software tool.
        
         | LoganDark wrote:
          | I don't think that would change how many bits are stored per
          | cell, though? If you, say, set overprovisioning to 80%, then
          | that's going to be 80% of the QLC capacity, and it's going to
          | use the remaining 20% still in QLC mode; it's not going to
          | recognize that it can use SLC with 20% of the SLC capacity
          | overprovisioned.
        
           | Crosseye_Jack wrote:
            | Yeah, all over-provisioning does is give the controller more
            | spare cells to play with. The cells will still wear at the
            | same rate as if you didn't over-provision; however, depending
            | on how the controller does wear leveling, it could further
            | improve the life of the drive because each cell is used less
            | often.
           | 
            | This mod (I only just skimmed the post) provides a longer
            | life not by using the cells less often (or keeping more in
            | reserve), but by extending each cell's life: it relaxes the
            | tolerance on the charge needed to store the state of the
            | cell, in return decreasing the bits that can be stored per
            | cell, and so decreasing the capacity.
        
       | DeathArrow wrote:
        | I thought QLC and TLC memory chips were different at the
        | physical level, not that it was just a matter of firmware.
        
         | dist-epoch wrote:
          | There are physical differences; QLC requires more precise
          | hardware, since you need to distinguish between more charge
          | levels. But you can display a low-quality picture on a high-
          | definition screen, or average 4 physical pixels in a camera
          | sensor to get one virtual pixel. Same thing here: you combine
          | some charge levels together for increased reliability.
         | 
         | Put another way, you can turn a QLC into a TLC, but not the
         | other way around.
        
         | wtallis wrote:
         | The memory cells are identical. The peripheral circuitry for
         | accessing the memory array gets more complicated as you support
         | more bits per cell, and the SRAM page buffers have to get
         | bigger to hold the extra bits. But everyone designs their NAND
         | chips to support operating with _fewer_ bits per cell.
         | 
         | Sometimes TLC and QLC chips will be made in different sizes, so
         | that each has the requisite number of memory cells to provide a
         | capacity that's a power of two. But it's just as common for
         | some of the chips to have an odd size, eg. Micron's first 3D
         | NAND was sold as 256Gbit MLC or 384Gbit TLC (literally the same
         | die), and more recently we've seen 1Tbit TLC and 1.33Tbit QLC
         | parts from the same generation.
        
       | willis936 wrote:
       | I wish this kind of deep dive with bus transfer rates was more
       | common. It would be great to have a block diagram that lists
       | every important IC model number / working clock frequency + bus
       | width / working clock rate between these ICs for every SSD.
        
       | kasabali wrote:
        | You don't need to go through all that trouble to use most cheap
        | DRAMless SSDs in pSLC mode. You can simply under-provision them
        | by using only 25-33% of their capacity.
       | 
        | Most low-end DRAMless controllers run in full-disk caching mode.
        | In other words, they first write *everything* in pSLC mode until
        | all cells are written; only when there are no cells left do they
        | go back and rewrite/group some cells as TLC/QLC to free up some
        | space. And they do it _only_ when necessary; they don't go and
        | do it in the background to free up more space.
       | 
        | So, if you simply create a partition 1/3 (for TLC) or 1/4 (for
        | QLC) the size of the disk, and make sure the remaining empty
        | space is TRIMmed and never used, it'll _always_ be writing in
        | pSLC mode (a minimal sketch is at the end of this comment).
       | 
        | You can verify that the SSD you're interested in runs in this
        | mode by searching for "HD Tune" _full drive_ write benchmark
        | results for it. If the write speed is fast for the first 1/3-1/4
        | of the drive, then dips to abysmal speeds for the rest, you can
        | be sure the drive is using the full-drive caching mode. As I
        | said, most of these low-end DRAMless Silicon Motion/Phison/Maxio
        | controllers are, but of course the manufacturer might've
        | modified the firmware to use a smaller cache (like Crucial did
        | for the test subject BX500).
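        | 
        | A minimal sketch of that on Linux (destructive; /dev/sdX and
        | the 25% split are placeholders, adjust for TLC vs QLC):
        | 
        |     # use only the first quarter of the drive; leave the rest
        |     # as a trimmed dummy partition that is never written to
        |     parted -s /dev/sdX -- mklabel gpt \
        |         mkpart data 0% 25% mkpart spare 25% 100%
        |     blkdiscard /dev/sdX2   # trim the dummy partition
        |     mkfs.ext4 /dev/sdX1    # then only ever use the first one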
        
         | chx wrote:
         | How do you make sure the empty space is trimmed? Can you trim a
         | portion of a disk?
        
           | kasabali wrote:
            | AFAIK Windows runs TRIM when you format a partition. So you
            | can create a dummy partition and format it. Then you can
            | either delete it or simply not use it.
            | 
            | On Linux, blkdiscard can be used in the same manner (create a
            | dummy partition and run blkdiscard on it, e.g. blkdiscard
            | /dev/sda2).
        
             | bravetraveler wrote:
              | If one prefers working with LVM for their devices, that can
              | be a similar wrench. Making/removing a logical volume can
              | do the same.
              | 
              | It depends on _issue_discards_ being set in the config.
              | 
              | This has drifted over time, and I haven't quite
              | rationalized why. Hoping someone can remind me, if nothing
              | else; I'm away from a real computer for a bit.
        
             | rzzzt wrote:
             | You can go into Properties > Tools > Optimize, the same
             | button that runs defrag on spinning drives runs TRIM on
             | solid state devices.
        
             | formerly_proven wrote:
             | Windows by default issues TRIMs basically instantly when
             | deleting a file, and runs "Optimize disk" (which trims all
             | free space) on a schedule by default as well.
        
             | FooBarWidget wrote:
             | What about external SSDs over USB? How do you trim those?
        
               | wtallis wrote:
               | There are trim-equivalent commands in the ATA, SCSI, and
               | NVMe command sets. So the OS can issue the SCSI command
               | to the USB device that's using UASP, and the bridge chip
               | inside the external SSD can translate that to the ATA or
               | NVMe counterpart before passing it on to the SSD
               | controller behind the bridge. Not all external SSDs or
               | bridge chips actually support trim passthrough, but these
               | days it's standard functionality.
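                | 
                | On Linux, a quick way to check whether a given (USB)
                | drive actually advertises discard support through the
                | whole stack:
                | 
                |     lsblk -D /dev/sdX   # non-zero DISC-GRAN/DISC-MAX
                |                         # means trim commands get through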
        
               | FooBarWidget wrote:
               | I wonder how to do it on macOS then. I have several
               | external SSDs and none of them can be trimmed.
        
               | wtallis wrote:
               | https://kb.plugable.com/data-storage/trim-an-ssd-in-macos
               | 
               | Apparently macOS doesn't expose the ability for userspace
               | to manually issue trim commands to block devices (or at
               | least doesn't ship a suitable tool to do so), so the best
               | available workaround is to tell the filesystem layer that
               | it should do automatic trimming even on external drives.
        
           | ciupicri wrote:
           | Create a partition that you'll never use and run blkdiscard
           | [1] from util-linux on it.
           | 
           | [1]: https://man7.org/linux/man-pages/man8/blkdiscard.8.html
            | / https://github.com/util-linux/util-linux/blob/master/sys-uti...
        
           | mananaysiempre wrote:
           | The literal answer is yes, an ATA TRIM, SCSI UNMAP, or NVMe
           | Deallocate command can cover whatever range on a device you
           | feel like issuing it for. (The device, in turn, can clear
           | all, none, or part of it.) On Linux, blkdiscard accepts the
           | -o, --offset and -l, --length options (in bytes) that map
           | more or less exactly to that. Making a(n unformatted)
           | partition for the empty space and then trimming it is a valid
           | workaround as well.
           | 
           | But you're most probably doing this on a device with nothing
           | valuable on it, so you should be able to just trim the whole
           | thing and then allocate and format whatever piece of it that
           | you are planning to use.
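            | 
            | For example (bash; the device name and the 128 GiB split
            | are placeholders):
            | 
            |     # trim everything past the first 128 GiB of the device
            |     blkdiscard --offset $((128 * 1024**3)) /dev/nvme0n1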
        
         | howbadisthat wrote:
         | What happens if the remaining space is TRIMMED but routinely
         | accessed? (for example by dd, read only)
        
           | wtallis wrote:
           | If a logical block address is not mapped to any physical
           | flash memory addresses, then the SSD can return zeros for a
           | read request immediately, without touching the flash.
        
             | stavros wrote:
             | Does the mapping happen on first write? Is TRIM then a
             | command that signals the SSD to unmap that block?
        
               | BeeOnRope wrote:
               | Yes.
        
           | Dwedit wrote:
           | You read zeroes.
        
             | BeeOnRope wrote:
              | Not guaranteed by default for NVMe drives. There's an NVMe
              | feature bit for "Read Zero After TRIM" which, if set for a
              | drive, guarantees this behavior, but many drives of
              | interest (2024) do not set it.
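              | 
              | If you have nvme-cli handy, the drive reports this in the
              | DLFEAT field of Identify Namespace (going from memory of
              | the tool's output format):
              | 
              |     nvme id-ns /dev/nvme0n1 | grep dlfeat
              |     # per the NVMe spec, a value of 1 in the low bits means
              |     # deallocated blocks read back as zeroes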
        
               | Dwedit wrote:
               | Hmmm, when I quick-formatted a drive (which TRIMs the
               | whole thing), then tried reading it back in a disk hex
               | editor, I just saw zeroes.
        
         | halifaxbeard wrote:
         | :mind-blown:
         | 
         | i knew about "preconditioning" for SSDs when it comes to
         | benchmarking, etc. didn't realize this was the why.
         | 
         | thanks!
        
         | 1oooqooq wrote:
          | SSD firmwares are a mistake. They saw how easy it is to sell
          | crap, with non-ECC (i.e. bogus) RAM being sold as the default,
          | and ran (pun intended) with it.
          | 
          | So if under-provisioned, they now work as pSLC, giving you more
          | data resilience in the short term but wasting more write
          | cycles, because they're technically writing 1111111 instead of
          | 1, every time. If you fill them up, then they have less data
          | resilience.
          | 
          | And the best part: there's no way you can control any of it
          | based on your needs.
        
           | wtallis wrote:
            | > giving you more data resilience in the short term but
            | > wasting more write cycles, because they're technically
            | > writing 1111111 instead of 1, every time.
           | 
           | No, that's not how it works. SLC caches are used primarily
           | for performance reasons, and they're faster precisely because
            | they _aren't_ doing the equivalent of writing four ones (and
           | especially not _seven_!?) to a QLC cell.
        
           | wruza wrote:
           | Technically they are writing (0,0,0,1) instead of (0.0625).
        
         | hengheng wrote:
         | How can I verify that things stay this way?
         | 
         | Partitioning off a small section of the drive feels very 160 GB
         | SCSI "Let's only use the outer sectors".
        
           | kasabali wrote:
            | Even keeping the drive always 75% empty would be enough, but
            | partitioning it off is the easiest way to make sure it never
            | exceeds 25-33% full (assuming the drive behaves like that in
            | the first place).
           | 
            | To verify that the drive uses all of itself as a cache, you
            | can run a full-drive sequential write test (like the one in
            | HD Tune Pro) and analyze the speed graph (a rough Linux
            | equivalent is sketched at the end of this comment). If, say,
            | a 480GB drive writes at full speed for the first 120GB, and
            | then the write speed drops for the remaining 360GB, the
            | drive is suitable for this kind of use.
           | 
            | I think controllers might be doing some GC jobs to always
            | keep some number of cells ready for pSLC use, but it should
            | be a few GBs at most and shouldn't affect the use case
            | depicted here.
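            | 
            | A rough Linux equivalent of that HD Tune pass (destructive;
            | /dev/sdX is a placeholder):
            | 
            |     fio --name=fullwrite --filename=/dev/sdX --rw=write \
            |         --bs=1M --direct=1 --ioengine=libaio \
            |         --write_bw_log=fullwrite
            | 
            | Plot the resulting bandwidth log and look for the cliff
            | where the drive falls out of pSLC mode.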
        
           | Dylan16807 wrote:
           | > Partitioning off a small section of the drive feels very
           | 160 GB SCSI "Let's only use the outer sectors".
           | 
           | In that it was very reliable at accomplishing the goal?
        
             | bayindirh wrote:
              | Short-stroking bigger disks for higher IOPS and storage
              | speed was a _de facto_ method in some HPC centers. Do this
              | to a sufficiently large array and you can see unbelievable
              | IOPS numbers _for that generation of hardware_.
        
         | userbinator wrote:
         | That is what an ideal FTL would do if only a fraction of the
         | LBAs are accessed, but as you say some manufacturers will
         | customise the firmware to do otherwise, while this mod
         | basically guarantees that the whole space is used as SLC.
        
       | justinclift wrote:
       | It mentions the required tool being available from um...
       | interesting places.
       | 
       | Doing a Duck Duck Go search on the "SMI SM2259XT2 MPTool FIMN48
       | V0304A FWV0303B0" string in the article shows this place has the
       | tool for download:
       | 
       | https://www.usbdev.ru/files/smi/sm2259xt2mptool/
       | 
       | The screenshot in the article looks to be captured from that site
       | even. ;)
       | 
       | Naturally, be careful with anything downloaded from there.
        
         | gaius_baltar wrote:
          | There were several instances where I saw an interesting tool
          | for manipulating SSDs and SD cards only available from strange
          | Russian websites. This one at least has an English UI... A lot
          | of research seems concentrated there, and I wonder why it did
          | not catch the same level of interest in the West.
        
           | justinclift wrote:
           | Yeah. That site has a lot of info for a huge number of flash
           | controllers/chipsets/etc.
           | 
           | Wish I had a bunch of spare time to burn on stuff like this.
           | :)
        
             | fuzzfactor wrote:
             | Good to see they are still available.
             | 
             | The wide variety of controller/memory combinations makes it
             | quite a moving target.
             | 
             | This is the "mass production" software that makes it
             | possible to provision, partition, format, and even place
             | pre-arranged data or OS's in position before shipping
             | freshly prepared drives to a bulk OEM customer. On one or
             | more "identical" drives at the same time.
             | 
             | For _USB flash thumb drives_ the same approach is used.
             | Factory software like this which is capable of modifying
             | the firmware of the device is unfortunately about the only
             | good way to determine the page size and erase block size of
             | a particular USB drive. If the logical sectors storing your
             | information are not aligned with the physical memory blocks
              | (which somewhat correspond to the "obsolete" CHS
              | geometry), the USB key will be slower than necessary,
              | especially on writes, due to write amplification, and it
              | will also wear out much sooner.
             | 
              | Care does not go into thumb drives like you would expect
              | from SSDs; it seems very often a single SKU will have
              | untold variations in controller/memory chips. It also
              | seems likely that during production discontinuities, when
              | the supply of one of these ICs on the BOM becomes
              | depleted, it is substituted with a dissimilar-enough chip
              | that a revision of the partitioning, formatting, and data
              | layout would be advised, but that does not take place
              | because nobody does it. And it still works anyway, so
              | nobody notices or cares. Or worse, it's recognized as an
              | engineering downgrade but downplayed as if in denial. Wide
              | variation in performance within a single SKU is a canary
              | for this, which can sometimes be rectified.
        
           | userbinator wrote:
           | _and I wonder why it did not catch the same level of interest
           | in the west._
           | 
           | Because people in the west are too scared of IP laws.
        
         | drtgh wrote:
          | I was unable to find the source code, so it is important to be
          | careful. For me it requires a leap of faith that I don't have
          | (my apologies to the developers).
         | 
         | In any case, this is a feature that manufacturers should
         | provide. I wonder how it could be obtained.
        
           | justinclift wrote:
           | > I wonder how it could be obtained.
           | 
           | Reverse engineering and a huge amount of free time.
        
           | userbinator wrote:
            | That _is_ the actual manufacturer's tool.
        
         | sampa wrote:
         | oh that western superiority complex... hits once again...
         | beside the mark
        
           | justinclift wrote:
           | > western superiority complex
           | 
           | What are you on about?
        
         | userbinator wrote:
         | In countries where people have been less conditioned to be
         | mindless sheep, you can more easily find lots of truth that
         | doesn't toe the official line.
         | 
         | Spreading xenophobic FUD only serves to make that point
         | clearer: you can't argue with the facts, so you can only sow
         | distrust.
        
           | justinclift wrote:
           | > Spreading xenophobic FUD
           | 
           | ?
        
       | aristus wrote:
       | About ten years ago I got my hands on some of the last production
       | FusionIO SLC cards for benchmarking. The software was an in-
       | memory database that a customer wanted to use with expanded
       | capacity. I literally just used the fusion cards as swap.
       | 
       | After a few minutes of loading data, the kernel calmed down and
       | it worked like a champ. Millions of transactions per second
       | across billions of records, on a $500 computer... and a card that
       | cost more than my car.
       | 
       | Definitely wouldn't do it that way these days, but it was an
       | impressive bit of kit.
        
         | sargun wrote:
          | I worked at a place where I can say FusionIO saved the
          | company. We had a single Postgres database which powered a
          | significant portion of the app. We tried to kick off a
          | horizontal scaling project around it, to little success - turns
          | out that partitioning is hard on a complex, older codebase.
         | 
          | Somehow we ended up with a FusionIO card in tow. We went from
          | something like 5,000 read QPS to 300k read QPS on pgbench
          | using the cheapest 2TB card.
         | 
         | Ever since then, I've always thought that reaching for vertical
         | scale is more tenable than I originally thought. It turns out
         | hardware can do a lot more than we think.
        
           | hinkley wrote:
           | The slightly better solution for these situations is to set
           | up a reverse proxy that sends all GET requests to a read
           | replica and the server with the real database gets all of the
           | write traffic.
           | 
           | But the tricky bit there is that you may need to set up the
           | response to contain the results of the read that is triggered
           | by a successful write. Otherwise you have to solve lag
           | problems on the replica.
        
         | hinkley wrote:
          | In the nineties they used battery-backed RAM that cost more
         | than a new car for WAL data on databases that desperately
         | needed to scale higher.
        
         | linsomniac wrote:
         | Back when the first Intel SSDs were coming out, I worked with
         | an ISP that had an 8 drive 10K RAID-10 array for their mail
         | server, but it kept teetering on the edge of not being able to
         | handle the load (lots of small random IO).
         | 
         | As an experiment, I sent them a 600GB Intel SSD in laptop drive
         | form factor. They took down the secondary node, installed the
         | SSD, and brought it back up. We let DRBD sync the arrays, and
         | then failed the primary node over to this SSD node. I added the
         | SSD to the logical volume, then did a "pvmove" to move the
         | blocks from the 8 drive array to the SSD, and over the next few
         | hours the load steadily dropped down to nothing.
         | 
         | It was fun to replace 8x 3.5" 10K drives with something that
         | fit comfortably in the palm of my hand.
        
       | RA2lover wrote:
       | Could this be used to extend the lifetime of an already worn-out
        | SSD? I wonder if there's some business in China taking those and
       | reflashing them as "new".
        
         | dannyw wrote:
         | Technically, QLC NAND that is no longer able to distinguish at
         | QLC levels should certainly still be suitable as MLC for a
         | while longer, and SLC, for all practical intents and purposes,
         | forever.
        
         | chasil wrote:
          | The only rejuvenation process that I know of is heat: either
          | long-period exposure to 250degC or short-term exposure at a
          | higher temperature (800degC).
         | 
         | https://m.hexus.net/tech/news/storage/48893-making-flash-mem...
         | 
          | https://m.youtube.com/watch?v=H4waJBeENVQ
        
           | justinclift wrote:
           | So the trick is to somehow redirect all of the heat energy
           | coming from cpus onto the storage, in bursts? :D
        
             | kasabali wrote:
              | We're getting closer: now they're putting heatsinks on the
              | controller chips of SSDs :D
        
           | userbinator wrote:
           | That first article was 12 years ago when MLC was the norm and
           | had 10k endurance.
           | 
           |  _Macronix have known about the benefits of heating for a
           | long time but previously used to bake NAND chips in an oven
           | at around 250C for a few hours to anneal them - that's an
           | expensive and inconvenient thing to do for electronic
           | components!_
           | 
           | I wonder if the e-waste recycling operations in China may be
           | doing that to "refurbish" worn out NAND flash and resell it.
           | They already do harvesting of ICs so it doesn't seem
           | impossible... and maybe this effect was first noticed by
           | someone heating the chips to desolder them.
        
       | loeg wrote:
       | DIWhy type stuff. Still, fun hack. TLC media has plenty of
       | endurance. We see approximately 1.3-1.4x NAND write amplification
       | in production workloads at ~35% fill rate with decent TRIMing.
        
       | riobard wrote:
       | Is it possible for SSD firmware to act "progressively" from SLC
       | to MLC to TLC and to QLC (and maybe PLC in the future)? E.g. for
       | a 1TB QLC SSD, it would act as SLC for usage under 256GB, then
       | MLC under 512GB, then TLC under 768GB, and then QLC under 1TB
       | (and PLC under 1280GB).
        
         | wtallis wrote:
         | It's theoretically possible, but in practice when a drive is
         | getting close to full what makes sense is to compact data from
         | the SLC cache into the densest configuration you're willing to
         | allow, without any intermediate steps.
        
         | hotstickyballs wrote:
          | That's just a normal SSD rated at the QLC capacity.
        
       | msarnoff wrote:
       | I'd also recommend this if you're using eMMC in embedded devices.
       | On a Linux system, you can use the `mmc` command from `mmc-utils`
       | to configure your device in pSLC mode. It can also be done in
       | U-Boot but the commands are a bit more obtuse. (It's one-time
       | programmable, so once set it's irreversible.)
       | 
       | In mass-production quantities, programming houses can
       | preconfigure this and any other eMMC settings for you.
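        | 
        | A sketch of what that looks like with mmc-utils (from memory of
        | its interface; double-check against your part's datasheet, since
        | the change is one-time programmable):
        | 
        |     mmc extcsd read /dev/mmcblk0 | grep -i enh   # max enhanced-area size
        |     # turn the whole user area into pSLC ("enhanced") mode;
        |     # <size-KiB> is a placeholder for the value read above
        |     mmc enh_area set -y 0 <size-KiB> /dev/mmcblk0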
        
         | eternityforest wrote:
         | That makes eMMC slightly less awful!
        
       | userbinator wrote:
       | What isn't prominently mentioned in the article is that endurance
       | and retention are highly related --- flash cells wear out by
       | becoming leakier with each cycle, and so the more cycles one goes
       | through, the faster it'll lose its charge. The fact that SLC only
       | requires distinguishing between two states instead of 16 for QLC
       | means that the drive will also hold data for (much) longer in SLC
       | mode for the same number of cycles.
       | 
       | In other words, this mod doesn't only mean you get extreme
       | endurance, but retention. This is usually specified by
       | manufacturers as N years after M cycles; early SLC was rated for
       | 10 years after 100K cycles, but this QLC might be 1 year after
       | 900 cycles, or 1 year after 60K cycles in SLC mode; if you don't
       | actually cycle the blocks that much, the retention will be much
       | higher.
       | 
       | I'm not sure if the firmware will still use the stronger ECC
       | that's required for QLC vs. SLC even for SLC mode blocks, but if
       | it does, that will also add to the reliability.
        
       ___________________________________________________________________
       (page generated 2024-05-19 23:00 UTC)