[HN Gopher] Transforming a QLC SSD into an SLC SSD
___________________________________________________________________
Transforming a QLC SSD into an SLC SSD
Author : userbinator
Score : 207 points
Date : 2024-05-19 09:30 UTC (13 hours ago)
(HTM) web link (theoverclockingpage.com)
(TXT) w3m dump (theoverclockingpage.com)
| mofosyne wrote:
| It would be nice if manufacturers provided a way to downgrade an
| SSD to SLC via some driver setting.
| NL807 wrote:
| but how would they make more money?
| Lex-2008 wrote:
| I'd buy such a device. Currently I'm holding on to my last
| pair of SSDs from the pre-QLC era, refusing to buy anything new.
| LoganDark wrote:
| There are still new SSDs that use TLC, such as the Silicon
| Power UD90 (I have one in my system). Not only that, some
| of them will run in SLC mode when writing new data and then
| move the data to TLC later - advertised as SLC Caching -
| which could be even better than always-TLC drives (even
| ones with a DRAM cache).
| drtgh wrote:
| Your comment, along with those of other users, suggests that
| TLC is a positive attribute for consumers; however, the
| transition from SLC and MLC NAND to TLC and QLC 3D-NAND
| actually marked a decline in the longevity of SSDs.
|
| Using a mode other than SLC with current SSDs is insane given
| how different current 3D-NAND is from planar NAND: it
| consumes writes for everything.
|
| 3D-NAND: reading data consumes writes [0]:
| " Figure 1a plots the average SSD lifetime consumed by
| the read-only workloads across 200 days on three SSDs
| (the detailed parameters of these SSDs can be found from
| SSD-A/-B/-C in Table 1). As shown in the figure, the
| lifetime consumed by the read (disturbance) induced
| writes increases significantly as the SSD density
| increases. In addition, increasing the read throughput
| (from 17MBps to 56/68MBps) can greatly accelerate the
| lifetime consumption. Even more problematically, as the
| density increases, the SSD lifetime (plotted in Figure
| 1b) decreases. In addition, SSD-aware write-reduction-
| oriented system software is no longer sufficient for
| high-density 3D SSDs, to reduce lifetime consumption.
| This is because the SSDs entered an era where one can
| wear out an SSD by simply reading it."
|
| 3D-NAND: data retention consumes writes [1]:
| " 3D NAND flash memory exhibits three new error sources
| that were not previously observed in planar NAND flash
| memory: (1) layer-to-layer process
| variation, a new phenomenon specific to the 3D
| nature of the device, where the average error rate of
| each 3D-stacked layer in a chip is significantly
| different; (2) early retention loss,
| a new phenomenon where the number of errors due to charge
| leakage increases quickly within several hours after
| programming; and (3) retention interference,
| a new phenomenon where the rate at which charge leaks
| from a flash cell is dependent on the data value stored
| in the neighboring cell. "
|
| [0] https://dl.acm.org/doi/10.1145/3445814.3446733
|
| [1] https://ghose.cs.illinois.edu/papers/18sigmetrics_3dflash.pd...
| jsheard wrote:
| Even datacenter-grade drives scarcely use SLC or MLC
| anymore since TLC has matured to the point of being more
| than good enough even in most server workloads, what
| possible need would 99% of consumers have for SLC/MLC
| nowadays?
|
| If you really want a modern SLC drive there's the Kioxia
| FL6, which has a whopping 350,400 TB of write endurance
| in the 3TB variant, but it'll cost you $4320.
| Alternatively you can get 4TB of TLC for $300 and take
| your chances with "only" 2400 TB endurance.
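|
| For a rough sense of the tradeoff in cost per rated TB written,
| using only the list prices and endurance figures quoted above (a
| back-of-the-envelope sketch that ignores capacity and workload
| differences):
|
|   bc -l <<< "4320 / 350400"   # ~0.012 dollars per TB written (SLC FL6)
|   bc -l <<< "300 / 2400"      # ~0.125 dollars per TB written (4TB TLC)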
| drtgh wrote:
| TLC cannot mature as long as it continues to use 3D-NAND
| without more advanced materials science. Reading data and
| preserving data consume writes, which degrades the memory,
| because the traces in the vertical stack of the circuit
| create interference.
|
| Perhaps there are techniques available to separate the
| traces, but wouldn't this ultimately increase the surface
| area? That seems to be something they are trying to avoid.
|
| You should not use datacenter SSDs as a reference, as they
| typically do not last more than two and a half years. It
| appears to be a profitable opportunity for the SSD
| manufacturers, and increasing longevity does not seem to be
| a priority.
|
| To be more specific, we are talking about planned
| obsolescence for consumer and enterprise SSDs.
|
| > If you really want a modern SLC drive there's the
| Kioxia FL6, which has a whopping 350,400 TB of write
| endurance in the 3TB variant, but it'll cost you $4320.
|
| Did you read the OP article?
| LoganDark wrote:
| I got 4TB of TLC for $230 (Silicon Power UD90). It even
| has SLC caching (can use parts of the flash in SLC mode
| for short periods of time).
| jsheard wrote:
| True, I was looking at the prices for higher end drives
| with on-board DRAM, but DRAM-less drives like that UD90
| are also fine in the age of NVMe. Going DRAM-less was a
| significant compromise on SATA SSDs, but NVMe allows the
| drive to borrow a small chunk of system RAM over PCIe
| DMA, and in practice that works well enough.
|
| (Caveat: that DMA trick doesn't work if you put the drive
| in a USB enclosure, so if that's your use-case you should
| ideally still look for a drive with its own DRAM)
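|
| If you're curious whether a given DRAM-less drive is actually
| using a host memory buffer, on Linux nvme-cli can report it
| (/dev/nvme0 below is a placeholder for your device):
|
|   # feature 0x0d is Host Memory Buffer; -H prints a decoded view
|   nvme get-feature /dev/nvme0 -f 0x0d -H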
| LoganDark wrote:
| > Your comment, along with those of other users, suggests
| that TLC is a positive attribute for consumers
|
| TLC is better than QLC, which is specifically what my
| comment was addressing; I never implied that it's better
| than SLC though, so just don't, please.
|
| It's interesting to see that 3D-NAND has other issues
| even when run in SLC mode, though.
| drtgh wrote:
| > I never implied that it's better than SLC though, so
| just don't, please.
|
| My apologies.
|
| > It's interesting to see that 3D-NAND has other issues
| even when run in SLC mode, though.
|
| Basically, the SSD manufacturers are increasing capacity by
| adding more layers (3D-NAND). When one cell in the vertical
| stack is read, the interference produced by the traces in the
| area increases the number of cells that need to be rewritten,
| which consumes the life of the device by design.
| wtallis wrote:
| > When one cell in the vertical stack is read, the
| interference produced by the traces in the area increases the
| number of cells that need to be rewritten, which consumes the
| life of the device by design.
|
| You should try being honest about the magnitude of this
| effect. It takes _thousands_ of read operations at a
| minimum to cause a read disturb that can be fixed with
| _one_ write. What you're complaining about is the NAND
| equivalent of DRAM rowhammer. It's not a serious problem
| in practice.
| drtgh wrote:
| It's not the NAND equivalent of rowhammer: the larger the
| stack, the more writes to the contiguous cells, not just a
| rewrite of a single cell.
|
| Here, the dishonest ones are the SSD manufacturers of the
| last decade, who are feeling comfortable enough to introduce
| QLC into the market.
|
| > It's not a serious problem in practice.
|
| It's as serious as reads consuming the disk: the faster it's
| read, the faster it's consumed [0]. You should have noticed
| that SSDs no longer come with a 10-year warranty.
| "under low throughput
| read-only workloads, SSD-A/-B/-C/-D/-E/-F extensively
| rewrite the potentially-disturbed data in the background,
| to mitigate the read (disturbance) induced latency
| problem and sustain a good read performance. Such
| rewrites significantly consume the already-reduced SSD
| lifetime. "
|
| Under low throughput read-only workloads.
|
| It is a paper from 2021, which means sci-hub can be used
| to read it.
|
| [0] https://dl.acm.org/doi/10.1145/3445814.3446733
| wtallis wrote:
| > It's as serious as reads consuming the disk: the faster
| it's read, the faster it's consumed
|
| Numbers, please. Quantify that or GTFO. You keep quoting
| stuff that implies SSDs are horrifically unreliable and
| burning through their write endurance alarmingly fast.
| But the reality is that even consumer PCs with cheap SSDs
| are not experiencing an epidemic of premature SSD
| failures.
|
| EDIT:
|
| > You should have noticed that SSD disks no longer come
| with a 10-year warranty.
|
| 10-year warranties were never common for SSDs. There was
| a brief span of time where the flagship consumer SSDs
| from Samsung and SanDisk had 10-year warranties because
| they were trying to one-up each other and couldn't
| improve performance any further because they had
| saturated what SATA was capable of. The fact that those
| 10-year warranties existed for a while and then went away
| says _nothing_ about trends in the true reliability of
| the storage. SSD warranties and write endurance ratings
| are dictated primarily by _marketing_ requirements.
| drtgh wrote:
| In a 2-minute search:
|
| https://www.reddit.com/r/DataHoarder/comments/150orlb/enterp...
| "So, on page 8's graphs, they
| show that 800GB-3800GB 3D-TLC SSDs had a very low "total
| drive failure" rate. But as soon as you got to 8000GB and
| 15000GB, the drives had a MASSIVE increase in risk that
| the entire drive has hardware errors and dies, becomes
| non-responsive, etc."
|
| Study:
| https://www.usenix.org/system/files/fast20-maneas.pdf
|
| (with video): https://www.usenix.org/conference/fast20/presentation/maneas
| wtallis wrote:
| Would you care to explain how any of that supports the
| points you're actually making here?
|
| Some of what you're spamming seems to directly undermine
| your claims, eg.:
|
| > Another finding is that SLC (single level cell), the
| most costly drives, are NOT more reliable than MLC
| drives. And while the newest high density 3D-TLC (triple
| level cell) drives have the highest overall replacement
| rate, the difference is likely not caused by the 3D-TLC
| technology
| drtgh wrote:
| "likely" not caused by. Any case I delete such spamming?
| link.
|
| > Would you care to explain how any of that supports the
| points you're actually making here?
|
| Another day, if you don't mind.
| Dylan16807 wrote:
| The massive increase is still a 1-in-500 chance per year.
| wtallis wrote:
| > Your comment, along with those of other users, suggests
| that TLC is a positive attribute for consumers; however, the
| transition from SLC and MLC NAND to TLC and QLC 3D-NAND
| actually marked a decline in the longevity of SSDs.
|
| The bit that you're pointedly ignoring and that none of
| your quotes address is the fact that SLC SSDs had _far_
| more longevity than anyone really needed. Sacrificing
| longevity to get higher capacity for the same price was
| the _right tradeoff_ for consumers and almost all server
| use cases.
|
| The fact that 3D NAND has some new mechanisms for data to
| be corrupted is _pointless trivia_ on its own, bordering
| on fearmongering the way you're presenting it. The real
| impact these issues have on overall drive lifetime,
| compared to realistic estimates of how much lifespan
| people actually need from their drives, is not at all
| alarming.
|
| Not using SLC is not insane. Insisting on using SLC
| everywhere is what's insane.
| NL807 wrote:
| The point I was making is that there is no profit to be
| made by extending the life of drives. And a sample size of
| one (i.e. you) is not representative of the market. There
| is always a demand for storage and people will keep buying
| worse products because there is no other choice.
| AnthonyMouse wrote:
| I don't understand this logic. Consider the two
| possibilities here.
|
| The first is that only weird tech people are interested
| in doing this. Then they might as well allow it because
| it's a negligible proportion of the market but it makes
| those customers favor rather than dislike your brand, and
| makes them more likely to recommend your devices to
| others, which makes you some money.
|
| The second is that it would be widely popular and large
| numbers of customers would want to do it, and thereby
| choose the drives that allow it. Then if Samsung does it
| and SanDisk doesn't, or vice versa, they take more of the
| other's customers. Allowing it is the thing makes them
| _more_ money.
|
| Meanwhile the thing that trashes most SSDs isn't wear,
| it's obsolescence. There are millions of ten year old QLC
| SSDs that are perfectly operational because they lived in
| a desktop and saw five drive writes over their entire
| existence. They're worthless not because they don't work,
| but because a drive which is newer and bigger and faster
| is $20. It costs the manufacturer nothing to let them be
| more reliable because they're going in the bin one way or
| the other.
|
| The status quo seems like MBAs cargo culting some
| heuristic where a company makes money in proportion to
| how evil they are. Companies actually make money in
| proportion to how much money they can get customers to
| spend with them. Which often has something to do with how
| much customers like them.
| userbinator wrote:
| _There are millions of ten year old QLC SSDs_
|
| In 2014 QLC was nothing but a research curiosity. The
| first QLC SSD was introduced in 2018:
|
| https://www.top500.org/news/intel-micron-ship-first-qlc-flas...
|
| You have to also remember that people buy storage
| expecting it to last. I have decades-old magnetic media
| which is tiny but still readable.
| bayindirh wrote:
| Samsung 870EVO (SSD), 980 Pro/990 Pro (NVMe) are all TLC
| drives. Kingston KC3000 is faster than 980 Pro, hence it's
| probably TLC, too.
| jsheard wrote:
| A decent rule of thumb is that if a drive uses TLC, it
| will probably say so in the spec sheet.
|
| If it's left ambiguous then it's either QLC, or a lottery
| where the "same" model may be TLC or QLC.
| bayindirh wrote:
| Kingston NV2 is in that "what you get may differ"
| category, and Kingston explicitly says that what you get
| may change. I have two NV2s with differing die count, for
| example. Their controller might be different too. They're
| external, short-use drives, so I don't care.
|
| So, returning to the previously mentioned ones, from their
| respective datasheets:
| - 870 EVO: Samsung V-NAND 3bit MLC
| - 980 Pro: Samsung V-NAND 3bit MLC
| - 990 Pro: Samsung V-NAND TLC
| - KC3000: NAND: 3D TLC
| - NV2: NAND: 3D // Explicit lottery.
| ngcc_hk wrote:
| A bit confused... the article is about an SSD in the ~500 MB/s
| range. Does what it says, and the details discussed here, also
| apply to NVMe drives at 1000+ MB/s? Is it the same?
| numpad0 wrote:
| The great thing about disks is that they don't require drivers
| at all. A driver-settings Windows app is not going to be open
| sourced if such a thing were to exist.
| surajrmal wrote:
| While SSDs do not, all flash chips do. So if you were ever
| going to try building your own SSD, or simply connect some
| flash directly to your SoC via some extra pins, you would be
| able to program them this way. I imagine extending NVMe to
| offer this would be possible if there were enough popular
| demand.
| namibj wrote:
| NVMe already supports low level reformatting.
| Havoc wrote:
| Wild! I had assumed this was a hardware-level distinction.
| LoganDark wrote:
| How many bits a particular NAND chip _can_ store per cell is
| presumably hardware-level, but I believe it's possible to
| achieve SLC on all of them anyway, even if they support TLC or
| QLC.
|
| Hell, the Silicon Power NVMe SSD I have in my machine right now
| will use SLC for writes, then (presumably) move that data later
| to TLC during periods of inactivity. Running the NAND in SLC
| mode is a feature of these drives, it's called "SLC caching".
| magicalhippo wrote:
| Of course it is trivial to just write 000 for zero and 111 for
| one in the cells of a TLC SSD to turn it into effectively an SLC
| SSD, but that in itself doesn't explain why it's so much faster
| to read and write compared to TLC.
|
| For example, if it had been DRAM where the data is stored as
| charge on a capacitor, then one could imagine using a R-2R
| ladder DAC to write the values and a flash ADC to read the
| values. In that case there would be no speed difference based
| on how many effective levels were stored per cell (ignoring noise
| and such).
|
| From what I can gather, the reason the pseudo-SLC mode is
| faster is down to how flash is programmed and read, and relies
| on the analog nature of flash memory.
|
| Like DRAM there's still a charge that's being used to store the
| value, however it's not just in a plain capacitor but in a
| double MOSFET gate[1].
|
| The amount of charge changes the effective threshold voltage of
| the transistor. Thus to read, one needs to apply different
| voltages to see when the transistor starts to conduct[2].
|
| To program a cell, one has to inject some amount of charge that
| puts the threshold voltage to a given value depending on which
| bit pattern you want to program. Since one can only inject
| charge, one must be careful not to overshoot. Thus one uses a
| series of brief pulses, each followed by a read cycle to see if
| the required level has been reached or not[3], repeating as
| needed. So the more levels per cell, the shorter the pulses need
| to be, and the more read cycles are needed to ensure the
| required amount of charge is reached.
|
| When programming the multi-level cell in single-level mode, you
| can get away with just a single, larger charge injection[4].
| And when reading the value back, you just need to determine if
| the transistor conducts at a single level or not.
|
| So to sum up, pseudo-SLC does not require changes to the multi-
| level cells as such, but it does require changes to how those
| cells are programmed and read. So most likely it requires
| changing those circuits somewhat, meaning you can't implement
| this just in firmware.
|
| [1]: https://en.wikipedia.org/wiki/Flash_memory#Floating-gate_MOS...
|
| [2]:
| https://dr.ntu.edu.sg/bitstream/10356/80559/1/Read%20and%20w...
|
| [3]: https://people.engr.tamu.edu/ajiang/CellProgram.pdf
|
| [4]: http://nyx.skku.ac.kr/publications/papers/ComboFTL.pdf
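|
| A toy model of that program-and-verify loop (made-up step sizes,
| only meant to show how the pulse/verify count grows with the
| number of levels per cell):
|
|   awk 'BEGIN {
|     for (bits = 1; bits <= 4; bits++) {
|       levels = 2 ^ bits
|       step   = 1 / (4 * levels)       # pulse stays well inside one level
|       target = (levels - 1) / levels  # highest programmed level
|       v = 0; pulses = 0
|       while (v < target) { v += step; pulses++ }  # inject, then verify
|       printf "%d bit(s)/cell, %2d levels: ~%2d pulses\n", bits, levels, pulses
|     }
|   }'
|
| That prints roughly 4, 12, 28 and 60 pulses for SLC/MLC/TLC/QLC,
| matching the intuition above: more levels means finer steps and
| more verify reads per program operation.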
| hwbunny wrote:
| Silicon Motion controllers are trash.
| mardifoufs wrote:
| I thought they were the best in class. What are the alternatives?
| nickcw wrote:
| This hack seems to take a 480GB SSD and transform it into a 120GB
| SSD
|
| However the write endurance (the amount of data you can write to
| the SSD before expecting failures) increases from 120TB to 4000TB
| which could be a very useful tradeoff, for example if you were
| using the disk to store logs.
|
| I've never seen this offered by the manufacturers though (maybe I
| haven't looked in the right place); I wonder why not?
| dist-epoch wrote:
| Manufacturers offer that, in the form of TLC drives. Which are
| supported, unlike this hack which might cause data loss.
|
| This gives you 120GB with 4000TB write endurance, but you can
| buy a 4TB TLC drive with 3000TB write endurance for $200.
| greggsy wrote:
| Then you could use this technique to achieve something like a
| 1.2TB disk with 40PB TBW?
|
| I'd be fascinated to hear any potential use cases for that
| level of endurance in modern data storage.
| justinclift wrote:
| > use cases for that level of endurance in modern data
| storage.
|
| All flash arrays. Saying that, as I have a bunch of smaller
| (400GB) 2.5" SAS SSDs combined into larger all-flash
| arrays, with each one of those SSDs rated for about 30PB
| of endurance.
|
| I'm expecting the servers to be dead by the time that
| endurance is exhausted though. ;)
| greggsy wrote:
| Exactly, I've done similar maths on my disks, and
| realised that it would be 20 years before they approach
| their end of life.
|
| By which point, they will be replaced by some new tech that
| is cheaper, faster, more reliable and more power efficient.
| causality0 wrote:
| Which drive would that be? The ones I'm seeing cost a lot
| more than $200.
| hippich wrote:
| I'll second that question!
| BertoldVdb wrote:
| SSD prices fluctuate a lot. I recently bought 4TB SSDs for
| 209eu but they are more expensive now (SNV2S/4000G, QLC
| though)
| 5e92cb50239222b wrote:
| My friend picked up a 3.84 TB Kingston SEDC600M with 7 PB
| of write endurance on sale for $180 a couple of months ago.
| That same place now sells them for around $360. Definitely
| an original drive. Maybe you just have to be on the lookout
| for one for when they go on sale.
| greggsy wrote:
| I wonder if it would be useful as cache disks for ZFS or
| Synology (with further tinkering)?
| dgacmu wrote:
| To dive slightly into that: You don't necessarily want to
| sacrifice space for a read cache disk: having more space can
| reduce writes as you do less replacement.
|
| But where you want endurance is for a ZIL SLOG (the write
| cache, effectively). Optane was great for this because of
| really high endurance and very low latency persistent writes,
| but, ... Farewell, dear optane, we barely knew you.
|
| The 400GB optane card had an endurance of 73 PB written.
| Pretty impressive, though at almost $3/GB it was really
| expensive.
|
| This would likely work but as a sibling commenter noted,
| you're probably better off with a purpose-built, high
| endurance drive. Since it's a write cache, just replace it a
| little early.
| sneak wrote:
| AIUI the slog is only for synchronous writes; most people
| using zfs at home don't do any of those (unless you set
| sync=always which is not the default).
|
| https://jrs-s.net/2019/05/02/zfs-sync-async-zil-slog/
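|
| For reference, a quick way to check or force that behaviour
| ("tank" is a placeholder pool name); only use sync=always if you
| actually need every write to go through the ZIL/SLOG:
|
|   zfs get sync tank          # "standard" = only sync writes use the ZIL/SLOG
|   zfs set sync=always tank   # treat all writes as synchronous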
| gdevenyi wrote:
| Under-provisioning has been the standard recommendation for
| ZFS SSD cache/log/L2ARC drives since those special types became
| a thing.
| liuliu wrote:
| Optane 905p goes for $500 a piece (1T) I believe.
| nine_k wrote:
| For how long?
|
| Terrific for a hobby project, build farm, or even a
| business in a prototype stage (buy 3-4 then).
|
| Hardly acceptable in a larger setting where continuity in
| 10 years is important. Of course, not the exact same part
| available in 10 years (which is not unheard of, though),
| but something compatible or at least comparable.
| wtallis wrote:
| If you have a scenario where Optane makes sense today, in
| 10 years it'll be cost effective to use at least that
| much DRAM, backed by whatever storage is mainstream then
| and whatever capacitors or batteries you need to safely
| flush that DRAM to storage.
|
| A dead-end product on clearance sale isn't the right
| choice for projects where you need to keep a specific
| mission-critical machine running for a decade straight.
| But for a lot of projects, all that really matters is
| that in a few years you can set up a new system with
| equal or better performance characteristics and not need
| to re-write your application to work well on the new
| hardware. I think all of the (vanishingly few) scenarios
| where Optane NVMe SSDs make sense fall into the latter
| category. (I feel sorry for anyone who invested
| significant effort into writing software to use Optane
| DIMMs.)
| greggsy wrote:
| I've often wondered when the DRAM-backed storage
| revolution was going to arrive.
|
| Not long ago, 64GB SSDs were the bare minimum you could
| get away with, and only the most expensive setups had
| 64GB RAM. Now we're seeing 64GB modules for consumer
| laptops priced reasonably cheaply.
|
| I wonder: if RAM prices head towards the $0.05/GB (around $50
| for the cheapest 1TB) that we're currently seeing for
| SSDs, would that allow the dream of a legitimately useful
| RAM disk to become a reality?
| BertoldVdb wrote:
| There are companies selling SLC SSDs (often using TLC or QLC
| flash but not running it in that mode) for industrial applications, for
| example Swissbit.
| userbinator wrote:
| But they cost far more than what SLC should be expected to
| cost (4x the price of QLC or 3x the price of TLC.) The clear
| answer to the parent's question is planned obsolescence.
| BearOso wrote:
| I don't understand how the author goes from 3.8 WAF (Write
| Amplification Factor) to 2.0 WAF and gets a 30x increase in
| endurance. I'd expect about 2x from that.
|
| From what I can see, he seems to be taking the 120TBW that the
| OEM warranties on the drive for the initial result, but then
| using the NAND's P/E cycles spec for the final result, which
| seems suspicious.
|
| The only thing that I could be missing is that the NAND going
| to pSLC mode somehow increases the P/E cycles drastically, like
| requiring a massively lower voltage to program the cells. But I
| think that would be included in the WAF measure.
|
| What am I missing?
| wtallis wrote:
| QLC memory cells need to store and read back the voltage much
| more precisely than SLC memory cells. You get far more P/E
| cycles out of SLC because answering "is this a zero or a
| one?" remains fairly easy long after the cells are too worn
| to reliably distinguish between sixteen different voltage
| levels.
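|
| Roughly, using the figures quoted elsewhere in this thread (the
| article's 480GB -> 120GB conversion, WAF going from 3.8 to 2.0,
| and guesses of ~900 P/E cycles in QLC mode vs ~60K in pSLC mode;
| none of these are datasheet numbers):
|
|   bc -l <<< "480 * 900 / 3.8"    # ~113,000 GB, ~113 TB of host writes (QLC)
|   bc -l <<< "120 * 60000 / 2.0"  # 3,600,000 GB = 3,600 TB of host writes (pSLC)
|
| So most of the ~30x comes from the pSLC P/E rating; the WAF
| improvement only partially offsets the 4x capacity loss.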
| 1oooqooq wrote:
| The author is wrong. What you mention is only true for an
| actual SLC chip+firmware. QLC drives probably don't even have
| the hardware to use the entire cell as SLC, and they adopt one
| of N methods to save time/writes/power by underutilizing the
| resolution of the cell. None gives you all the benefits; each
| increases the downsides to improve one upside.
|
| And you can't choose.
| wtallis wrote:
| Respectfully: to the extent that I can understand what
| you're trying to say, you don't seem to know what you're
| talking about. Stop trying so hard to bash the whole
| industry for making tradeoffs you don't agree with, and
| put a little more effort into understanding how these
| things actually work.
| 1oooqooq wrote:
| We are all here reading a machine-translated article from a
| pt_br warez forum on using the wrong firmware on an SSD
| controller to talk to the firmware on a "smart" NAND flash, to
| mimic a semblance of control over your own device.
|
| But yeah, I'm the delusional one, and the industry is very sane
| and caring for the wishes of the consumer. Carry on.
| wtallis wrote:
| See, you're _still_ taking every opportunity to rant
| about "control", while continuing to get the technical
| details wrong (this time: the bit about firmware on a
| smart NAND chip, which is a real thing but not what the
| article is about). You're not even bothering to make a
| cogent case for _why_ retail SSDs should expose this low-
| level control, just repetitively complaining. You could
| have actually made the conversation more interesting by
| taking a different approach.
| 1oooqooq wrote:
| I could complain all day about how it's impossible to write a
| decent driver for an SSD here; even HDD drivers seem decent by
| comparison, which is a far cry. But besides amusing you, where
| do you think this will go?
| Dylan16807 wrote:
| Even if they do it in a slapdash way, it's still going to
| be 0 versus "pretty high" and that's a lot easier than
| gradients of 16ths. Dropping the endurance to match QLC
| mode would require intentional effort.
| 1oooqooq wrote:
| Data longevity depends on the implementation in the firmware,
| into which you have zero visibility. Most consumer drives will
| lower longevity.
| kozak wrote:
| Some Kingston SSDs allow you to manage over-provisioning (i.e. to
| choose the capacity-endurance tradeoff) by using a manufacturer-
| provided software tool.
| LoganDark wrote:
| I don't think that would change how many bits are stored per
| cell, though? If you, say, set overprovisioning to 80%, then
| that's going to be 80% of the QLC capacity, and it's going to
| use the remaining 20% still in QLC mode; it's not going to
| recognize that it can use SLC with 20% of the SLC capacity
| overprovisioned.
| Crosseye_Jack wrote:
| Yeah, all over-provisioning does is give the controller more
| spare cells to play with. The cells will still wear at the
| same rate as if you didn't over provision, however depending
| on how the controller is wear leveling it could further
| improve the life of the drive because each cell is being used
| less often.
|
| This mod (I only just skimmed the post) provides a longer
| life not by using the cells less often (or keeping more in
| reserve), but by extending each cell's life by relaxing the
| tolerance on the charge needed to store the state of the cell,
| in return decreasing the bits that can be stored in the cell
| and so decreasing the capacity.
| DeathArrow wrote:
| I thought QLC and TLC memory chips were different at the
| physical level, not that it was just a matter of firmware.
| dist-epoch wrote:
| There are physical differences, QLC requires more precise
| hardware, since you need to distinguish between more charge
| levels. But you can display a low-quality picture on a high-
| definition screen, or in a camera sensor average 4 physical
| pixels to get a virtual one; it's the same thing here: you
| combine together some charge levels for increased reliability.
|
| Put another way, you can turn a QLC into a TLC, but not the
| other way around.
| wtallis wrote:
| The memory cells are identical. The peripheral circuitry for
| accessing the memory array gets more complicated as you support
| more bits per cell, and the SRAM page buffers have to get
| bigger to hold the extra bits. But everyone designs their NAND
| chips to support operating with _fewer_ bits per cell.
|
| Sometimes TLC and QLC chips will be made in different sizes, so
| that each has the requisite number of memory cells to provide a
| capacity that's a power of two. But it's just as common for
| some of the chips to have an odd size, eg. Micron's first 3D
| NAND was sold as 256Gbit MLC or 384Gbit TLC (literally the same
| die), and more recently we've seen 1Tbit TLC and 1.33Tbit QLC
| parts from the same generation.
| willis936 wrote:
| I wish this kind of deep dive with bus transfer rates was more
| common. It would be great to have a block diagram that lists
| every important IC model number, its working clock frequency,
| and the bus width / clock rate between these ICs, for every SSD.
| kasabali wrote:
| You don't need to go through all that trouble to use most cheap
| DRAMless SSDs in pSLC mode. You can simply under-provision them
| by using only 25-33% of their capacity.
|
| Most low-end DRAMless controllers run in full-disk caching mode.
| In other words, they first write *everything* in pSLC mode until
| all cells are written; only after there are no cells left do they
| go back and rewrite/group some cells as TLC/QLC to free up some
| space. And they do it _only_ when necessary; they don't go and
| do that in the background to free up more space.
|
| So, if you simply create a partition 1/3 (for TLC) or 1/4 (for
| QLC) the size of the disk, and make sure the remaining empty
| space is TRIMMED and never used, it'll _always_ be writing in
| pSLC mode.
|
| You can verify that the SSD you're interested in runs in this
| mode by searching for "HD Tune" _full drive_ write benchmark
| results for it. If the write speed is fast for the first 1/3-1/4
| of the drive and then dips to abysmal speeds for the rest, you
| can be sure the drive is using full-drive caching mode. As I
| said, most of these low-end DRAMless Silicon Motion/Phison/Maxio
| controllers are, but of course the manufacturer might've modified
| the firmware to use a smaller cache (like Crucial did for
| the test subject BX500).
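|
| On Linux, something like fio can produce the same kind of
| speed-over-position data as HD Tune (a hedged sketch; /dev/sdX
| is a placeholder, and this write pass DESTROYS everything on it):
|
|   fio --name=fullwrite --filename=/dev/sdX --rw=write --bs=1M \
|       --direct=1 --ioengine=libaio --iodepth=4 \
|       --write_bw_log=fullwrite --log_avg_msec=1000
|
| If the resulting bandwidth log shows full speed for roughly the
| first 1/3 (TLC) or 1/4 (QLC) of the capacity and then a sharp
| drop, the drive is most likely doing full-drive pSLC caching.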
| chx wrote:
| How do you make sure the empty space is trimmed? Can you trim a
| portion of a disk?
| kasabali wrote:
| AFAIK Windows runs TRIM when you format a partition. So you
| can create a dummy partition and format it. Then you can
| either delete it or simply not use it.
|
| On Linux, blkdiscard can be used in the same manner (create a
| dummy partition and run blkdiscard on it, e.g. blkdiscard
| /dev/sda2).
| bravetraveler wrote:
| If one prefers working with LVM for their devices, that can
| be a similar wrench. Making/removing a logical volume can
| do the same
|
| It depends on 'issue_discards' being set in the config
|
| This has drifted over time. I haven't quite rationalized
| why. Hoping someone can remind me; if nothing else, I'm away
| from a real computer for a bit.
| rzzzt wrote:
| You can go into Properties > Tools > Optimize, the same
| button that runs defrag on spinning drives runs TRIM on
| solid state devices.
| formerly_proven wrote:
| Windows by default issues TRIMs basically instantly when
| deleting a file, and runs "Optimize disk" (which trims all
| free space) on a schedule by default as well.
| FooBarWidget wrote:
| What about external SSDs over USB? How do you trim those?
| wtallis wrote:
| There are trim-equivalent commands in the ATA, SCSI, and
| NVMe command sets. So the OS can issue the SCSI command
| to the USB device that's using UASP, and the bridge chip
| inside the external SSD can translate that to the ATA or
| NVMe counterpart before passing it on to the SSD
| controller behind the bridge. Not all external SSDs or
| bridge chips actually support trim passthrough, but these
| days it's standard functionality.
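|
| On Linux you can check whether a particular enclosure actually
| passes discards through with lsblk (/dev/sdb is a placeholder
| for the USB-attached drive):
|
|   # non-zero DISC-GRAN/DISC-MAX means the stack reports discard support
|   lsblk --discard /dev/sdb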
| FooBarWidget wrote:
| I wonder how to do it on macOS then. I have several
| external SSDs and none of them can be trimmed.
| wtallis wrote:
| https://kb.plugable.com/data-storage/trim-an-ssd-in-macos
|
| Apparently macOS doesn't expose the ability for userspace
| to manually issue trim commands to block devices (or at
| least doesn't ship a suitable tool to do so), so the best
| available workaround is to tell the filesystem layer that
| it should do automatic trimming even on external drives.
| ciupicri wrote:
| Create a partition that you'll never use and run blkdiscard
| [1] from util-linux on it.
|
| [1]: https://man7.org/linux/man-pages/man8/blkdiscard.8.html
| / https://github.com/util-linux/util-linux/blob/master/sys-uti...
| mananaysiempre wrote:
| The literal answer is yes, an ATA TRIM, SCSI UNMAP, or NVMe
| Deallocate command can cover whatever range on a device you
| feel like issuing it for. (The device, in turn, can clear
| all, none, or part of it.) On Linux, blkdiscard accepts the
| -o, --offset and -l, --length options (in bytes) that map
| more or less exactly to that. Making a(n unformatted)
| partition for the empty space and then trimming it is a valid
| workaround as well.
|
| But you're most probably doing this on a device with nothing
| valuable on it, so you should be able to just trim the whole
| thing and then allocate and format whatever piece of it that
| you are planning to use.
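|
| Concretely (a sketch; the device name and the 120 GiB figure are
| placeholders, and blkdiscard throws away whatever is stored in
| the range it is given; some util-linux versions ask for --force
| when the device looks in use):
|
|   blkdiscard /dev/sda                              # trim the entire device
|   blkdiscard --offset $((120 * 1024**3)) /dev/sda  # trim everything past 120 GiB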
| howbadisthat wrote:
| What happens if the remaining space is TRIMMED but routinely
| accessed? (for example by dd, read only)
| wtallis wrote:
| If a logical block address is not mapped to any physical
| flash memory addresses, then the SSD can return zeros for a
| read request immediately, without touching the flash.
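|
| You can see this from userspace (assuming /dev/sda2 is the
| trimmed, never-used dummy partition from above): reads of
| unmapped LBAs come back without touching the flash, and usually
| as zeros, though as noted below that last part isn't guaranteed
| on every drive:
|
|   dd if=/dev/sda2 of=/dev/null bs=1M count=1024   # typically very fast
|   hexdump -C -n 4096 /dev/sda2 | head             # usually all zeros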
| stavros wrote:
| Does the mapping happen on first write? Is TRIM then a
| command that signals the SSD to unmap that block?
| BeeOnRope wrote:
| Yes.
| Dwedit wrote:
| You read zeroes.
| BeeOnRope wrote:
| Not guaranteed by default for NVMe drives. There's an NVMe
| feature bit for "Read Zero After TRIM" which if set for a
| drive guarantees this behavior but many drives of interest
| (2024) do not set this.
| Dwedit wrote:
| Hmmm, when I quick-formatted a drive (which TRIMs the
| whole thing), then tried reading it back in a disk hex
| editor, I just saw zeroes.
| halifaxbeard wrote:
| :mind-blown:
|
| i knew about "preconditioning" for SSDs when it comes to
| benchmarking, etc. didn't realize this was the why.
|
| thanks!
| 1oooqooq wrote:
| SSD firmwares are a mistake. They saw how easy it is to sell
| crap, with non-ECC (i.e. bogus) RAM being sold as the default,
| and ran (pun intended) with it.
|
| So if under-provisioned, they now work as pSLC, giving you more
| data resilience in the short term but wasting more write cycles
| because they're technically writing 1111111 instead of 1, every
| time. If you fill them up, then they have less data resilience.
|
| And the best part: there's no way you can control any of it
| based on your needs.
| wtallis wrote:
| > giving you more data resilience in the short term but wasting
| more write cycles because they're technically writing 1111111
| instead of 1, every time.
|
| No, that's not how it works. SLC caches are used primarily
| for performance reasons, and they're faster precisely because
| they _aren't_ doing the equivalent of writing four ones (and
| especially not _seven_!?) to a QLC cell.
| wruza wrote:
| Technically they are writing (0,0,0,1) instead of (0.0625).
| hengheng wrote:
| How can I verify that things stay this way?
|
| Partitioning off a small section of the drive feels very 160 GB
| SCSI "Let's only use the outer sectors".
| kasabali wrote:
| Even keeping the drive always 75% empty would be enough, but
| partitioning off is the easiest way to make sure it never
| exceeds 25-33% full (assuming the drive behaves like that in
| the first place).
|
| To verify the drive uses all of itself as a cache, you
| can run a full-drive sequential write test (like the one in HD
| Tune Pro) and analyze the speed graph. If, say, a 480GB drive
| writes at full speed for the first 120GB, and then the write
| speed drops for the remaining 360GB, this means the drive is
| suitable for this kind of use.
|
| I think controllers might've been doing some GC jobs to
| always keep some amount of cells ready for pSLC use, but it
| should be a few GBs at most and shouldn't affect the use case
| depicted here.
| Dylan16807 wrote:
| > Partitioning off a small section of the drive feels very
| 160 GB SCSI "Let's only use the outer sectors".
|
| In that it was very reliable at accomplishing the goal?
| bayindirh wrote:
| Short stroking bigger disks for higher IOPS and storage
| speed was a _de-facto_ method in some HPC centers. Do this
| to a sufficiently large array and you can see unbelievable
| IOPS numbers _for that generation of hardware_.
| userbinator wrote:
| That is what an ideal FTL would do if only a fraction of the
| LBAs are accessed, but as you say some manufacturers will
| customise the firmware to do otherwise, while this mod
| basically guarantees that the whole space is used as SLC.
| justinclift wrote:
| It mentions the required tool being available from um...
| interesting places.
|
| Doing a Duck Duck Go search on the "SMI SM2259XT2 MPTool FIMN48
| V0304A FWV0303B0" string in the article shows this place has the
| tool for download:
|
| https://www.usbdev.ru/files/smi/sm2259xt2mptool/
|
| The screenshot in the article looks to be captured from that site
| even. ;)
|
| Naturally, be careful with anything downloaded from there.
| gaius_baltar wrote:
| There were several instances where I saw an interesting tool for
| manipulating SSDs and SD cards only available from strange
| Russian websites. This one at least has an English UI ... A lot
| of research seems concentrated there and I wonder why it did
| not catch the same level of interest in the west.
| justinclift wrote:
| Yeah. That site has a lot of info for a huge number of flash
| controllers/chipsets/etc.
|
| Wish I had a bunch of spare time to burn on stuff like this.
| :)
| fuzzfactor wrote:
| Good to see they are still available.
|
| The wide variety of controller/memory combinations makes it
| quite a moving target.
|
| This is the "mass production" software that makes it
| possible to provision, partition, format, and even place
| pre-arranged data or OS's in position before shipping
| freshly prepared drives to a bulk OEM customer. On one or
| more "identical" drives at the same time.
|
| For _USB flash thumb drives_ the same approach is used.
| Factory software like this which is capable of modifying
| the firmware of the device is unfortunately about the only
| good way to determine the page size and erase block size of
| a particular USB drive. If the logical sectors storing your
| information are not aligned with the physical memory blocks
| (which somewhat correspond to the "obsolete" CHS
| geometry), the USB key will be slower than necessary,
| especially on writes, due to write amplification, and it will
| also wear out much sooner.
|
| Care does not go into thumb drives like you would expect
| from SSDs; it seems like very often a single SKU will have
| untold variations in controller/memory chips. Also it seems
| likely that during the production discontinuities when the
| supply of one of these ICs on the BOM becomes depleted, it
| is substituted with a dissimilar enough chip that a
| revision of the partitioning, formatting, and data layout
| would be advised, but does not take place because nobody
| does it. And it still works anyway so nobody notices or
| cares. Or worse, it's recognized as an engineering
| downgrade but downplayed as if in denial. Wide variation in
| performance within a single SKU is a canary for this, which
| can sometimes be rectified.
| userbinator wrote:
| _and I wonder why it did not catch the same level of interest
| in the west._
|
| Because people in the west are too scared of IP laws.
| drtgh wrote:
| I was unable to find the source code, so it is important to be
| careful. For me it sounds like a leap of faith that I'm not
| willing to take (my apologies to the developers).
|
| In any case, this is a feature that manufacturers should
| provide. I wonder how it could be obtained.
| justinclift wrote:
| > I wonder how it could be obtained.
|
| Reverse engineering and a huge amount of free time.
| userbinator wrote:
| That _is_ the actual manufacturer 's tool.
| sampa wrote:
| oh that western superiority complex... hits once again...
| beside the mark
| justinclift wrote:
| > western superiority complex
|
| What are you on about?
| userbinator wrote:
| In countries where people have been less conditioned to be
| mindless sheep, you can more easily find lots of truth that
| doesn't toe the official line.
|
| Spreading xenophobic FUD only serves to make that point
| clearer: you can't argue with the facts, so you can only sow
| distrust.
| justinclift wrote:
| > Spreading xenophobic FUD
|
| ?
| aristus wrote:
| About ten years ago I got my hands on some of the last production
| FusionIO SLC cards for benchmarking. The software was an in-
| memory database that a customer wanted to use with expanded
| capacity. I literally just used the fusion cards as swap.
|
| After a few minutes of loading data, the kernel calmed down and
| it worked like a champ. Millions of transactions per second
| across billions of records, on a $500 computer... and a card that
| cost more than my car.
|
| Definitely wouldn't do it that way these days, but it was an
| impressive bit of kit.
| sargun wrote:
| I worked at a place where I can say FusionIO saved the
| company. We had a single Postgres database which powered a
| significant portion of the app. We tried to kick off a
| horizontal scaling project around it, to little success - turns
| out that partitioning is hard on a complex, older codebase.
|
| Somehow we ended up with a FusionIO card in tow. We went from
| something like 5,000 read QPS to 300k read QPS on pgbench
| using the cheapest 2TB card.
|
| Ever since then, I've always thought that reaching for vertical
| scale is more tenable than I originally thought. It turns out
| hardware can do a lot more than we think.
| hinkley wrote:
| The slightly better solution for these situations is to set
| up a reverse proxy that sends all GET requests to a read
| replica and the server with the real database gets all of the
| write traffic.
|
| But the tricky bit there is that you may need to set up the
| response to contain the results of the read that is triggered
| by a successful write. Otherwise you have to solve lag
| problems on the replica.
| hinkley wrote:
| In the nineties they used battery backed RAM that cost more
| than a new car for WAL data on databases that desperately
| needed to scale higher.
| linsomniac wrote:
| Back when the first Intel SSDs were coming out, I worked with
| an ISP that had an 8 drive 10K RAID-10 array for their mail
| server, but it kept teetering on the edge of not being able to
| handle the load (lots of small random IO).
|
| As an experiment, I sent them a 600GB Intel SSD in laptop drive
| form factor. They took down the secondary node, installed the
| SSD, and brought it back up. We let DRBD sync the arrays, and
| then failed the primary node over to this SSD node. I added the
| SSD to the logical volume, then did a "pvmove" to move the
| blocks from the 8 drive array to the SSD, and over the next few
| hours the load steadily dropped down to nothing.
|
| It was fun to replace 8x 3.5" 10K drives with something that
| fit comfortably in the palm of my hand.
| RA2lover wrote:
| Could this be used to extend the lifetime of an already worn-out
| SSD? I wonder if there's some business in China taking those and
| reflashing them as "new".
| dannyw wrote:
| Technically, QLC NAND that is no longer able to distinguish at
| QLC levels should certainly still be suitable as MLC for a
| while longer, and SLC, for all practical intents and purposes,
| forever.
| chasil wrote:
| The only rejuvenation process that I know of is heat: either
| long-period exposure to 250degC or short-term exposure at a
| higher temperature (800degC).
|
| https://m.hexus.net/tech/news/storage/48893-making-flash-mem...
|
| https://m.youtube.com/watch?v=H4waJBeENVQ
| justinclift wrote:
| So the trick is to somehow redirect all of the heat energy
| coming from cpus onto the storage, in bursts? :D
| kasabali wrote:
| We're getting closer; now they're putting heatsinks on controller
| chips of SSDs :D
| userbinator wrote:
| That first article was 12 years ago when MLC was the norm and
| had 10k endurance.
|
| _Macronix have known about the benefits of heating for a
| long time but previously used to bake NAND chips in an oven
| at around 250C for a few hours to anneal them - that's an
| expensive and inconvenient thing to do for electronic
| components!_
|
| I wonder if the e-waste recycling operations in China may be
| doing that to "refurbish" worn out NAND flash and resell it.
| They already do harvesting of ICs so it doesn't seem
| impossible... and maybe this effect was first noticed by
| someone heating the chips to desolder them.
| loeg wrote:
| DIWhy type stuff. Still, fun hack. TLC media has plenty of
| endurance. We see approximately 1.3-1.4x NAND write amplification
| in production workloads at ~35% fill rate with decent TRIMing.
| riobard wrote:
| Is it possible for SSD firmware to act "progressively" from SLC
| to MLC to TLC and to QLC (and maybe PLC in the future)? E.g. for
| a 1TB QLC SSD, it would act as SLC for usage under 256GB, then
| MLC under 512GB, then TLC under 768GB, and then QLC under 1TB
| (and PLC under 1280GB).
| wtallis wrote:
| It's theoretically possible, but in practice when a drive is
| getting close to full what makes sense is to compact data from
| the SLC cache into the densest configuration you're willing to
| allow, without any intermediate steps.
| hotstickyballs wrote:
| That's just a normal SSD rated at the QLC capacity.
| msarnoff wrote:
| I'd also recommend this if you're using eMMC in embedded devices.
| On a Linux system, you can use the `mmc` command from `mmc-utils`
| to configure your device in pSLC mode. It can also be done in
| U-Boot but the commands are a bit more obtuse. (It's one-time
| programmable, so once set it's irreversible.)
|
| In mass-production quantities, programming houses can
| preconfigure this and any other eMMC settings for you.
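|
| A hedged sketch of what that looks like with mmc-utils (device
| path and size are placeholders; check your part's datasheet and
| EXT_CSD first, because the change is one-time programmable):
|
|   mmc extcsd read /dev/mmcblk0        # inspect MAX_ENH_SIZE_MULT etc.
|   # mark the first 2 GiB of the user area as "enhanced" (pSLC);
|   # -y commits it permanently -- there is no undo
|   mmc enh_area set -y 0 2097152 /dev/mmcblk0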
| eternityforest wrote:
| That makes eMMC slightly less awful!
| userbinator wrote:
| What isn't prominently mentioned in the article is that endurance
| and retention are highly related --- flash cells wear out by
| becoming leakier with each cycle, and so the more cycles one goes
| through, the faster it'll lose its charge. The fact that SLC only
| requires distinguishing between two states instead of 16 for QLC
| means that the drive will also hold data for (much) longer in SLC
| mode for the same number of cycles.
|
| In other words, this mod doesn't only mean you get extreme
| endurance, but retention. This is usually specified by
| manufacturers as N years after M cycles; early SLC was rated for
| 10 years after 100K cycles, but this QLC might be 1 year after
| 900 cycles, or 1 year after 60K cycles in SLC mode; if you don't
| actually cycle the blocks that much, the retention will be much
| higher.
|
| I'm not sure if the firmware will still use the stronger ECC
| that's required for QLC vs. SLC even for SLC mode blocks, but if
| it does, that will also add to the reliability.
___________________________________________________________________
(page generated 2024-05-19 23:00 UTC)