[HN Gopher] Raid-Z Expansion Feature for ZFS Goes Live
       ___________________________________________________________________
        
       Raid-Z Expansion Feature for ZFS Goes Live
        
       Author : chungy
       Score  : 67 points
       Date   : 2022-02-08 09:16 UTC (13 hours ago)
        
 (HTM) web link (freebsdfoundation.org)
 (TXT) w3m dump (freebsdfoundation.org)
        
       | aniou wrote:
        | I'm sorry to say it, but this article is not entirely accurate -
        | the illustration "how does traditional raid 4/5/6 do it?" shows
        | ONLY RAID 4. There is a big difference between RAID 4 and RAID
        | 5/6, and the former was abandoned years (decades?) ago in favor
        | of RAID 5 and - later - 6.
        | 
        | Of course, it gives "better publicity" for RAID-Z, but that is a
        | marketing trick rather than engineering.
       | 
       | See https://en.wikipedia.org/wiki/Standard_RAID_levels
        
         | mrighele wrote:
         | Note that the article talks about the way the array is
         | expanded, not how the specific level works.
         | 
          | In other words, what they are saying is that the traditional
          | way to expand an array is essentially to rewrite the whole
          | array from scratch. So if the old array has three stripes,
          | with blocks [1,2,3,p1] [4,5,6,p2] and [7,8,9,p3] (with p1, p2
          | and p3 being the parity blocks), the new array will have
          | stripes [1,2,3,4,p1'], [5,6,7,8,p2'] and [9,x,x,x,p3'], i.e.
          | it not only has to move the blocks around, but also has to
          | recompute essentially all the parity blocks.
         | 
         |  _IF_ I understand the ZFS approach correctly, the existing
         | blocks are not restructured but only reshuffled, so the new
         | layout will be logically still [1,2,3,p1] [4,5,6,p2] and
         | [7,8,9,p3] but distributed on five disks so [1,2,3,p1,4]
         | [5,6,p2,7,8], [9,p3,x,x,x]
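          | 
          | To make that concrete, here is a toy sketch of the two
          | strategies (hypothetical Python, XOR parity only, nothing like
          | the real ZFS code):
          | 
          |   from functools import reduce
          | 
          |   def parity(blocks):
          |       # toy single parity: XOR of all data blocks in a stripe
          |       return reduce(lambda a, b: a ^ b, blocks)
          | 
          |   def traditional_expand(data, new_width):
          |       # re-stripe from scratch: every stripe and its parity
          |       # is rebuilt for the new width
          |       d = new_width - 1
          |       return [data[i:i + d] + [parity(data[i:i + d])]
          |               for i in range(0, len(data), d)]
          | 
          |   def raidz_style_expand(old_stripes, new_width):
          |       # keep old stripes (and their parity) intact; just
          |       # re-flow the same sectors row by row over more disks
          |       s = [x for stripe in old_stripes for x in stripe]
          |       return [s[i:i + new_width]
          |               for i in range(0, len(s), new_width)]
          | 
          |   data = list(range(1, 10))             # blocks 1..9
          |   old = traditional_expand(data, 4)     # [1,2,3,p1] ...
          |   print(traditional_expand(data, 5))    # all parity redone
          |   print(raidz_style_expand(old, 5))     # [1,2,3,p1,4] ...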
         | 
          | It seems that this means less work while expanding, but some
          | space is lost unless one manually copies old data to a new
          | place.
         | 
          |  _IF_ I got it right, I am not sure who the intended audience
          | for this feature is: enterprise users will probably not use
          | it, and power users would probably benefit from getting all
          | the space they could get from the extra disk.
        
       | genpfault wrote:
       | So is this feature FreeBSD-only? Or will it be integrated into
       | OpenZFS at some point?
        
         | chungy wrote:
         | It is an experimental feature on FreeBSD 14-CURRENT. It will be
         | merged into OpenZFS eventually (and maybe backported to FreeBSD
         | 13-STABLE and whatever new point releases happen).
        
       | mnd999 wrote:
        | Since the Linux crowd moved in, ZFS development seems to have
        | gone from stability to feature, feature, feature. I'm starting
        | to get a bit concerned that this isn't going to end well. I
        | really hope I'm wrong.
        
         | nightfly wrote:
         | These feature feature features are ones people have been asking
         | for _years_.
        
         | dsr_ wrote:
         | This feature is being developed in FreeBSD, and will become
         | part of the general ZFSonLinux set.
        
         | p_l wrote:
          | Nothing really changed in development. The ZFSonLinux team was
          | actually one of the more conservative ones in terms of data
          | safety; what changed is that a bunch of things that were
          | _really_ long in the works coincided in reaching maturity.
          | 
          | If you want "feature chasing", FreeBSD ZFS TRIM is the ur-
          | example. I've read that code end to end... and I'll leave it
          | at that.
        
       | mrighele wrote:
        | For those who like videos more than text, there is a YouTube
        | video from last year [1] that explains the feature (unless it
        | has changed since, but that doesn't seem to be the case).
        | 
        | One downside that I see with this approach, if I understand it
        | correctly, is that the data already present on disk will not
        | take advantage of the extra disk per stripe. For example, if I
        | have a raidz of 4 disks (so 25% of space "wasted") and add
        | another disk, new data will be distributed over 5 disks (so 20%
        | of space "wasted"), but the old data will keep using stripes of
        | 4 blocks; they will just be reshuffled between the disks. Do I
        | understand it correctly?
       | 
       | [1] https://www.youtube.com/watch?v=yF2KgQGmUic
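        | 
        | Back-of-the-envelope math for the overhead (a rough sketch,
        | assuming single-parity raidz and full-width stripes):
        | 
        |   def parity_overhead(width, nparity=1):
        |       # fraction of raw capacity spent on parity per stripe
        |       return nparity / width
        | 
        |   print(parity_overhead(4))  # 0.25 -> data written pre-expansion
        |   print(parity_overhead(5))  # 0.20 -> data written post-expansion
        |   # Old blocks keep the 3+1 geometry until they are rewritten
        |   # (e.g. by copying the files or zfs send/recv into a new
        |   # dataset).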
        
       | vanillax wrote:
        | ZFS is great until it's not. When you lose a zpool or vdev it's
        | unrecoverable, and it's pretty crappy when that happens. Check
        | out how Linus Tech Tips lost everything:
       | https://www.youtube.com/watch?v=Npu7jkJk5nM
        
         | 7steps2much wrote:
         | To be fair, they didn't really understand how ZFS works and
         | failed to set up bitrot detection.
         | 
         | A "clean" setup would include those, as well as either a
         | messaging system or a regular checkup on how your FS is doing.
        
         | Youden wrote:
          | You can say this about literally any storage system.
          | Unrecoverable failures can always happen; that's why you keep
          | backups.
         | 
         | ZFS redundancy features aren't there to eliminate the need for
         | backups, they're there to reduce the chance of downtime.
        
       | reincarnate0x14 wrote:
        | This is great; there has been demand for this since forever.
        | Enterprise-y people generally didn't care much, but homelab/SMB
        | users end up dealing with it a lot more than might be naively
        | imagined.
       | 
        | Always reminds me of when NetApp used to do their arrays in
        | RAID-4 because it made expansion super-fast: just add a new
        | zeroed disk, and on writes only the new disk's blocks plus the
        | parity drive had to be updated. It used to blow our Netware
        | admin's mind, as almost nobody else ever used RAID-4 -- I had it
        | as an interview question along with "what is virtual memory"
        | because you'd get interesting answers :)
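        | 
        | The zeroed-disk trick works because XOR parity is unchanged by
        | an all-zero column; a quick toy sketch (not NetApp's actual
        | code, obviously):
        | 
        |   from functools import reduce
        | 
        |   def parity(row):
        |       return reduce(lambda a, b: a ^ b, row)
        | 
        |   row = [0x11, 0x22, 0x33]      # one row across 3 data disks
        |   p = parity(row)               # dedicated RAID-4 parity disk
        | 
        |   # a freshly zeroed data disk contributes 0 to the XOR, so the
        |   # existing parity stays valid and no re-striping is needed
        |   assert parity(row + [0x00]) == p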
        
       | uniqueuid wrote:
        | This is great, but an important and little-known caveat is that
        | a raidz vdev is limited to roughly the IOPS of one disk. So a
        | grown raidz will at some point have lots of throughput but
        | suffer on small, random reads and writes. At that point, it is
        | better to grow the pool with an additional, separate raidz vdev.
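        | 
        | Rough rule of thumb (made-up numbers, ignoring caching and
        | sequential workloads):
        | 
        |   def pool_random_iops(raidz_vdevs, disk_iops=150):
        |       # each raidz vdev delivers roughly the random-read IOPS
        |       # of a single member disk, however wide the vdev is
        |       return raidz_vdevs * disk_iops
        | 
        |   print(pool_random_iops(1))  # one 12-wide raidz2  -> ~150
        |   print(pool_random_iops(3))  # three 4-wide raidz1 -> ~450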
        
         | Osiris wrote:
          | So you set up two raidz vdevs and then stripe across them for
          | increased performance?
        
       | FullyFunctional wrote:
       | I certainly wanted this. I even heckled Bill Moore about it.
       | Having gone through the expansion the old way (replace each drive
       | one at a time with a larger one), this looks a lot simpler.
        | Unfortunately, it appears not to work with simple mirrors and
        | stripes (~ RAID10), so it will make no difference for me.
        | (Drives are cheap but performance is not -> RAID10.)
        
         | chungy wrote:
         | Simple mirrors and stripes could always be expanded (and
         | reduced, too). RAID-Z has been special.
        
           | [deleted]
        
         | arwineap wrote:
          | This looks different from the old way.
          | 
          | The old way (as you referenced) was to replace each disk one
          | by one with a larger one.
          | 
          | If I'm understanding this right (and please correct me), this
          | feature will allow you to add a 5th disk to a 4-disk raidz.
          | 
          | And if I'm right about that, then this feature wouldn't really
          | make sense for RAID10 anyway.
        
           | FullyFunctional wrote:
            | I love ZFS, but this is something that just works in btrfs;
            | a mirror just means all blocks live in two physical
            | locations, and you can certainly do that even with an odd
            | number of drives. ZFS, however, is more rigid and doesn't
            | allow flowing blocks around like this, nor dynamic
            | defragmentation.
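            | 
            | A toy illustration of what block-level mirroring over an odd
            | number of drives looks like (made-up allocator, just to show
            | the idea):
            | 
            |   from itertools import combinations, cycle
            | 
            |   # every block gets two copies on two different devices,
            |   # rotating through device pairs so three drives fill
            |   # roughly evenly
            |   devices = ["sda", "sdb", "sdc"]
            |   pairs = cycle(combinations(devices, 2))
            |   placement = {block: next(pairs) for block in range(6)}
            |   for block, devs in placement.items():
            |       print(block, devs)  # 0 -> (sda, sdb), ...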
        
             | deagle50 wrote:
             | Would you use raid 5/6 in btrfs?
        
               | NavinF wrote:
               | The btrfs raid5/6 write hole is still around, if anyone's
               | wondering. Though it was only recently that btrfs started
                | warning users that it would eat their data:
                | https://www.phoronix.com/scan.php?page=news_item&px=Btrfs-Wa...
        
             | NavinF wrote:
             | What are you on about? The submission is about raidz.
             | 
             | Adding drives to a mirror has worked in zfs since
             | prehistoric times. "zpool attach test_pool sda sdc" will
             | mirror sda to sdc. If sda was already mirrored with sdb,
             | you now have a triple-mirror with sda, sdb, and sdc.
        
       | 2OEH8eoCRo0 wrote:
       | ZFS is a fad.
        
         | sleepycatgirl wrote:
          | Nah, ZFS is a pretty comfy FS. It has lots of nice features,
          | it is reasonably fast, and it is stable. And as far as I know,
          | it has been used for a fairly long time.
        
       | ggm wrote:
        | Lots of people have wanted this for ages. I managed to cope
        | with replacing spindles and resizing into the new space (larger
        | spindles), but being able to add more discrete devices and get
        | more parity coverage and more space (I may be incorrectly
        | assuming you get better redundancy as well) is great.
        
         | rincebrain wrote:
         | This trick cannot be used to turn an N-disk raidzP into an [any
         | number]-disk raidzP+1, as far as I understand.
        
         | [deleted]
        
       ___________________________________________________________________
       (page generated 2022-02-08 23:00 UTC)