I’m planning on building a new home server and was thinking about the possibility to use disc spanning to create matching disk sizes for a RAID array. I have 2x2TB drives and 4x4TB drives.

Comparison with RAID 5

4 x 4 TB drives

  • 1 RAID array
  • 12 TB total

4 x 4 TB drives & 2 x 2 TB drives

  • 2 RAID arrays
  • 14 TB total

5 x 4* TB drives

  • Several 4TB disks and 2 smaller disks spanned to produce a 4 TB block device
  • 16 TB total

I’m not actually planning on actually doing this because this setup will probably have all kinds of problems, however I do wonder, what would those problems be?

  • lemmyvore@feddit.nl
    link
    fedilink
    English
    arrow-up
    3
    ·
    5 months ago

    Typical problems with parity arrays are:

    • They suffer from something called “write hole”. If power fails while information is being written to the array, different drives can end up with conflicting versions of the information and no way to reconcile it. The software solution is to use ZFS, but ZFS has a pretty steep learning curve and is not easy to manage. The hardware solution is to make sure power to the array never fails, by using either an UPS to the machine or connecting the drives through a PCI card with a battery, which allows them to always finish write operations even without power.
    • Making up a 4 TB out of 2x2 TB is not a good idea, you’re basically doubling the failure probability of that particular “4 TB” drive.
    • Parity arrays usually require drives to be all the same size. Meaning that if you want to upgrade your array you need to buy as many drives before you can take advantage of the increased space. There are parity schemes like Unraid that work around this by using only one large parity drive that computes parities across all the others regardless of their sizes; but Unraid is proprietary and requires a paid subscription.
    • If a drive fails, rebuilding the array after replacing that drive requires an intensive pass through all the surviving members of the array. This can greatly increase the risk of another drive failing. A RAID5 array would be lost if that occured. That’s why people usually recommend RAID6, but RAID6 only makes sense with 5+ drives.

    Unrelated to parity:

    • Using a lot of small drives is very power-intensive and inefficient.
    • Whenever designing arrays you have to consider what you’ll do in case of drive failure. Do you have a replacement on hand? Will you go out and buy another drive? How long will it take for it to reach you?
    • What about backups?
    • How much of your data is really essential and should be preserved at all costs?