Specifying the individual drives in each mirror pair for RAID10

Hello Rockstor community,

After repeated dissatisfaction with FreeNAS and ZFS I have decided to give a BTRFS implementation a try, and so far I am loving the responsiveness, ease of use, and visual design of the Rockstor UI. However, I have encountered something that concerns me about the way RAID10 is handled. As you can see below, I have twelve drives: six Seagate and six Western Digital. My thinking is that I am more likely to have two drives from the same manufacturer fail than one drive from each, so I had intended to configure each mirror to consist of one Seagate and one WD drive. That way I would be less likely to have the entire pool go down if more than one drive failed.


When creating a RAID10 pool in Rockstor there does not seem to be any way to specify which drives form each mirror pair; in fact the arrangement appears rather random.
Is there any way (even via the CLI) that I can specify the individual drives in each mirror pair for RAID10?

No, BTRFS does not allow you to specify which disks data goes onto. Chunks are allocated on a sort of round-robin scheme based on available capacity, using all of the disks in the array.

That means for, say, a 4-disk RAID 1 (disks 1, 2, 3, and 4), data A could be mirrored across disks 1 and 2, data B across disks 4 and 2, data C across disks 1 and 3, and so on. RAID 10 just adds striping across those mirrored copies.
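To illustrate, here is a rough Python sketch of that idea (my own simplified model, not the actual kernel allocator; the device names and sizes are made up): each chunk just goes to whichever devices currently have the most unallocated space, so the "pair" keeps changing and there is no way to pin a Seagate drive to a WD drive.

```python
# Simplified model of the btrfs chunk allocator for a RAID1 profile:
# each chunk is mirrored onto the two devices that currently have the
# most unallocated space.  Purely illustrative -- the device names and
# sizes below are made up, and the real allocator works differently in detail.

CHUNK_GIB = 1  # allocate in 1 GiB steps for simplicity

def allocate_raid1_chunk(free_space):
    """Pick the two devices with the most free space and place one
    copy of the chunk on each.  Returns the chosen pair."""
    a, b = sorted(free_space, key=free_space.get, reverse=True)[:2]
    free_space[a] -= CHUNK_GIB
    free_space[b] -= CHUNK_GIB
    return a, b

free = {"sda": 100, "sdb": 100, "sdc": 100, "sdd": 100}  # GiB free, hypothetical
for _ in range(6):
    print(allocate_raid1_chunk(free))
# The mirror pair drifts around the pool as space is consumed, which is
# why you cannot fix a Seagate/WD pairing ahead of time.
```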

I wouldn’t worry about two disks from the same manufacturer failing close together; unless you get a bad batch, they tend to fail more or less randomly.


Thanks for the quick reply.
Are you saying that RAID1 works differently from the traditional implementation, where a four-disk array would consist of one disk’s worth of data mirrored to the other three, whereas here there are only two copies of the data even with four disks? So with BTRFS, would two drives failing have a greater chance of data loss than with ZFS or hardware RAID, due to the data being distributed randomly across all drives in the pool? Perhaps my understanding of BTRFS is lacking, but it seems crazy that if a pool were 100% full and two drives were lost, there would be a 100% chance of some data loss.

That is correct: RAID is defined and implemented differently in BTRFS than in hardware or ZFS RAID. Each makes different trade-offs between capacity, performance, and reliability.

BTRFS:

RAID 1 = all data mirrored across any two drives (can handle one failure)
RAID 10 = all data mirrored across two devices and then striped across those mirrored pairs (can lose one disk)
RAID 5 = data striped across the drives with one drive's worth of parity, so usable space is (n-1) drives (can lose one disk)
RAID 6 = data striped across the drives with two drives' worth of parity, so usable space is (n-2) drives (can lose two disks)

HW:

RAID 1 = data from one disk is copied to all of the others
RAID 10 = mirrored pairs are striped (for more than 4 disks it depends on what the implementation can handle, but typically more mirrored pairs are added to the stripe)
RAID 5 = same
RAID 6 = same

In this case a btrfs RAID 1 array of 12 drives (using RAID 1 for ease of explanation) can survive one failure and has a usable capacity of half the total (if all drives are the same size).

A traditional HW RAID 1 of 12 drives could survive 11 failures but has the capacity of a single drive. A HW RAID 10 of 12 drives gives the same usable space as the btrfs array (half the total) but tolerates fewer failures than that 12-way mirror (it depends on the implementation, though). (Both are exceedingly strange choices for a 12-drive array.)
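To put rough numbers on that, here is a quick back-of-the-envelope comparison for twelve equal drives (the 4 TB size is just a made-up example):

```python
# Usable capacity and guaranteed failure tolerance for 12 equal drives
# under the btrfs profiles discussed above.  Drive size is hypothetical.

n, size_tb = 12, 4

profiles = {
    # profile: (usable TB,           guaranteed failures survived)
    "raid1":   (n * size_tb / 2,     1),  # two copies of everything
    "raid10":  (n * size_tb / 2,     1),  # two copies, striped
    "raid5":   ((n - 1) * size_tb,   1),  # one drive's worth of parity
    "raid6":   ((n - 2) * size_tb,   2),  # two drives' worth of parity
}

for name, (usable, failures) in profiles.items():
    print(f"{name:>6}: {usable:5.1f} TB usable, survives {failures} failure(s)")
```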

With 12 disks it is probably time either to split the array or to switch to RAID5/6.
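And on the earlier question about losing two drives at once: a rough combinatorial sketch (my own, again assuming 12 equal drives) of why a nearly full btrfs RAID1/10 pool will almost certainly lose some data with two dead drives, whereas a hardware RAID10 with fixed pairs only loses data when both failures land in the same pair:

```python
from itertools import combinations

n = 12
two_drive_failures = list(combinations(range(n), 2))  # every way two drives can die

# Hardware RAID10: six fixed mirror pairs (0,1), (2,3), ..., (10,11).
# Data is lost only when both failed drives belong to the same pair.
fixed_pairs = {(i, i + 1) for i in range(0, n, 2)}
hw_loss = sum(f in fixed_pairs for f in two_drive_failures) / len(two_drive_failures)

# btrfs RAID1/10 on a nearly full pool: chunks end up mirrored across
# practically every combination of two devices, so any two failures are
# very likely to share at least one chunk.
print(f"HW RAID10:    {hw_loss:.0%} of two-drive failures lose data")  # ~9%
print("btrfs RAID10: close to 100% once the pool is nearly full")
```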


So there is really no practical redundancy advantage of RAID1 over RAID5; both can only tolerate a single drive failure. It sounds like RAID6 is the only way to go for reasonable redundancy.
How about performance: would RAID10 still have greater IOPS and sequential speeds than RAID5/6?
I did some looking around the web for more information but came up short. I also seem not to be the only one unsure of BTRFS’s differences: https://www.reddit.com/r/linuxquestions/comments/3de7z4/can_i_put_my_trust_into_btrfs_yet/ct4mcee

Haha, yeah, btrfs does a few things differently from traditional implementations in order to give it the flexibility to change RAID levels on the fly, use different-sized disks, and so on.

Use RAID1 when the number of disks is low or when they have vastly different capacities (it helps maximize usable space; see the sketch after this list).
Use RAID10 with more than 4 disks and when IOPS and read/write speed are key.
Use RAID5/6 with more than 4/5 disks [they can be used with fewer if you really want] and when capacity and read speed are key (given enough disks they should have faster sequential reads than RAID10).
[Except for RAID1, all of them should have no issue saturating a gigabit port, though.]
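On the "helps maximize space" point for RAID1 with mismatched disks, here is the usual rule-of-thumb estimate (a sketch, not the exact allocator; the disk sizes are made up): every chunk needs copies on two different devices, so the biggest disk can never hold more data than all of the others combined.

```python
# Rule-of-thumb usable space for btrfs RAID1 (two copies of everything)
# with mixed-size disks: half the total, capped by the fact that the
# largest disk can only mirror against the combined size of the others.

def raid1_usable_tb(sizes_tb):
    total = sum(sizes_tb)
    return min(total / 2, total - max(sizes_tb))

# Hypothetical pools (sizes in TB):
print(raid1_usable_tb([4, 4, 4, 4]))  # -> 8 TB usable (plain half of the total)
print(raid1_usable_tb([6, 4, 2]))     # -> 6 TB (still half; the 6 TB pairs with the 4 + 2)
print(raid1_usable_tb([8, 1, 1]))     # -> 2 TB (the 8 TB disk is mostly wasted)
```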

I don’t have a handy link detailing all of the differences between them; maybe someone else here does.

I probably should have mentioned that I am running 10GbE, so my aim was to get speeds as close to that as possible. While the important data is backed up on something else, I would like a bit more redundancy than BTRFS can give with RAID10, so it sounds like RAID6 will be the best option in my situation.
Thanks very much, Spectre, for your help on this.

I wonder if a moderator could please move this thread to the BTRFS category.

  1. It might also be possible to convert RAID6 (2 drives of parity) to something with 3 or more parity drives later on. There is a patch, but it seems like this feature will not be integrated into the official branch. =/

  2. I wouldn’t be concerned about drives from the same manufacturer failing at around the same time. There are reports from big data centers saying that drive failures happen rather randomly, even when there has been a firmware issue like the one that caused the LCC (load cycle count) to shoot up in no time.

Tagged as BTRFS as requested.

@Bayonet Welcome to the Rockstor community.