Definition of RAID types in Rockstor/Btrfs

I’m new to Rockstor. I’m considering it for a new NAS build at home. I’m doing some reading up on it before I install it.

I’m familiar with standard RAID terms and definitions. What I wasn’t aware of (until today) is that some of these terms have been given new definitions in Btrfs, and hence Rockstor. As I set about to try to determine, definitively, what those definitions are, I found conflicting information in various posts online about Rockstor and Btrfs and in the Rockstor documentation itself. This could have been due to changes over time or it could have been due to misunderstandings by the authors.

Is there one, up-to-date, definitive source for how the Rockstor/Btrfs RAID terms are defined? In particular, I’m looking at information that will explain how the data is stored on the drive arrays and how many disk failures result in data loss. I’m aware of the write hole issue in Btrfs, but I’m not talking about that level of detail (failure of the file system in various power outage or non-disk failure conditions).

It might not be in the same place, but if someone can point me to the up-to-date information on how to recover from drive failures with Rockstor, that would be helpful as well. I’m not asking for anyone to write a post explaining these things. I just want a pointer to the correct documentation that I should refer to.

As an example of what I’m talking about, there are some sources that state that a Rockstor/Btrfs RAID10 array can only sustain a single drive failure and that a second drive failure will result in data loss (regardless of the number of disks in the array). Other sources state that it works the same way as traditional RAID10, where there is no loss with a single disk failure, but the probability of loss after a second disk failure is 1/N.
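
To put rough numbers on the gap between those two readings, here’s a quick back-of-the-envelope sketch I put together (purely my own illustration of the two interpretations as I understand them, not taken from any documentation; the chunk-level figure is a worst-case assumption):

```python
# Purely illustrative (my own sketch, not from any Rockstor/btrfs source):
# chance that a *second* random disk failure loses data on an N-disk "RAID10",
# under the two interpretations described above.

def traditional_raid10(n_disks: int) -> float:
    # Fixed mirror pairs: data is lost only if the second failed disk happens
    # to be the partner of the first, i.e. 1 of the remaining n_disks - 1
    # disks (roughly 1/N for larger arrays).
    return 1 / (n_disks - 1)

def chunk_level_two_copy(n_disks: int) -> float:
    # Worst-case reading of block/chunk-level mirroring: every surviving disk
    # shares some chunk with the failed one, so any second failure loses data.
    return 1.0

for n in (4, 6, 8, 10):
    print(f"{n} disks: traditional ~{traditional_raid10(n):.0%}, "
          f"chunk-level (assumed worst case) {chunk_level_two_copy(n):.0%}")
```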

Again, I’m not looking for a post that explains this. I’m looking for documentation, so I can read up on it completely, from a definitive source.

I’ll be sure to ask clarifying questions after I read it, to be sure I understand it fully.

Thanks.


@xdavidx Welcome to the Rockstor community.

Yes, we do have some anomalies in our own documentation, likely for both of the reasons you cite, and we have at least the following doc issue on GitHub to address this:

It states the following:

“Also note that there is the indication that raid10 can sustain a 2 drive failure; although this is academically possible, it is extremely unlikely given the block/extent level of raid 1 (mirror) rather than the device level. This should be corrected to reflect the practical limitation along with an explanation.”

That issue links to a number of other doc issues in turn. We have some work to do here, but in essence btrfs does the raid ‘of old’ at a block/extent (chunk) level, not a drive level, and this changes the definitions slightly. Also, raid1 of old was, in some circles, considered to have the same number of copies as there are disks in the pool (a volume in btrfs speak). But btrfs currently only keeps 2 copies (on different devices), irrespective of the number of pool members. Note however that there are now newer profiles (raid1c3 and raid1c4), as yet unsupported by Rockstor, that keep 3 and 4 copies respectively.
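
To make that concrete, here is a tiny toy model (my own sketch; btrfs actually allocates chunks by most free space, not randomly, but the consequence is the same): with only 2 copies per chunk spread across the whole pool, virtually every pair of devices ends up sharing some chunk, so losing any 2 devices can destroy data no matter how many members the pool has.

```python
# Toy model only (my own sketch; btrfs really allocates by most free space,
# not randomly): place 2 copies of each "chunk" on 2 distinct devices, then
# count how many 2-device losses would leave some chunk with no surviving copy.
from itertools import combinations
import random

def allocate_chunks(n_devices: int, n_chunks: int, seed: int = 42):
    rng = random.Random(seed)
    return [tuple(sorted(rng.sample(range(n_devices), 2))) for _ in range(n_chunks)]

def fatal_pairs(chunks, n_devices: int):
    # A device pair is "fatal" if some chunk had both of its copies on those devices.
    placed = set(chunks)
    return [pair for pair in combinations(range(n_devices), 2) if pair in placed]

chunks = allocate_chunks(n_devices=6, n_chunks=200)
total = len(list(combinations(range(6), 2)))
print(f"{len(fatal_pairs(chunks, 6))} of {total} possible two-device failures are fatal")
# Device-level mirroring (3 fixed pairs of 2 disks) would make only 3 of those 15 fatal.
```

With enough chunks the answer is effectively ‘all of them’, which is why btrfs raid1/raid10 should, in practice, be treated as tolerating a single device failure.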

Anyway, you specifically asked for authoritative sources rather than an explanation here, but I thought I’d slip the above in for others coming across the thread, as that difference helps to explain a lot, I find.

The authoritative sources, in my opinion, are:

The btrfs wiki:

btrfs Wiki
Like Rockstor’s documentation, this is often awaiting community maintenance, though Rockstor’s founder made small but important contributions there early on; for example, the minimum number of devices for a parity raid (5/6) setup.

We have a reference to this wiki in our docs thus:

Please see the btrfs wiki for up to date information on all btrfs matters.

For a BTRFS features stability status overview, including redundancy profiles, visit the btrfs wiki status page.

With ‘up to date’ being a matter of opinion really, as it applies differently to different distros, e.g. Ubuntu vs Debian vs Tumbleweed etc.

The linux-btrfs mailing list:

There are various ways to peruse/digest this, however, so here is a canonical reference from the wiki to get you started.
https://btrfs.wiki.kernel.org/index.php/Btrfs_mailing_list

Questions like yours would be best addressed to the btrfs mailing list, as that is the definitive source of truth, especially given that the truth is a moving target depending on the kernel version and what back-ports are added to the distro kernel. In our case, our non-legacy variant is ‘Built on openSUSE’, currently Leap 15.2.

We here have working knowledge of such things, and specifically of how we treat ‘stuff’, but in essence we simply inherit whatever filesystem technology comes down the line from our donor OS. We make no modifications whatsoever to this base technology on the filesystem/kernel front, though we do restrict some features by not surfacing them, in an attempt to enhance simplicity/usability.

Hope that helps.
