Raid5/6 stability

Hello,
There are some patches incoming for BTRFS. I hope they will arrive in the next kernel/btrfs-tools releases, but at this time raid5 should not be used.
BTRFS raid1 is still very flexible and “stable” to use. I like that I can easily upgrade my storage with btrfs by just adding a larger drive and then removing the smaller one (mixing sizes); see the sketch below.
If you really want raid5 then you should keep an eye open for the next kernel releases and the BTRFS wiki.
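For reference, this is roughly the add-then-remove dance I mean; the device names and the mount point are just placeholders for whatever your setup uses:

```
# Add the new, larger drive to the existing pool
btrfs device add /dev/sdX /mnt/pool

# Remove the smaller drive; btrfs moves its data onto the remaining devices
btrfs device delete /dev/sdY /mnt/pool

# Optionally rebalance so the data is spread evenly again
btrfs balance start /mnt/pool

# Check the result
btrfs filesystem usage /mnt/pool
```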

Running 3x4TB on Raid5 for 2 years without problems so far, but I’m looking forward to any upcoming fixes, like the ones coming with the next 3.9.1 update.

I used to use btrfs with RAID5. I lost multiple TBs of data and I have since moved on. RAID5/6 just can’t be trusted on btrfs, and the tools just aren’t there either. I needed RAID5/6 because I wanted to grow my array a disk at a time and use different-sized disks.

Over the past few months I’ve used a couple of solutions: a Windows server using Stablebit DrivePool + SnapRAID, and Windows Storage Spaces. I would have stayed with the DrivePool + SnapRAID solution, but my SnapRAID parity started getting errors that could be fixed but would just pop up again the next time parity was updated. Windows Storage Spaces on Server 2016 worked great, but the performance just wasn’t there in parity mode, even with a cache drive. I could never fully saturate a gigabit connection with reads or writes.

As of this post, I am currently transferring 16TB of data to an UnRAID array. If this doesn’t work out, my last option is mdadm while I hope btrfs RAID5/6 gets fixed, since Rockstor itself was great but limited by the filesystem.

I think you have misunderstood BTRFS raid1.
You can mix different-sized disks and expand one disk at a time.
BTRFS raid1 is not like the traditional RAID1 that you find in, for example, ZFS, hardware RAID, or Linux software RAID (MD).

BTRFS will just make sure you have 1 extra copy of the data.

I use 1x3TB and 2x2TB disks in a single btrfs raid1. It works great for me. Later on I will switch out one of the 2TB disks for a 4TB disk.
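To illustrate what I mean (device names and the mount point are made up, not my actual setup): a raid1 pool created straight from mixed-size disks, plus the command that shows how much usable space you end up with:

```
# Create a raid1 pool (data and metadata) from one 3TB and two 2TB disks
mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc /dev/sdd
mount /dev/sdb /mnt/pool

# Per-device allocation and an estimate of the free space left
btrfs filesystem usage /mnt/pool
```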

I specifically said that I did not want RAID1 and I don’t think the topic starter even mentioned RAID1.

You said “I needed RAID5/6 because I wanted to grow my array a disk at a time and use different-sized disks”.
All I was pointing out is that RAID1 in BTRFS can do this. RAID5/6 is not needed to grow one disk at a time.

I have been trying to follow the RAID5/6 developments on the BTRFS mailing list.

Progress seems to have been made, especially in the scrub and disk repair functions.

There seem to have been some rather serious bugs in the code that are now ironed out.

And the later developments, where they have added functions that can purposely inject errors into the RAID array, will also make it easier to find bugs and correct them (when you know which errors you have created, you can see if they are handled properly).

But from what I can read, the code, although significantly improved, is not tested and verified to run reliably. This is a difficult situation, as the functions need to be used in order to find the bugs and fix them.

I for one ran RAID5/6 for some time (over a year, light usage, mostly as media storage) and never experienced data loss.
But I also never had a hardware failure that prompted any kind of repairs or hardware replacement.
At the time I ran RAID6, maintenance jobs like scrubbing were so very, very slow that it prevented me from actually doing them.
In my current RAID1 setup, a scrub of my entire pool is a 4-5 hour job. Under RAID6 it was a 3-4 day job, with all the drives working the whole time.
There have been great improvements to the scrubbing code since then (as far as I can see), so this might not be the case any longer.
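For anyone new to it, the scrub itself is just this (the mount point is a placeholder); it was the runtime that was the problem, not the commands:

```
# Start a scrub in the background and check on it later
btrfs scrub start /mnt/pool
btrfs scrub status /mnt/pool

# Or run it in the foreground and get a summary when it finishes
btrfs scrub start -B /mnt/pool
```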

I would not recommend using RAID5/6 yet. Not before some testing has been done to actually show that it does run and is able to recover from errors. The new code for introducing errors and testing the code that fixes them should be helpful, and I’m keeping an eye out for developments.

I too want to run my array in RAID6, but for now it is RAID1, and it’ll stay that way until RAID5/6 is proven to be more reliable.

Thanks for the replies. Seems I’m out of luck for now and need to buy two more disks to create a second raid6 while waiting for the raid5/6 recovery code to be fixed in btrfs.

Regarding the long scrub/balance times for RAID setups in BTRFS: there is a limitation (not sure if it applies only to RAID 5/6) where the scrub/balance job will only use one core. So if you have a fancy CPU with many cores, it will only use one of them, usually at 100%, and therefore the scrub/balance takes a long time to finish.

It seems a fix to make it use all cores is still only in the planning stage.
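If you want to check whether that is what you are hitting, something like this should show it while a scrub is running (the mount point is a placeholder):

```
# Per-device scrub progress and transfer rate
btrfs scrub status -d /mnt/pool

# Per-thread CPU view; look for a single btrfs/kworker thread stuck near 100%
top -H
```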

FWIW the pull request for kernel 4.12 contains multiple RAID5/6 fixes:

https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg63365.html

That could be part of the explanation :slight_smile:

But I remember doing a lot of research into why it was so slow, and one thing I found was that my CPU was almost completely idle much of the time; it was not pegging one of the cores at 100%, and the average load was 2-3% much of the time.
My CPU is not that fancy, an old quad core, but it does the job. And with a single thread at 100% I would see 25% usage.
It would peak to 25% for periods and then go idle for longer periods, with constant disk activity on all disks.

The slow scrub is also due to other factors, of that I’m quite sure, and I hope that some of the fixes being introduced will help.

I had a 12-drive RAID 6, and while it was removing a disk the kernel panicked…

Long story short… when it came back up, I had lost everything on that array (60TB-ish).

Totally unmountable, unrepairable, absolutely nothing.

Yes, that’s similar to the things I have read about the failure handling code in btrfs raid5/6 :confused:

hope you had a backup :smiley:

Nope, I didn’t…

It was media downloads/rips, so it could all be recreated… the cost of backing up that amount of data wouldn’t have been worth it.

My personal photos / documents etc. are all backed up in multiple locations and to the cloud.

I had a similar experience.

Right at the start of my Rockstor usage, I was experimenting a lot with it to see if it should replace my Netgear NAS.

During the addition of a disk (RAID6), and the following balance, the system crashed and the pool wouldn’t mount.

I was in luck though, and following some how-tos on the subject actually allowed me to fix it, so it came back to working order. As I understand it, this was one of the rare instances of a recoverable RAID5/6 array.
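For what it’s worth, the how-tos I followed boiled down to roughly the steps below (heavily simplified, device name and mount point are placeholders, and I would only try --repair as a very last resort after copying off whatever still mounts read-only):

```
# First try to get the pool mounted read-only/degraded and copy data off
mount -o ro,degraded /dev/sdb /mnt/pool

# If a broken log tree is what blocks the mount, clearing it can help
# (run against the unmounted filesystem)
btrfs rescue zero-log /dev/sdb

# Read-only check first; --repair can make things worse, so last resort only
btrfs check /dev/sdb
btrfs check --repair /dev/sdb
```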

But later on, when I had decided to use Rockstor and RAID6, some horror stories emerged, and when the developers started to discourage the use of RAID5/6, I decided to convert to RAID1.
I’m very much still hoping that RAID6 will be “OK to go” soon, as I really want to use it.
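The conversion itself was just a balance with convert filters (the mount point is a placeholder); it runs online, it just takes a while:

```
# Convert both data and metadata profiles to raid1
btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/pool

# Progress can be checked from another shell
btrfs balance status /mnt/pool
```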

When the new kernel is implemented in Rockstor, I’m thinking about building a small setup with some spare parts and old disks I have lying around. That setup would run RAID6, and I could simulate different errors (missing drives, replacing drives, adding, removing, that sort of thing). Let’s see if I ever get around to it :slight_smile:
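If I do get around to it, I would probably start with loop devices instead of real disks, just to have a throwaway RAID6 pool to break on purpose; something along these lines (all names and sizes made up):

```
# Sparse files as fake disks
for i in 1 2 3 4; do truncate -s 4G /tmp/disk$i.img; done
for i in 1 2 3 4; do losetup -f /tmp/disk$i.img; done

# Assuming they landed on loop0..loop3 (check with `losetup -a`)
mkfs.btrfs -d raid6 -m raid6 /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3
mount /dev/loop0 /mnt/test

# Simulate a dead disk: unmount, detach one loop device, remount degraded
umount /mnt/test
losetup -d /dev/loop2
mount -o degraded /dev/loop0 /mnt/test
```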

I think this is the thread I found when digging around the web to figure out why it was only using one of my cores:
https://www.spinics.net/lists/linux-btrfs/msg51380.html

As I’m following the btrfs kernel list on a daily basis, I can say this: don’t trust raid5/6! And you shouldn’t for some future kernel releases either. Why? Because there are still serious bugs that aren’t even located yet! Some bugs are located and fixed but NOT in for-next or mainline, and only a few are located, fixed, and accepted for 4.11 and/or for-next.

Liu Bo (Oracle) and Qu Wenruo (Fujitsu) deserve most of the mention here, imho; they are putting a lot of effort into making RAID5/6 stable. With 4.11 we will have a working scrub and detection (and auto-correction, as I remember) of damaged data on file open.

A major problem I have been waiting 1.5 years to see fixed is a working offline scrub (aka “fsck”), because I lost 12TB while doing a disk replace: the new disk died during the replace, the famous “can’t open_ctree” occurred, and the fs was gone. Qu has done much work here and has an offline scrubbing function in his tree, which hasn’t been reviewed for some months now. There isn’t much interest from the devs to bring this into the kernel atm. I expect it maybe in 4.14, so don’t hold your breath. Nobody knows what errors will get detected once we can actually recover the fs… I don’t think this thing gets stable this year or the next.
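For context, the replace itself is a one-liner (device names and mount point are placeholders); the problem is what happens when the target dies halfway through, because there is no reliable offline repair to fall back on:

```
# Online replace of a failing disk; progress can be watched from another shell
btrfs replace start /dev/failing_disk /dev/new_disk /mnt/pool
btrfs replace status /mnt/pool

# The existing offline check is read-only by default and can't fix this case
btrfs check /dev/surviving_disk
```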

There is still a discussion about marking raid5/6 as unstable and telling the user so at creation time, which shows the unstable state of the whole code.

But to be positive: there is progress (see and test 4.11)! There are some devs working hard on fixing the problems, but often it’s hard to find and fix them. Feedback is always welcome! Just join the kernel list and post your problem with the latest mainline. Since Rockstor is using an elrepo kernel, it’s most of the time on the current stable. Bug reports here will also help!

Personally I’m using a Rockstor NAS as backup storage at work and on a home NAS. But on productive systems I’m running ZFS, which is not as flexible as BTRFS but rock stable. Sadly I won’t recommend ZFS for consumer usage, even when using NAS4Free, because it’s very complex and a lot of work on the CLI has to be done to tailor it to your needs. It also needs a huge amount of resources, which results in higher hardware and energy costs.

I’m continuing to go with ZFS as well. Sadly it’s not as flexible, as you already pointed out, so I’ll have to create another zpool instead of extending the existing one. But I have bought a 24-bay chassis, so when the time comes I can pop in some new drives, migrate the data, and expand the btrfs raid with the “old” drives to full capacity.

I hope the full raid5/6 code is stable in one or two years as well; time will tell.

My system has a 5x6TB raid 5 array. When I added the fifth disk, the rebalancing took days. Crossing my fingers and hoping there are no drive failures till the bugs are fixed.
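If it helps anyone else sitting through one of those multi-day balances: it can at least be watched and paused (the mount point is a placeholder):

```
# See how far the balance has gotten
btrfs balance status /mnt/pool

# Pause around heavy-usage hours and resume later if needed
btrfs balance pause /mnt/pool
btrfs balance resume /mnt/pool
```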

Latest post from Adam, who tested the latest for-linux(-next), aka 4.12, which will be released in about 9 weeks. It takes some more weeks to appear in elrepo and some more to be tested in Rockstor, so there is hope for end of August for stability/corruption improvements. But don’t believe in a data-loss-free FS at this time!!!:
