Btrfs Raid 5/6 warning

FREE_NORWAY · July 5, 2016, 11:06pm

Hi

What is the Rockstor team’s response to this warning on the btrfs mailing list?
http://www.spinics.net/lists/linux-btrfs/msg56546.html
Has anybody seen this behavior?

Yoshi · July 6, 2016, 5:36am

This is scary. I think Rockstor should also include a big fat warning for now that users really shouldn’t use raid5/6 with BTRFS.

KarstenV · July 6, 2016, 6:44am

This also scares me a little since my pool is RAID6.

I haven’t experienced problems though.

And my RAID6 did recover from a serious crash at the very start of my Rockstor experience, using the tools available at the time (I don’t remember the exact procedure I used, but it ended up with a scrub repairing the damage to the filesystem).

This is something that needs a thorough looking into by the people programming and maintaining BTRFS.

As I read the reply to the bug report, the error should be able to correct itself by running 2 scrubs in a row. But with the reduced scrubbing speed on RAID56, this would be a week-long procedure on my pool.

FREE_NORWAY · July 6, 2016, 8:22am

I wanted to use Rockstor as a replacement for the Windows/hardware Raid setup I use today.
The features of btrfs are what I was looking for, but since I started with it in May I have got the feeling it isn’t quite ready yet.
The other thing is that the status of btrfs is not communicated very well; I have spent many, many hours trying to find answers.
I hope the Rockstor team will make a statement!

KarstenV · July 6, 2016, 9:29am

Well, the only statement they can make is what they’ve already said.

That RAID56 is still considered to be in development/testing, and should be used with caution.

BTRFS development is completely separate from Rockstor, and I don’t think the developers listen much to what Rockstor has to say.

But of course Rockstor has to be clear that using RAID56 still has its risks, and that it should only be used after careful consideration.

phillxnet · July 6, 2016, 11:48am

@FREE_NORWAY Welcome to the Rockstor community. Yes, there have been a number of concerns about the production readiness of btrfs raid 5/6, and also a number of discussions on this forum where a few participants have expressed a degree of satisfaction with these parity variants of btrfs, bar the performance issues with scrubbing of course, which have been a long-standing issue. Prior to the indicated thread, the recent communications re raid 5/6 on the linux-btrfs mailing list were edging towards it being considered stable; this is evident in the changes that happened on the btrfs wiki entry on the 24th June as a result of that thread, in brief:

the subtraction of "...should be usable for most purposes..."

https://btrfs.wiki.kernel.org/index.php?title=RAID56&diff=next&oldid=29769

and the addition of:

“The first two of these problems mean that the parity RAID code is not suitable for any system which might encounter unplanned shutdowns (power failure, kernel lock-up), and it should not be considered production-ready.”

https://btrfs.wiki.kernel.org/index.php?title=RAID56&curid=5631&diff=30429&oldid=30425

and:

“It is recommended that parity RAID be used only for testing purposes.”

We also have an addition on the main page relating to the parity (raid5/6) variants of btrfs:

** Single and Dual Parity implementations (experimental, not production-ready)
with the bold bit added on the same day as the above changes.

https://btrfs.wiki.kernel.org/index.php?title=Main_Page&curid=1&diff=30427&oldid=30343

I’m afraid I dropped the ball on this one: I had these changes lined up ready to post in one of our ongoing forum threads on btrfs raid 5/6 but alas failed to do so in a timely manner, and am currently running a few days late. There are now a couple of issues open to address this within the docs and the Web-UI that I hope to make some progress on shortly.

Thanks for starting a newer and fresher thread more specifically on this recent issue. Do please consider others’ experiences and statements on these parity levels in the linked forum thread. As @KarstenV kindly points out in this thread, the general, more cautious recommendation (i.e. for production) is to not use the btrfs parity variants, raid 5/6.

That thread has in turn been linked to from a number of other less focused threads.

As it goes, the Rockstor docs recently received, merged, and rapidly published a generous and very frank addition relating to bugs in btrfs raid 5/6 and their impact on the raid 5/6 recovery process by @Fantastitech via issue:

github.com/rockstor/rockstor-doc

Raid5/6 recovery documentation is incorrect and/or out of date

opened 07:36PM - 18 Jun 16 UTC

closed 02:16PM - 22 Jun 16 UTC

Fantastitech

I've just finished a couple of pretty major recovery operations and I noticed, after much research, that the documentation for data recovery on RAID5/6 arrays had inaccuracies, out of date information, and unnecessary redundancies.

## Mount command

Step 1 and step 12 instruct to mount the pool with:

```
# mount /dev/sda /mnt2/mypool
```

However BTRFS will not mount an array with a missing, failed, or otherwise unmountable drive in a RAID5/6 array without passing the `degraded` mount option. A user with an unmountable drive would not be able to get past step 1, and step 2 does not make any sense in context as it could be run before step 1 to determine if a device is missing or not, in which case step 1 would not work.

## Double rebalancing

Step 3 says to delete the failed device, which in a RAID5/6 array will trigger a rebalance to rewrite the parity information for the new amount of drives. In the case of this document, that brings the array down to 3 drives. In step 13 and 14 the user is instructed to add the new device then manually trigger a second rebalance. This puts additional wear on the drives, increases the chance of catastrophic failure, doubles the length of the entire procedure, and if there is not enough space on the array to accommodate the new parity data it will fail.

A much more practical procedure would be to add the new drive to the pool, then delete the failed/missing device. Since failed devices can cause kernel panics, the most sane solution is to remove the failed device from the system completely, mount degraded, add the new device, then delete missing devices, which will trigger a single rebalance rebuilding parity information with the new drive.

## Redundant RAID5 and RAID6 sections.

There is a section for RAID5 data recovery, then a second section for RAID6 which just describes that it's the same procedure as RAID5. It would be much more clear, clean, and concise to include information about RAID5 and RAID6 in the same section.

---

I've rewritten the RAID5/6 recovery documentation to correct these issues, clarify some details, and added an extra troubleshooting step for unmountable arrays in the event of superblock issues that prevent a mount command from finding existing drives.

where a couple of known bugs in raid5/6 can hamper the recovery process.
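
As a rough illustration of the single-rebalance order proposed in that issue (remove the failed drive, mount degraded, add the new device, then delete the missing one), here is a minimal dry-run sketch. The device names and the `recovery_commands` helper are hypothetical. It only builds and prints the command lines and executes nothing, so check each step against the current Rockstor and btrfs docs before trying it on a real pool.

```python
import shlex


def recovery_commands(surviving_dev, new_dev, mount_point):
    """Return the command sequence for a single-rebalance RAID5/6 disk swap.

    Assumes the failed drive has already been physically removed.
    Device paths and mount point are placeholders supplied by the caller.
    """
    return [
        # A pool with a missing member only mounts with the degraded option.
        ["mount", "-o", "degraded", surviving_dev, mount_point],
        # Add the replacement first so the pool never shrinks.
        ["btrfs", "device", "add", new_dev, mount_point],
        # Per the issue, deleting the missing device triggers a single rebalance.
        ["btrfs", "device", "delete", "missing", mount_point],
    ]


if __name__ == "__main__":
    # Hypothetical example values; nothing is executed, only printed.
    for cmd in recovery_commands("/dev/sda", "/dev/sdd", "/mnt2/mypool"):
        print(shlex.join(cmd))
```

Keeping it as a dry run is deliberate: given the known raid5/6 bugs, each command is worth reading and confirming by hand rather than scripting blindly.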

It can be very disconcerting looking as closely as the developers do at the underlying technologies of many modern projects, given their complexity, and Rockstor is, as all non-trivial technologies are, dependent on very many projects doing the same with their technologies. It is also worth noting that even ext4 has bugs found regularly, but they are often very much corner cases, and this is what btrfs is definitely moving towards.

Thanks for the juggling prompt here; this thread should be updated with progress on the referenced issues.

Hope that helps.

Dragon2611 · July 6, 2016, 4:09pm

Oh crap, is pretty much all I can say to that, as my install is running raid5.

Edit: Although it will waste a load of space, it looks like I can change it to raid1. Given this new evidence relating to the stability of raid5, I think I will.

Makes me wonder if anything got corrupted when replacing a failing disk a month or so back (although that disk was still working, it just had bad sectors).

ScottyEdmonds · July 7, 2016, 2:40pm

Not surprised… BTRFS has its bugs just like every other option out there! I primarily use Raid0 in my Rockstor NAS and back up important data every night to Backblaze B2 using HashBackup… I have a Raid5 array with some older disks and I can’t seem to kill it…

This doesn’t change my mind about Rockstor or BTRFS in any way; I still trust my data to be as safe as possible on my Rockstor NAS… I’ve even been thinking about deploying a secondary Rockstor NAS to use the replication feature, buuut Backblaze B2 @ $0.005/GB is still a better ROI.

Keep in mind I’m in Canada.
The best $/GB I’ve ever gotten was a 4TB HGST at $0.0415/GB (I know it could be cheaper if I went with Seagate, but I have sworn to never buy another one of their drives again), so at that rate it’s ~8 months of B2 before it costs as much as the drive, without accounting for power usage + other hardware costs… Of course this isn’t including the possibility of needing to recover, which would be expensive, but I’m optimistic and just preparing for the worst and hoping for the best! lol
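
To make the ~8 month figure concrete, here is a small sketch of the break-even arithmetic. It assumes the $0.005/GB B2 price is per month (the post does not say so explicitly) and, like the post, ignores power, other hardware, and recovery costs. `break_even_months` is a name made up for this example.

```python
def break_even_months(drive_cost_per_gb, cloud_cost_per_gb_month):
    """Months of cloud storage that cost as much as buying the drive outright."""
    return drive_cost_per_gb / cloud_cost_per_gb_month


if __name__ == "__main__":
    # Figures quoted in the post: 4TB HGST at $0.0415/GB, B2 at $0.005/GB.
    months = break_even_months(0.0415, 0.005)
    print(f"break-even after about {months:.1f} months")
```

With the quoted prices this works out to roughly 8.3 months, which matches the ~8 months mentioned.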

phillxnet · July 11, 2016, 3:49pm

Just a slightly belated update: as of the Rockstor 3.8-14.03 testing channel update, the text in the Web-UI by each raid selection point now includes “… raid5 and raid6 are not production-ready.”, in line with the new status of these parity raid levels on the btrfs wiki main page. The official Rockstor docs already had warnings about raid5/6 maturity levels, but these have also been updated to be more in keeping with the official btrfs wiki wording. The FAQ section also received some updates along these same lines. These changes were made against the GitHub issues I previously referenced in this thread.

Thanks to all concerned, including the btrfs developers and documentors, for bringing this matter to the fore.

Yoshi · September 13, 2016, 9:13am

btrfs got an official status page. Maybe it should be linked in the rockstor documents

https://btrfs.wiki.kernel.org/index.php/Status

phillxnet · September 13, 2016, 9:24am

@Yoshi Yes, that’s a good idea. I did see this mentioned on the linux-btrfs mailing list and thought the same. But given it’s only been up for a day or two, let’s give it time to settle and the general consensus to pan out; I’ve created an issue in our rockstor-doc repo so that it doesn’t get lost:

Do please consider tending to this one if you fancy, just make a note in the issue if you embark on it so that we don’t get avoidable overlapping effort.

Well spotted and thanks for bringing this up.

Yoshi · November 17, 2016, 6:18am

Some progress is being made: https://www.phoronix.com/scan.php?page=news_item&px=Btrfs-RAID5-RAID6-Fixed

Hopefully this will get merged into Linux 4.10

Tomasz_Kusmierz · November 18, 2016, 1:41am

This fixes only a single issue, where scrub could corrupt your corrupted FS even further by not locking the whole stripe… this does NOT fix any of the other issues.

And please stop pasting this guy’s article in multiple threads, because I’m starting to think you’re getting a cut of the advertising profits. If you care to show us some real progress on btrfs, quote the mailing list OR the btrfs wiki; other sources are considered noise.

Again, btrfs RAID 5 & 6 profiles are at the moment considered experimental / very unstable / very unpredictable. That profile can AND WILL eat your data in the long run, will leave you with errors that scrub is unable to recover from, and will silently provide false recovery data, fooling scrub into further corruption of the FS.

Until it is marked all green in the btrfs wiki, please don’t come here proclaiming some sort of victory based on some misinformed person’s random article.

Yoshi · November 18, 2016, 8:06am

This is the most ridiculous thing someone said about me ever… For advertising? Are you serious? Maybe you should read again what I said.

Phil asked me if I could keep an eye on the progress that is being made. And that is what I’ve done. I said SOME PROGRESS, nothing more, nothing less.

Reading and understanding is the key…

Flyer · November 18, 2016, 8:32am

@Tomasz_Kusmierz ???

Ok, having quotes from the btrfs mailing list or wiki would probably be better (the wiki will probably get updated once these patches are merged into the kernel, so by 4.10). On the other hand, not all Rockstor users have to be “IT guys”, so an article explaining it is good.

Leaving aside the btrfs mailing list vs. blog articles question, why did you react that way?

My reaction to last posts? “Nice, there’s a patch partially fixing btrfs issues (like every patch on Rockstor against 200 open issues, not fixing all things, but fixing!) and our forum users take care of btrfs and Rockstor development”

Believe me Tom, that was a ‘little’ unkind.
M.

Tomasz_Kusmierz · November 18, 2016, 1:46pm

Ok, my apologies to Yoshi. I did jump down your throat for little reason.

Anyway, in my defense, I’m simply pissed off about the current situation in the community: somebody comes in and proclaims the “raid5 fix” (the article actually states that “Btrfs RAID5/RAID6 Support Finally Get Fixed”), and then a myriad of people come back saying “I beg you please please save my non backed up data from raid 5”, and you want to punch in the face that douche that wrote “yeah it’s fixed”. And the guy that wrote that article is doing it for advertisement; there are 5 advertisement spots around this misleading article.

Sorry again, Yoshi, that I’ve put you in the same bag with him. I shouldn’t have.

@Flyer … I know it was more than unkind

Yoshi · November 18, 2016, 5:29pm

@Tomasz_Kusmierz
Let’s forget it, no hard feelings here. I can understand your point better now. Maybe the article is more positive than it should be, but I think Phoronix.com isn’t the worst source for information like that.

Let’s say btrfs raid5/6 is a little less flawed than before with this fix. But it is still far from being stable.

Tomasz_Kusmierz · November 18, 2016, 5:47pm

Again, sorry for that, mate. I’m rooting for raid6 big time: right now I’ve got 8 disks in my personal array, and due to raid10 I only get 4 disks’ capacity … and let’s face it, I would like to get more. Also if 2 disks die, I’m colloquially f*** …
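
For a sense of what is at stake, here is a minimal sketch of nominal usable capacity per profile. It assumes equal-sized disks and ignores metadata overhead, and the `usable_disks` helper is hypothetical.

```python
def usable_disks(profile, n_disks):
    """Nominal number of disks' worth of usable space for equal-sized disks."""
    if profile == "raid0":
        return n_disks
    if profile in ("raid1", "raid10"):
        # btrfs raid1/raid10 keep two copies of every block.
        return n_disks / 2
    if profile == "raid5":
        return n_disks - 1  # one disk's worth of parity
    if profile == "raid6":
        return n_disks - 2  # two disks' worth of parity
    raise ValueError(f"unknown profile: {profile}")


if __name__ == "__main__":
    for profile in ("raid10", "raid6"):
        print(profile, usable_disks(profile, 8))
```

For the 8-disk array mentioned above, that is 4 disks of capacity under raid10 versus 6 under raid6.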

suman · November 18, 2016, 6:27pm

Thanks for raising your concerns and for working out your differences, @Tomasz_Kusmierz and @Yoshi. Obviously, we are all on the same team here, wishing for a more stable btrfs and a better Rockstor. While it can be frustrating, let’s be respectful to the open source community as a whole and not forget that it’s hard work and takes time, especially so with filesystem development.

Karlodun · November 27, 2016, 5:27am

Simplified version:

Don’t use RAID 5 / 6 BTRFS modes! Use RAID 0, 1 or 10.
RAID is NOT backup!
Free space in BTRFS is complicated.
Compression issues.

RAID56 status:
The parity RAID code has multiple serious data-loss bugs in it. It should not be used for anything other than testing purposes. - https://btrfs.wiki.kernel.org/index.php/RAID56

RAID is not backup:
“RAID guards against one kind of hardware failure. There’s lots of failure modes that it doesn’t guard against.
File corruption
Human error (deleting files by mistake)
Catastrophic damage (someone dumps water onto the server)
Viruses and other malware
Software bugs that wipe out data
Hardware problems that wipe out data or cause hardware damage (controller malfunctions, firmware bugs, voltage spikes, …)
and more.” - http://serverfault.com/questions/2888/why-is-raid-not-a-backup

How much free space do I have? or My filesystem is full, and I’ve put almost nothing into it!
Short story: data and metadata are stored separately. If there is no space left for either one of them, your disk is full. If you need to rebalance, you need free disk space for both.
My personal suggestion:
keep from ~10% of storage space free with a pure SSD solution to ~20% free with HDDs, and something in between for a mixed solution.
Long story:
https://btrfs.wiki.kernel.org/index.php/FAQ#How_much_free_space_do_I_have.3F
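
The rule of thumb above can be turned into a quick check. This is only a sketch: the thresholds are the personal suggestion given above, and `shutil.disk_usage` reports a single filesystem-wide figure, which on btrfs is only an approximation since data and metadata are allocated separately (`btrfs filesystem usage` gives the detailed view). The `has_headroom` helper and the mount path are assumptions for illustration.

```python
import shutil

# Suggested minimum free fraction per storage type (rule of thumb from the post).
# The "mixed" value is an assumed midpoint; the post only says "in between".
MIN_FREE = {"ssd": 0.10, "mixed": 0.15, "hdd": 0.20}


def has_headroom(total_bytes, free_bytes, storage="hdd"):
    """True if the free fraction meets the suggested minimum for this storage type."""
    return free_bytes / total_bytes >= MIN_FREE[storage]


if __name__ == "__main__":
    usage = shutil.disk_usage("/")  # replace "/" with your pool's mount point
    print("enough headroom:", has_headroom(usage.total, usage.free, "hdd"))
```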

Compression issues:
"Do your own benchmarks. LZO seems to give satisfying results for general use."
From personal experience:

Concerning data compression to save space: a 4TB hdd costs around $150, a 2TB costs $80, plus installation costs and managerial expenses: ~$180 for 4TB, and ~$110 for 2TB.
Most of the files I save are already compressed (gif/jpg/docx/mpeg).
So the question is, will data compression pay off as a way to save disk space?
For the data I tried to compress, the compression ratio suggested that I should just buy an additional hdd if I ever run out of space, especially because some additional data is stored alongside your compressed data…
However, compression did allow me to increase the r/w speeds. It depends heavily on your setup and the data you compress. As stated above: do your own benchmarks!
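
A first-pass benchmark could look like the sketch below. It uses Python's `zlib` as a stand-in, because LZO is not in the standard library, so the numbers will not match what btrfs achieves; the point is only to see how compressible your own files are. `compression_ratio` is a hypothetical helper.

```python
import os
import zlib


def compression_ratio(data, level=1):
    """Compressed size divided by original size (lower means more savings)."""
    if not data:
        raise ValueError("need non-empty data")
    return len(zlib.compress(data, level)) / len(data)


if __name__ == "__main__":
    text_like = b"the quick brown fox jumps over the lazy dog\n" * 10_000
    # Random bytes stand in for already-compressed media such as jpg or mpeg.
    media_like = os.urandom(len(text_like))
    print(f"repetitive text: {compression_ratio(text_like):.3f}")
    print(f"random bytes:    {compression_ratio(media_like):.3f}")
```

Run it on a sample of your real files rather than synthetic data before deciding whether compression or another disk is the better buy.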