BTRFS RAID5 issue

Hello,
Just had an HBA card failure on a host where I've got some RAID devices. The HBA has been replaced, and on one RAID5 array one of the HDDs now has IO failures. In theory the RAID5 should work in a degraded state, but it seems there is another drive affected.
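The degraded mount attempt looked roughly like this (the mount point here is illustrative):

# mount -o degraded /dev/sdp /mnt/hdd14

and produced: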

[29095.898222] BTRFS info (device sdp): allowing degraded mounts
[29095.898225] BTRFS info (device sdp): use no compression, level 0
[29095.898226] BTRFS info (device sdp): disk space caching is enabled
[29095.898227] BTRFS info (device sdp): has skinny extents
[29095.899308] BTRFS warning (device sdp): devid 1 uuid abc0612f-26ae-432c-998f-2982859d9734 is missing
[29096.680236] BTRFS critical (device sdp): corrupt leaf: root=7 block=11411345801216 slot=1, unexpected item end, have 16124 expect 16118
[29096.680238] BTRFS error (device sdp): block=11411345801216 read time tree block corruption detected
[29096.680311] BTRFS error (device sdp): failed to verify dev extents against chunks: -5
[29096.808034] BTRFS error (device sdp): open_ctree failed

Tried to recover the sdp device:
# btrfs check /dev/sdp
Opening filesystem to check...
warning, device 1 is missing
checksum verify failed on 24231936 found B3CF6D9D wanted 572CD02B
checksum verify failed on 24231936 found B3CF6D9D wanted 572CD02B
bad tree block 24231936, bytenr mismatch, want=24231936, have=0
Couldn't read chunk tree
ERROR: cannot open file system
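For completeness, check can also be pointed at the backup roots, and btrfs-find-root can scan for old tree roots; a sketch (with the chunk tree itself unreadable, both may well fail the same way):

# btrfs check --backup /dev/sdp
# btrfs-find-root /dev/sdp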

# btrfs rescue zero-log /dev/sdp
warning, device 1 is missing
checksum verify failed on 24231936 found B3CF6D9D wanted 572CD02B
checksum verify failed on 24231936 found B3CF6D9D wanted 572CD02B
bad tree block 24231936, bytenr mismatch, want=24231936, have=0
Couldn't read chunk tree
ERROR: could not open ctree

# btrfs check --repair /dev/sdp
enabling repair mode
Opening filesystem to check...
warning, device 1 is missing
checksum verify failed on 24231936 found B3CF6D9D wanted 572CD02B
checksum verify failed on 24231936 found B3CF6D9D wanted 572CD02B
bad tree block 24231936, bytenr mismatch, want=24231936, have=0
Couldn't read chunk tree
ERROR: cannot open file system


# btrfs rescue super-recover /dev/sdp
All supers are valid, no need to recover
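Since all the supers are reported valid, they can at least be inspected directly; a sketch with dump-super (-f prints the full superblock, -a all copies):

# btrfs inspect-internal dump-super -fa /dev/sdp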

Tried:
# btrfs rescue chunk-recover /dev/sdp
Scanning: 542572544 in dev0, 560910336 in dev1, 543621120 in dev2, 540463104 in dev3
but got a segmentation fault after a while.

Any clue from anyone? :slight_smile:

# btrfs fi show

warning, device 1 is missing
checksum verify failed on 24231936 found B3CF6D9D wanted 572CD02B
checksum verify failed on 24231936 found B3CF6D9D wanted 572CD02B
bad tree block 24231936, bytenr mismatch, want=24231936, have=0
Couldn't read chunk tree
Label: 'HDD14'  uuid: 710c2b48-a0c9-427f-9e10-e744810d05f4
        Total devices 5 FS bytes used 42.95TiB
        devid    2 size 12.73TiB used 10.74TiB path /dev/sdp
        devid    3 size 12.73TiB used 10.74TiB path /dev/sdj
        devid    4 size 12.73TiB used 10.74TiB path /dev/sdl
        devid    5 size 12.73TiB used 10.74TiB path /dev/sdk
        *** Some devices missing

What version of btrfs-progs are you running? And have you tried to recover the chunk tree with something like this (for reference: https://btrfs.wiki.kernel.org/index.php/Btrees)?

btrfs rescue chunk-recover -v /dev/sdp (the -v option gives more verbose output)

https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs-rescue

# rpm -qa|grep -i btrfs
btrfsmaintenance-0.4.2-lp152.2.2.noarch
btrfsprogs-udev-rules-4.19.1-lp152.6.3.1.noarch
libbtrfs0-4.19.1-lp152.6.3.1.x86_64
btrfsprogs-4.19.1-lp152.6.3.1.x86_64

Yes, I have tried that already; it's listed above.

Thanks

@shocker, oops, somehow overlooked that. Sorry about that. The only other option I can see is that a newer kernel and btrfs-progs might allow a better chance of recovery, since there have been quite a few improvements in the last 2 years … but that can of course be risky in itself.
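On Leap 15.2, a newer btrfs-progs can usually be pulled in from the openSUSE filesystems repository; something like this (repository path from memory, so please verify it before adding):

# zypper ar -f https://download.opensuse.org/repositories/filesystems/openSUSE_Leap_15.2/ filesystems
# zypper in --from filesystems btrfsprogs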
I am running kernel 5.11.12 and have the 5.9 btrfs-progs installed under the last Rockstor 3 version … but I also switched from RAID5/6 to RAID10 when I upgraded all my drives last year … so I don't have much more experience in that space anymore.
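For reference, the RAID5/6 to RAID10 switch itself was just a rebalance with convert filters on the mounted pool; roughly (the mount point is illustrative):

# btrfs balance start -dconvert=raid10 -mconvert=raid10 /mnt/pool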

Installed the latest versions and I’ll try again with the chunk recovery. Thank you for that.
# rpm -qa|grep btrfs
btrfsmaintenance-0.5-lp152.67.3.noarch
btrfsprogs-5.11-lp152.339.3.x86_64
btrfsprogs-udev-rules-5.11-lp152.339.3.noarch
libbtrfs0-4.19.1-lp152.6.3.1.x86_64

Yes, I keep thinking about switching to ZFS; it's the second time I've had a failure with raid56 on btrfs :slight_smile:

@shocker, is there a particular need you have for RAID56 as opposed to the other options up to RAID10? I ran on RAID56 for almost 4 years (luckily without issues) before switching over to RAID10. RAID56 is still considered not “stable”, but at the same time quite a few improvements have been made that didn't make it into Rockstor 3 due to the decision to replatform onto openSUSE. Personally, I have continued to update the kernel and btrfs programs manually over time, to ensure that even before moving to Rockstor 4 I would benefit from additional fixes and performance improvements. Then again, my appliance is for home use, so it's not like I am moving terabytes of data every day, which probably greatly reduces the likelihood of catastrophic failures in general.

The reason is cost :slight_smile:
I have ±800TB, and running this as raid10 would be a huge increase in cost.
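Back-of-the-envelope with 14TB drives like the ones in this pool (raw usable capacity, ignoring filesystem overhead):

raid5, 5 drives: (5 - 1) x 14TB = 56TB usable (~80%)
raid10, 5 drives: (5 x 14TB) / 2 = 35TB usable (50%)

Scaled to ±800TB, that difference adds up quickly.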

Now, with everything up to date, the segmentation fault occurred again.
# btrfs rescue chunk-recover /dev/sdp
Scanning: 2852180512768 in dev0, DONE in dev1, 2834515087360 in dev2, 2852750737408 in dev3
Segmentation fault (core dumped)

Running kernel 5.11.12-1.g92a542e-default
# rpm -qa|grep btrfs
btrfsmaintenance-0.5-lp152.67.3.noarch
btrfsprogs-5.11-lp152.339.3.x86_64
btrfsprogs-udev-rules-5.11-lp152.339.3.noarch
libbtrfs0-5.11-lp152.339.3.x86_64
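If it keeps segfaulting, a backtrace from the core dump would be useful for an upstream bug report; a sketch, assuming systemd-coredump is active and gdb is installed:

# coredumpctl list btrfs
# coredumpctl gdb btrfs

then type bt at the (gdb) prompt to get the backtrace.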

@shocker & @Hooverdan

In such circumstances I've often recommended a Tumbleweed live instance. It's a shame our Tumbleweed variant fell foul of our still being on Python 2 and Tumbleweed dropping that. I'll try to revive that in time, but for now we'd better focus, especially given our current contributor count.
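Roughly, and only as a sketch (the exact ISO file name may differ, and /dev/sdX is a placeholder for the USB stick): grab a live/rescue image from https://download.opensuse.org/tumbleweed/iso/ and write it out with something like

# dd if=openSUSE-Tumbleweed-Rescue-CD-x86_64-Current.iso of=/dev/sdX bs=4M status=progress

then boot from it to get a current kernel and btrfs-progs for the recovery attempts.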

Hope that helps.

The reason for the segmentation fault was that the affected drive also had IO issues. Strangely, nothing was shown in dmesg. I have destroyed the pool and re-created it with only 4 drives instead of 5, and the drive I was trying to recover the checksums from is getting:

blk_update_request: I/O error, dev sdp, sector 49675776 op 0x0:(READ) flags 0x80700 phys_seg 16 prio class 0

This is also the second time that two drives have crashed at the same time :slight_smile:
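For the record, the drives' own SMART error logs are worth checking too; a sketch with smartmontools:

# smartctl -x /dev/sdp

This prints the full SMART data, including the device error log and the reallocated/pending sector counts.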

@shocker Thanks for the update.

Bit of bad luck there. Maybe a change in your drive make/model selection is in order?

Some drives are known to have just-plain-wrong firmware that out-and-out lies about what has actually been written to disk. That can't be good, especially if one is a file system, or in our case, a file-system user :).
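If a drive is suspected of acknowledging writes it hasn't actually persisted, one mitigation (at a performance cost) is to disable its volatile write cache; a sketch with hdparm:

# hdparm -W0 /dev/sdp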

Keep us posted.

I had 5 TOSHIBA MG07ACA14TA drives in a raid5. It seems two of them died at the same time :slight_smile:
I've replaced them with Seagate ST16000NM001G drives now; I hope these are better :slight_smile:
