I am having a bit of a pickle over here.
After adding an additional HDD to my pool (data single, metadata raid1) in the GUI, a balance was initiated automatically. After less than an hour, a power surge in my neighborhood occurred.
Once the server came back up, all the shares were unmounted and the pool with the new drive reports as unmounted. Several other pools report the same status.
After trying to click on the pool in question I get the following in the GUI: Unknown internal error doing a GET to /api/pools/10/balance?page=1&format=json&page_size=9000&count=&snap_type=admin
Also the GUI Storage-Pools reports 2.99GB usage of the pool whereas prior to this the pool was filled with about 13TB of data.
At that point I tried: btrfs rescue super-recover
The result is that all superblocks are good. btrfs rescue chunk-recover gives me back a full, successful read and then crashes with some kind of systrace output (can be replicated).
Any attempt at btrfs check gives me an error that one of the drives is in use, so it cannot run. lsof states the device is not in use at all. btrfs filesystem show properly identifies the pool with the drives and their capacity and usage.
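One possible explanation, offered as an assumption rather than a diagnosis: btrfs check refuses to run on a filesystem the kernel still has mounted, while lsof only reports files that processes hold open, so a mount with no open files can look idle there. A minimal sketch for listing btrfs mounts from the kernel's mount table follows; the function name list_btrfs_mounts is hypothetical.

```shell
# List btrfs entries from a mounts table (defaults to /proc/mounts).
# Prints "device mountpoint options" for every btrfs mount found.
list_btrfs_mounts() {
    mounts_file="${1:-/proc/mounts}"
    # Field 3 of each line in /proc/mounts is the filesystem type.
    awk '$3 == "btrfs" { print $1, $2, $4 }' "$mounts_file"
}
```

Calling list_btrfs_mounts with no argument reads /proc/mounts. A multi-device pool is normally listed under only one of its member devices, so a drive may be busy without appearing there by name.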
Trying to use “skip_balance” as a mount option and rebooting is not effective.
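For reference, the manual equivalent of that option can be sketched as below. This is a hedged example, not a recommended recovery procedure: /dev/sdX stands in for any member device of the pool, /mnt2/poolname for the pool's mount point, and the helper names run and mount_and_cancel_balance are hypothetical. By default it only prints the commands.

```shell
# Print each command; execute it only when DRY_RUN=0 is set.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# dev: any member device of the pool; mnt: the pool's mount point.
mount_and_cancel_balance() {
    dev="$1"
    mnt="$2"
    run mount -o skip_balance "$dev" "$mnt"   # mount without resuming the interrupted balance
    run btrfs balance status "$mnt"           # report whether a balance is paused or running
    run btrfs balance cancel "$mnt"           # discard the interrupted balance
}
```

Running mount_and_cancel_balance /dev/sdX /mnt2/poolname prints the three commands for review; prefixing the call with DRY_RUN=0 (as root) would actually execute them.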
At this point I am clueless as to what to try next. The data are sort of backed up; however, I would prefer not to restore 15TB (with my connectivity and hardware performance this would mean a 3-week process). I care only about the “balanced” pool. All other data can be lost without any damage.
Any suggestion or request for logs would be more than appreciated.
Given you are in, as you say, a pickle, it might be worth trying to see if the newer kernel and btrfs backports of our Leap15.1 or Leap15.2beta testing rpm versions will help you out here.
No installer as yet, but the following forum thread, and the dev wiki it links to, now pretty much walk you through the customizations necessary from a server install of openSUSE Leap sufficient to install the Rockstor testing rpm. But the main relevance here is using their newer btrfs backports, which I’m assuming will do a much better job of de-pickling your pickle.
Also take great care with using some of those repair commands. You may find that the newer btrfs software in Leap will effectively sort you out. Or your skip_balance idea may end up working there where it hasn’t here. Also be super careful with the Leap install and ideally have only the intended system drive connected. Then, once all is configured and you have the system updated etc., you can shut down and connect the data drives, ready to power up and have another go at it. Our target distro for our pending re-launch is Leap15.2, which is still in beta. But openSUSE do some pretty aggressive backporting of the btrfs stuff to their kernels, so it's definitely worth a try: there have been hundreds of improvements in btrfs since the last Rockstor kernel update and most are likely included in the Leaps.
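To compare the old install with the Leap one, something like the following could be run on each. It is only a minimal sketch; show_btrfs_platform is a hypothetical helper name, and it just reports the running kernel release and the btrfs-progs version if the btrfs tool is installed.

```shell
# Print the running kernel release and, if installed, the btrfs-progs version.
show_btrfs_platform() {
    echo "kernel: $(uname -r)"
    if command -v btrfs >/dev/null 2>&1; then
        echo "btrfs-progs: $(btrfs --version)"
    else
        echo "btrfs-progs: not installed"
    fi
}
```

Running show_btrfs_platform on both systems gives a quick side-by-side of how much newer the Leap kernel and tools are.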
Sorry I can’t be of more help right now, but that should at least give you a more up-to-date btrfs platform to do the repair. We have almost feature parity in the testing channel, so if it does end up sorting you out you may be able to hang in there with that install until we release a stable ‘Built on openSUSE’ rpm version, which is likely not to be much longer.
I was finally physically at the system in question.
As @phillxnet suggested I installed Leap 15.2 (including zstd in the gui would be awesome btw) and Rockstor on it.
The system recognised the BTRFS filesystem on all the drives in the mentioned pool. After the import attempt I received a non-descript error.
I rebooted just for the sake of grabbing a cigarette to help me think. In the console output while shutting down I saw a mention of /mnt2/poolname.
The second import attempt gave me a process deadlock. So I bounced the rockstor service and tried again. The pool was imported (shares unmounted).
Booting back to my initial rockstor instance the pool was recognised and accessible.
All the shares however report unmounted. Manual mount seems to work. And after a reboot of the system (yes, I am a heavy smoker) the mounts came back as mounted.
This concludes my troubles and, as always @phillxnet, you have been most helpful.
Thank you for your support. I am looking forward to the openSUSE release of Rockstor.