Pool doesn't mount (raid 1) - can't add new disk to pool

Brief description of the problem

Trying to add a new disk to a RAID1 pool.
One disk has gone bad, so the pool is not mounted. I get an error when I try to add a disk to the (unmounted) pool.

Detailed step by step instructions to reproduce the problem

I go to Pools and click on the pool (“media”), then click “Resize/ReRaid Pool”. I click “2. Add disks (optional RAID change)” and pick “no” for “Would you like to change RAID level of this pool also?”. I select the “new” (never-mounted) disk; when I resize, I get the attached error.

Web-UI screenshot


Error Traceback provided on the Web-UI

Traceback (most recent call last):
  File "/opt/rockstor/.venv/lib/python2.7/site-packages/gunicorn/workers/sync.py", line 68, in run_for_one
    self.accept(listener)
  File "/opt/rockstor/.venv/lib/python2.7/site-packages/gunicorn/workers/sync.py", line 27, in accept
    client, addr = listener.accept()
  File "/usr/lib64/python2.7/socket.py", line 206, in accept
    sock, addr = self._sock.accept()
error: [Errno 11] Resource temporarily unavailable

@bhp, hello again.

I think the gunicorn message is more a symptom rather than the root cause.

What version of Rockstor are you running (given the use of Python 2.7, I assume it’s a somewhat older version)?

Also, how many disks do you have in your RAID1 pool: 2 (including the failed one) or more? Does the failed disk show as detached?

You could consider attempting to mount the pool in degraded state (though if you lost 1 of 2 disks that might not be possible).

I left out the screen shot.
openSUSE Leap Linux: 5.14.21-150400.24.100-default
Rockstor 4.6.1-0
Raid 1 using (only) two 6TB disks…
It will be easier with screen shots, see:
ScreenShots

Thanks for the additional info. I’m not near any system right now, but I suspect you need to manually mount the pool in degraded state (as you can see from the mount options in the details screenshot).

You can do this from the command line (either via the WebUI shell or an SSH client like PuTTY).

Then you could add the new disk (also via the command line) and remove the missing one.

Thanks for your quick response.
I’m not a total moron, but I can’t mount the disk.
I did this:
mount -t btrfs -o rw,degraded,recovery,nosuid,nodev,nofail,x-gvfs-show /dev/disk/by-uuid/9ef2c956-56e3-4105-b094-9253a6caf102 /mnt2/media

Got this:
mount: /mnt2/media: wrong fs type, bad option, bad superblock on /dev/sdc, missing codepage or helper program, or other error.

I’m pretty sure you’re not a moron :slight_smile:

What output does btrfs fi show give you? I assume the UUID you have there represents the pool UUID?

You probably want to try to mount like this, though:

The device you still have seems to be /dev/sdc, so you could mount it like this. (In a functioning pool with all devices present, you can pick any one of the pool drives and it will mount the entire pool, like in the WebUI.)

mount -o degraded,rw /dev/sdc /mnt2/media

If that succeeds, you could remove the dead drive with:

btrfs device delete missing /mnt2/media

Then add the new disk to the RAID (you should be able to find the new device name using lsblk), e.g.:

btrfs device add /dev/sdd /mnt2/media

I don’t think the recovery mount option still works, as it was deprecated starting around kernel 4.5 or so (and you’re on 5.x).
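For what it’s worth, on newer kernels the rough replacement for recovery is the usebackuproot mount option. A sketch, assuming the same device and mount point as above (untested on your system):

```shell
# "recovery" was renamed to "usebackuproot" in newer kernels;
# this tells btrfs to try the backup tree roots if the primary is bad
mount -o degraded,usebackuproot /dev/sdc /mnt2/media
```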

I think if that all works, you’ll want to reboot so the pool mounts based on how it was originally configured in Rockstor. You should also start a balance (or have an automated job via the WebUI take care of it).
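If you prefer to do the balance from the command line too, a minimal sketch (assuming the pool stays mounted at /mnt2/media):

```shell
# Start a full balance to redistribute data across the devices
btrfs balance start /mnt2/media
# Check progress from another shell
btrfs balance status /mnt2/media
```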


No luck. Here is TMI (in no particular order).
Any other suggestions?
Time for btrfs check --repair ?

Command: btrfs fi show
results:
Label: 'ROOT' uuid: ae33ccad-e775-4ca3-b1bc-07c0e993e1e7
Total devices 1 FS bytes used 11.71GiB
devid 1 size 463.70GiB used 13.38GiB path /dev/sda4

Label: 'Backups' uuid: 4932815b-58b9-479b-8ba7-63d7bb0d3a00
Total devices 1 FS bytes used 406.62GiB
devid 1 size 1.82TiB used 435.02GiB path /dev/sdd

warning, device 1 is missing
parent transid verify failed on 3767092281344 wanted 322575 found 319524
parent transid verify failed on 3767092281344 wanted 322575 found 319524
Ignoring transid failure
Label: 'media' uuid: 9ef2c956-56e3-4105-b094-9253a6caf102
Total devices 2 FS bytes used 2.11TiB
devid 2 size 5.46TiB used 2.11TiB path /dev/sdc
*** Some devices missing


command: mount -o degraded,rw /dev/sdc /mnt2/media
results:
mount: /mnt2/media: wrong fs type, bad option, bad superblock on /dev/sdc, missing codepage or helper program, or other error.


command: btrfs check
results (truncated):
parent transid verify failed on 3767092281344 wanted 322575 found 319524
parent transid verify failed on 3767092281344 wanted 322575 found 319524
Ignoring transid failure
[1/7] checking root items
parent transid verify failed on 3767216717824 wanted 322332 found 319202
parent transid verify failed on 3767216717824 wanted 322332 found 319202
Ignoring transid failure

ERROR: child eb corrupted: parent bytenr=3322817986560 item=36 parent level=1 child bytenr=3323112243200 child level=2
ERROR: failed to repair root items: Input/output error

Extent back ref already exists for 3767432806400 parent 0 root 2
Extent back ref already exists for 3767492788224 parent 0 root 2

extent back ref already exists for 3767047602176 parent 0 root 7
extent back ref already exists for 3767048159232 parent 0 root 2
parent transid verify failed on 3767049011200 wanted 322561 found 319497
parent transid verify failed on 3767049011200 wanted 322561 found 319497
Ignoring transid failure
Segmentation fault


command: dmesg
results (truncated)
[181537.748215] BTRFS error (device sdc): open_ctree failed
[181540.760860] BTRFS info (device sdc): use no compression
[181540.760961] BTRFS info (device sdc): disk space caching is enabled
[181540.761014] BTRFS info (device sdc): has skinny extents
[181540.762885] BTRFS error (device sdc): devid 1 uuid 9b3da543-8e1e-43ba-9139-45d539825833 is missing
[181540.762969] BTRFS error (device sdc): failed to read chunk tree: -2


command: lsblk
results (truncated):
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda      8:0    0 465.8G  0 disk
├─sda1   8:1    0     2M  0 part
├─sda2   8:2    0    64M  0 part /boot/efi
├─sda3   8:3    0     2G  0 part [SWAP]
└─sda4   8:4    0 463.7G  0 part /mnt2/Rock-ons/btrfs
                                 /mnt2/home
                                 /mnt2/.snapshots/1/snapshot/mnt2/Rock-ons/btrfs/subvolumes/1c9ad62f170c16db4749730e9cb5c0a6037ba8632fb99980a186cdac1a8e163c
                                 /mnt2/.snapshots/1/snapshot/mnt2/Rock-ons/btrfs/subvolumes/1469205fdc91db3597b67b91e2b45569c90796d5ff77ecfba8a79223d52842fb
                                 /mnt2/ROOT
                                 /boot/grub2/x86_64-efi
                                 /var
                                 /usr/local
                                 /tmp
                                 /srv
                                 /root
                                 /home
                                 /opt
                                 /boot/grub2/i386-pc
                                 /.snapshots
                                 /
sdb      8:16   0   5.5T  0 disk
sdc      8:32   0   5.5T  0 disk
sdd      8:48   0   1.8T  0 disk /export/XXX_Backup
                                 /export/YYY_Backup
                                 /mnt2/XXX_Backup
                                 /mnt2/YYY_Backup
                                 /mnt2/Backups
sde      8:64   0   5.5T  0 disk

Well, short of blowing away the pool and re-establishing a new one with both working disks, that might be your only option at this point. I do hope that you have been making backups of the data on that pool. Unless @phillxnet or @Flox have any other suggestions on how to investigate further.

It’s been a while since I tested this scenario, but what I ended up doing, once the pool was mounted degraded, was changing the profile from RAID1 to single, then adding the new disk, then re-raiding back to RAID1.
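For reference, that sequence could look roughly like this on the command line (a sketch, assuming the pool is mounted degraded at /mnt2/media and the new disk is /dev/sdd; device names are examples, not taken from your system):

```shell
# Convert data and metadata to the single profile
# (-f is needed because reducing metadata redundancy must be forced)
btrfs balance start -f -dconvert=single -mconvert=single /mnt2/media
# Add the replacement disk
btrfs device add /dev/sdd /mnt2/media
# Remove the missing device if it is still listed
btrfs device delete missing /mnt2/media
# Convert back to raid1 across both devices
btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt2/media
```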

Sorry I don’t understand the CLI stuff…

:sunglasses:


That’s good information. Unfortunately, it seems that getting the device mounted in degraded mode is the first step that’s not working (yet).