Drive btrfs error


Brief description of the problem

(Please note: if there is not a means to recover the array, I can reinstall Rockstor and start over; I'm not in production yet. However, if this isn't recoverable, then I'm hesitant to proceed with btrfs.)

A drive in a RAID1 array is failing (or rather, the file system has an error). From /var/log/messages:

Apr 14 16:18:50 rockstor kernel: BTRFS info (device sda): use lzo compression
Apr 14 16:18:50 rockstor kernel: BTRFS info (device sda): disk space caching is enabled
Apr 14 16:18:50 rockstor kernel: BTRFS info (device sda): has skinny extents
Apr 14 16:18:50 rockstor kernel: BTRFS error (device sda): failed to read chunk tree: -5
Apr 14 16:18:50 rockstor kernel: BTRFS error (device sda): open_ctree failed
Apr 14 16:19:51 rockstor kernel: BTRFS info (device sda): use lzo compression
Apr 14 16:19:51 rockstor kernel: BTRFS info (device sda): disk space caching is enabled
Apr 14 16:19:51 rockstor kernel: BTRFS info (device sda): has skinny extents
Apr 14 16:19:51 rockstor kernel: BTRFS error (device sda): failed to read chunk tree: -5
Apr 14 16:19:51 rockstor kernel: BTRFS error (device sda): open_ctree failed

etc., ad nauseam.

The pool is unmounted. I can click the pool in the UI, but 'Resize' is not an option (I had hoped to go there to remove the failing drive from the array).
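One thing worth knowing here: the device the kernel names in a "BTRFS error (device ...)" line is just the member it happened to open the filesystem through, not necessarily the member that is actually failing. As a minimal sketch (using one of the log lines above as a sample; on a live box you would read /var/log/messages instead), you can pull out which device the kernel is complaining about like this:

```shell
# Minimal sketch: extract the device name from a BTRFS error line.
# On a live system, read /var/log/messages instead of this sample string.
log_excerpt='Apr 14 16:18:50 rockstor kernel: BTRFS error (device sda): failed to read chunk tree: -5'
printf '%s\n' "$log_excerpt" | sed -n 's/.*BTRFS error (device \([a-z0-9]*\)).*/\1/p'
```

On the live system, `sed -n 's/.*BTRFS error (device \([a-z0-9]*\)).*/\1/p' /var/log/messages | sort -u` would list every device name mentioned in such errors, but remember that it still only tells you how the kernel refers to the filesystem, not which member is broken.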

Detailed step by step instructions to reproduce the problem

This morning, I rebooted to try installing Rockstor 4. I noted that /dev/sda was not listed as a potential install target (that’s OK in that I did not intend to install to it anyway). On seeing an error, I booted back to Rockstor 3 and this is the current state.

Web-UI screenshot

(Screenshot: Rockstor disk failure)

Error Traceback provided on the Web-UI

Traceback (most recent call last):
  File "/opt/rockstor/src/rockstor/rest_framework_custom/generic_view.py", line 41, in _handle_exception
    yield
  File "/opt/rockstor/src/rockstor/storageadmin/views/pool_balance.py", line 47, in get_queryset
    self._balance_status(pool)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/utils/decorators.py", line 145, in inner
    return func(*args, **kwargs)
  File "/opt/rockstor/src/rockstor/storageadmin/views/pool_balance.py", line 72, in _balance_status
    cur_status = balance_status(pool)
  File "/opt/rockstor/src/rockstor/fs/btrfs.py", line 1064, in balance_status
    mnt_pt = mount_root(pool)
  File "/opt/rockstor/src/rockstor/fs/btrfs.py", line 253, in mount_root
    run_command(mnt_cmd)
  File "/opt/rockstor/src/rockstor/system/osi.py", line 121, in run_command
    raise CommandException(cmd, out, err, rc)
CommandException: Error running a command. cmd = /bin/mount /dev/disk/by-label/TestPool /mnt2/TestPool -o ,compress=lzo. rc = 32. stdout = ['']. stderr = ['mount: wrong fs type, bad option, bad superblock on /dev/sda,', '       missing codepage or helper program, or other error', '', '       In some cases useful info is found in syslog - try', '       dmesg | tail or so.', '']

Resolved:
Chalk this up to PEBKAC.
When installing Rockstor 4, I chose to install to a disk that was part of the pool. The answer dawned on me while reading the *Data Loss Prevention and Recovery in RAID1 Pools* document. The message log complains about /dev/sda, but the real failure is on /dev/sde, another member of the pool and also the drive I chose as the target for the Rockstor 4 install. That explains why I have an error, and why the error coincides with the attempted Rockstor 4 install.
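For anyone hitting the same symptom: one way to confirm that a pool member has been overwritten is `btrfs filesystem show`, which lists every device it can find for the filesystem and flags members it cannot. A minimal sketch of reading that output (the sample below is a mock-up, not captured from this system; the uuid and sizes are placeholders):

```shell
# Mock-up of `btrfs filesystem show` output when one RAID1 member has been
# overwritten. The uuid and sizes are placeholders, not from this system.
show_output='Label: TestPool  uuid: 00000000-0000-0000-0000-000000000000
        Total devices 2 FS bytes used 10.00GiB
        devid    1 size 232.88GiB used 12.00GiB path /dev/sda
        *** Some devices missing'
# Count how many lines report missing members.
printf '%s\n' "$show_output" | grep -c 'missing'
```

If a member really is gone, btrfs can usually still mount a RAID1 pool read-write once with `mount -o degraded`, after which `btrfs device delete missing <mountpoint>` drops the absent member, but check the Rockstor recovery documentation before attempting this on real data.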


Glad you solved it, @wdc!

This is exactly the kind of easy-to-make mistake that we're striving to limit as much as we can. If I may, would it be possible to know the size of your drives, and whether anything in particular led to the confusion on your end, or was it simply an unfortunate distraction at the time?
There is indeed a setting in our kiwi recipe that sets a size threshold for the installer to show or hide a disk (any disk above this threshold will not be listed as a choice for the OS install), and we're trying to find the best value for this setting. It's unfortunately not easy to find one that will satisfy everybody, but we can try to improve as much as possible nonetheless.
Note that this setting can be changed "on the fly" as a boot option (and I believe we have that documented in our how-to), but a good default is always the best option.

Thanks in advance for any feedback you might have on that!


The 500 setting is good from my perspective. This was a matter of running the install before finishing my first cup of coffee, and then posting this before I really had time to search the docs, etc. (work gets in the way). Since it was a test install, my fuzzy brain said, "just use the little 74GB drive".

Rockstor:~ # lsblk
NAME     MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda        8:0    0 232.9G  0 disk
sdb        8:16   0 931.5G  0 disk
└─sdb1     8:17   0 931.5G  0 part
sdc        8:32   0 223.6G  0 disk
├─sdc1     8:33   0   500M  0 part
├─sdc2     8:34   0 222.6G  0 part
└─sdc3     8:35   0   517M  0 part
sdd        8:48   0 465.8G  0 disk
sde        8:64   0  74.5G  0 disk
├─sde1     8:65   0     2M  0 part
├─sde2     8:66   0    33M  0 part /boot/efi
├─sde3     8:67   0     2G  0 part [SWAP]
└─sde4     8:68   0  72.5G  0 part /
sdf        8:80   0 232.9G  0 disk
└─sdf1     8:81   0 232.9G  0 part
sdg        8:96   0 232.9G  0 disk
├─sdg1     8:97   0   500M  0 part
├─sdg2     8:98   0   7.9G  0 part
└─sdg3     8:99   0 224.5G  0 part
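The installer's show/hide behaviour described above can be pictured as a simple size filter over whole disks. A sketch (the 500 GiB threshold and the exact rule are my reading of this thread, not the actual kiwi code), using byte sizes of the kind `lsblk -b -dn -o NAME,SIZE` would print:

```shell
# Sketch of a size-based install-target filter: offer only disks below the
# threshold. The threshold value and rule are assumptions, not the kiwi code.
threshold=$((500 * 1024 * 1024 * 1024))   # 500 GiB in bytes
# Sample `lsblk -b -dn -o NAME,SIZE` style output for three of the disks above.
disks='sda 250059350016
sdb 1000204886016
sde 80026361856'
printf '%s\n' "$disks" | awk -v t="$threshold" '$2 < t { print $1 }'
```

With these sample sizes the filter offers sda and sde but hides the 1 TB sdb, which matches why the little 74GB sde was sitting there as a tempting target.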
