Unable to remove detached disk in pool

magicalyak · August 3, 2019, 3:55pm

I’ve replaced a Btrfs disk, and the old one now shows as a detached member of the pool in the UI. The disk does not appear in the `btrfs fi show` output, and the remove command (UI) reports that there is not enough space to remove it.

    Traceback (most recent call last):
      File "/opt/rockstor/eggs/gunicorn-19.7.1-py2.7.egg/gunicorn/workers/sync.py", line 68, in run_for_one
        self.accept(listener)
      File "/opt/rockstor/eggs/gunicorn-19.7.1-py2.7.egg/gunicorn/workers/sync.py", line 27, in accept
        client, addr = listener.accept()
      File "/usr/lib64/python2.7/socket.py", line 202, in accept
        sock, addr = self._sock.accept()
    error: [Errno 11] Resource temporarily unavailable

![image|690x429](upload://4QCltBDL8AHgIoN1yvZy5U45N5G.png)

    [root@rocky ~]# btrfs fi show
    Label: 'rockstor_rockstor' uuid: 3e3b17e7-2490-484f-9c8a-f402b2a51517
        Total devices 1 FS bytes used 14.54GiB
        devid 1 size 122.66GiB used 18.06GiB path /dev/md125

    Label: 'tv' uuid: 76b16cb2-0f13-401d-8395-c408cfc0fdfe
        Total devices 4 FS bytes used 6.11TiB
        devid 1 size 3.64TiB used 3.06TiB path /dev/sdc
        devid 2 size 3.64TiB used 3.06TiB path /dev/sdi
        devid 3 size 3.64TiB used 3.06TiB path /dev/sdh
        devid 4 size 3.64TiB used 3.06TiB path /dev/sdb

    Label: 'backup' uuid: 8ac79908-cb09-4568-bfc8-b0fd377dcf15
        Total devices 1 FS bytes used 439.69GiB
        devid 1 size 3.64TiB used 479.02GiB path /dev/sdg

    Label: 'movies' uuid: c77c9722-7a5d-458c-bb1f-c077a950771d
        Total devices 4 FS bytes used 5.40TiB
        devid 7 size 3.64TiB used 2.71TiB path /dev/sdm
        devid 8 size 3.64TiB used 2.71TiB path /dev/sdk
        devid 9 size 3.64TiB used 2.71TiB path /dev/sdj
        devid 10 size 3.64TiB used 2.71TiB path /dev/sde

    [root@rocky ~]# btrfs scrub status /mnt2/tv
    scrub status for 76b16cb2-0f13-401d-8395-c408cfc0fdfe
        scrub started at Fri Aug 2 16:40:54 2019 and finished after 08:18:55
        total bytes scrubbed: 12.22TiB with 0 errors
    [root@rocky ~]#

magicalyak · August 3, 2019, 3:57pm

phillxnet · August 4, 2019, 11:07am

@magicalyak Hello again.

Did you do this via btrfs commands at the cli or via the Web-UI?

I’m assuming you are using latest stable here by the way: 3.9.2-48.

Although the ‘not enough space’ report is not related to the core cause of this issue, it is the ‘blocker’ in this case, and we have an issue for it here:

github.com/rockstor/rockstor-core

Incorrect size calculation while removing disk from disk pool

opened 10:37AM - 14 Apr 18 UTC

closed 02:10PM - 09 Jul 19 UTC

niu-lin

Use case is: I have two disks, both are 2.7TB on a RAID1 setup, so actual space to store data is 2.7TB. Used space is 1.8T (on each disk), free space allowed for data is 0.9TB. Now I add a 3rd disk of 2.7TB to the pool, effectively increases total usable disk space to 4.05TB, and free space allowed for data is 2.25TB. The balance process started, and I canceled it at around 15% on the command line. Then I use the "Resize pool" function to delete the newly added disk, but rockstor gives me the error like "the operation is not supported, because the pool size will shrink by 2.7TB, however, free size is only 2.25TB, there's no enough disk space to move data". Actually though the pool is going to shrink by 2.7TB, the actual used space on the 3rd disk is only a few hundred GB, which is OK to spread to the other 2 disks, so btrfs should allow for that. To confirm it, I started the device delete operation on the command line using `btrfs device delete xxx`, and btrfs accepted the operation and finished it without errors. The conclusion is the calculation and protection by rockstor is incorrect. To calculate whether there's enough free space to store the data from the deleted device, we need to base it on the used size of the device-to-be-deleted not the total size of it.

which was addressed in the now merged pr:

github.com/rockstor/rockstor-core

pool resize disk removal unknown internal error and no UI counterpart. Fixes #1722

rockstor:master ← phillxnet:1722_pool_resize_disk_removal_unknown_internal_error_and_no_UI_counterpart

opened 05:46PM - 27 Jan 19 UTC

phillxnet

+871 -414

Fix disk removal timeout failure re "Unknown internal error doing a PUT .../remove" by asynchronously executing 'btrfs dev remove'.

The pool_balance model was extended to accommodate for what are arbitrarily named (within Rockstor) 'internal' balances: those automatically initiated upon every 'btrfs dev delete' by the btrfs subsystem itself. A complication of 'internal' balances is their invisibility via 'btrfs balance status'. An inference mechanism was thus constructed to 'fake' the output of a regular balance status so that our existing Web-UI balance surfacing mechanisms could be extended to serve these 'internal' variants similarly.

The new state of device 'in removal' and the above mentioned inference mechanism required that we now track and update devid and per device allocation. These were added as disk model fields and surfaced appropriately at the pool details level within the Web-UI.

Akin to regular balances, btrfs dev delete 'internal' balances were found to negatively impact Web-UI interactivity. This was in part alleviated by refactoring the lowest levels of our disk/pool scan mechanisms. In essence this refactoring significantly reduces the number of system and python calls required to attain the same system wide dev / pool info and simplifies low level device name handling. Existing unit tests were employed to aid in this refactoring. Minor additional code was required to account for regressions (predominantly in LUKS device name handling) that were introduced by these low level device name code changes.

Summary:

- Execute device removal asynchronously.
- Monitor the consequent 'internal' balances by existing mechanisms where possible.
- Only remove pool members pool associations once their associated 'internal' balance has finished.
- Improve low level efficiency/clarity re device/pool scanning by moving to a single call of the lighter get_dev_pool_info() rather than calling the slower get_pool_info() btrfs disk count times; get_pool_info() is retained for pool import duties as its structure is ideally suited to that task. Multiple prior temp_name/by-id conversions are also avoided.
- Improve user messaging re system performance / Web-UI responsiveness during a balance operation, regular or 'internal'.
- Fix bug re reliance on "None" disk label removing a fragility concerning disk pool association within the Web-UI.
- Improve auto pool labeling subsystem by abstracting and generalising ready for pool renaming capability.
- Improve pool uuid tracking and add related Web-UI element.
- Add Web-UI element in balance status tab to identify regular or 'internal' balance type.
- Add devid tracking and related Web-UI element.
- Use devid Disk model info to ascertain pool info for detached disks.
- Add per device allocation tracking and related Web-UI element.
- Fix prior TODO: re btrfs in partition failure point introduced in git tag 3.9.2-32.
- Fix prior TODO: re unlabeled pools caveat.
- Add pool details disks table ‘Page refresh required’ indicator keyed from devid=0.
- Add guidance on common detached disk removal reboot requirement (only affects older kernels).
- Remove a low level special case for LUKS dev matching (mapped devices) which affected the performance of all dev name by-id look-ups.
- Add TODO re removing legacy formatted disk raid role pre openSUSE move.
- Update scan_disks() unit tests for new 'path included' output.
- Address TODO in scan_disks() unit tests and normalise on pre-sort method.

Fixes #1722

And by way of a trivial application of the added per device allocation: Fixes #1918 "Incorrect size calculation while removing disk from disk pool"

@suman Ready for review.

Please note that this pr assumes the prior merge of:

- "regression in unit tests - environment outdated since 3.9.2-45. Fixes #1993" #1994 (Fixes unit tests)
- "pin python-engineio to 2.3.2 as recent 3.0.0 update breaks gevent. Fixes #1995" #1996 (Fixes basic build fail)
- "Implement Add Labels feature for already-installed Rock-Ons. Fixes #1998" #1999 (has a prior storageadmin db migration 0007_auto_20181210_0740.py - I’m trying to keep our migrations path simple)

Testing: All existing osi and btrfs unit tests were confirmed to pass prior to and post pr (given #1994) however as indicated above the scan_disks() unit tests required modification but only to accommodate the new behaviour introduced in scan_disks() where we request from lsblk all device paths. From the osi unit tests point of view this was a cosmetic change in test data, and no functional changes were made bar a trivial robustness improvement by way of an existing TODO. Many of the system configurations used to originally generate the osi unit test data were also tested in their install instance counterparts (ie bios raid system disk, LUKS, btrfs in partition, etc) and were also used during development to help ensure minimal regression. A full functional test on real hardware was also conducted over multiple cycles of removing (and re-adding a post 'wipefs -a' disk where appropriate). These tests are detailed in the comments below and indicate expected behaviour in both legacy CentOS and openSUSE (Tumbleweed in this case) installs.

Caveats: Our keying from devid = 0 (for 'Page refresh required' UI element) may cause confusion during a disk replace (as yet unimplemented: see issue #1611 ) as it is understood that currently within btrfs one of the two disks involved during a 'btrfs replace start ...' operation is temporarily assigned a devid of 0. The cited issue can address this as and when needed.

Which also in turn addresses a bug we had in removing drives in the first place (i.e. as per the pull request’s title).
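As a rough illustration of the devid and per device allocation tracking mentioned in the pull request, the sketch below parses `btrfs fi show` output of the kind posted at the top of this thread. It is not Rockstor’s code: the names `parse_fi_show`, `to_bytes`, and `SAMPLE` are made up for this example, and it assumes Python 3 and the plain-text layout seen in this thread.

```python
import re

# Patterns for the two line types of interest in `btrfs fi show` output.
LABEL_LINE = re.compile(r"Label:\s+(?P<label>.+?)\s+uuid:\s+(?P<uuid>\S+)")
DEVICE_LINE = re.compile(
    r"devid\s+(?P<devid>\d+)\s+size\s+(?P<size>\S+)"
    r"\s+used\s+(?P<used>\S+)\s+path\s+(?P<path>\S+)"
)
UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


def to_bytes(text):
    """Convert a size such as '3.64TiB' into an approximate byte count."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(B|KiB|MiB|GiB|TiB)", text)
    if match is None:
        raise ValueError("unrecognised size: %r" % text)
    number, unit = match.groups()
    return int(float(number) * UNITS[unit])


def parse_fi_show(output):
    """Return {label: {'uuid': ..., 'devices': {devid: {...}}}}."""
    pools = {}
    current = None
    for line in output.splitlines():
        label = LABEL_LINE.search(line)
        if label:
            # Strip straight or curly quotes around the label.
            current = label.group("label").strip("'\u2018\u2019")
            pools[current] = {"uuid": label.group("uuid"), "devices": {}}
            continue
        device = DEVICE_LINE.search(line)
        if device and current is not None:
            pools[current]["devices"][int(device.group("devid"))] = {
                "path": device.group("path"),
                "size_bytes": to_bytes(device.group("size")),
                "used_bytes": to_bytes(device.group("used")),  # allocation
            }
    return pools


# Sample taken from the 'tv' pool output earlier in this thread.
SAMPLE = """\
Label: 'tv' uuid: 76b16cb2-0f13-401d-8395-c408cfc0fdfe
Total devices 4 FS bytes used 6.11TiB
devid 1 size 3.64TiB used 3.06TiB path /dev/sdc
devid 2 size 3.64TiB used 3.06TiB path /dev/sdi
devid 3 size 3.64TiB used 3.06TiB path /dev/sdh
devid 4 size 3.64TiB used 3.06TiB path /dev/sdb
"""

pools = parse_fi_show(SAMPLE)
for devid, dev in sorted(pools["tv"]["devices"].items()):
    print(devid, dev["path"], dev["used_bytes"])
```

A disk that the database still lists as a pool member but that has no devid entry in this parsed output would be a candidate ‘detached’ disk, which matches the situation reported above.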

This pr will be included in the pending 3.9.2-49 release. But it is quite substantial and changes a number of disk and pool management / reporting mechanisms, so we would do best to approach this issue once that code has been released. The issues re the detached disks are really only cosmetic, and hopefully the new code will allow you to remove the drive by the normal Web-UI means anyway.

So in short, let’s circle back around to this one once that code is out. It may be that the new code doesn’t deal with this case ‘after the fact’, but we can address that once we are there. But it should, going forward, at least be a little more intelligent re disk removal and size.
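The size question raised in the linked issue can be sketched as follows. This is only an illustration of the two comparisons described there, not Rockstor’s implementation: the function names are hypothetical, the check ignores raid profile details, and btrfs itself has the final say on whether a removal can proceed. The figures come from the issue, except the allocation on the third disk, which the issue gives only as "a few hundred GB"; 300 GB is assumed here.

```python
def removal_allowed_by_size(pool_free, device_size):
    """Stricter check: treat the whole device as data that must be moved."""
    return device_size <= pool_free


def removal_allowed_by_allocation(pool_free, device_allocated):
    """Check proposed in the issue: only allocated space has to be moved."""
    return device_allocated <= pool_free


# Figures from the issue, in GB.
POOL_FREE_GB = 2250         # free space reported for the three-disk pool
DEVICE_SIZE_GB = 2700       # total size of the disk to be removed
DEVICE_ALLOCATED_GB = 300   # assumption: "a few hundred GB" in the issue

print("by total size:", removal_allowed_by_size(POOL_FREE_GB, DEVICE_SIZE_GB))
print(
    "by allocation:",
    removal_allowed_by_allocation(POOL_FREE_GB, DEVICE_ALLOCATED_GB),
)
```

With these numbers the size-based check refuses the removal while the allocation-based check permits it, which is the discrepancy the issue reports.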

The preservation of detached disks (read: previously attached disks) is by design but can bite us back occasionally. And that newer code changes a bunch of stuff about how we track such removals and ended up removing a whole bunch of inefficiencies we had built up over time.

Thanks again for the report and let’s pick this up again post 3.9.2-49 release. Assuming you are running a Rockstor rpm based install in this case, that is.

Hope that helps.

magicalyak · August 4, 2019, 12:12pm

I did use the btrfs replace command on the CLI. The UI gave me some error that said it wouldn’t do it. I’m fine waiting until the next release; the error/issue is cosmetic as far as I can tell. I reset the stats on the pool, which cleared the disk error, and since that detached disk had been removed from btrfs it cleared. I’ve had times where this btrfs ‘not enough space’ check bites me even though I thought there was enough space. It’s 4 x 4TB drives (I have two pools of this configuration with RAID10).

I toyed with using pgadmin and editing the database but I’d much rather wait since I risk having to reinstall especially if I don’t want to mess up quotas.

Yes on using latest stable.

Glad to hear you appear to agree on this being cosmetic. I wish I had followed the advice of rescan, swap drive, rescan, replace. I’ll see if we have an opportunity to call that out better in the docs, unless it’s there and I missed it. Would it be prudent to ask for a scan prior to disk operations (perhaps a question to the user, defaulting to yes)?

Off-topic alert: Did you look at Stratis in RHEL8 as an option? This may be worth a look before going to SLES and the updated btrfs, as it is the FS Red Hat decided to build from the ground up after bailing on btrfs.

phillxnet · August 4, 2019, 1:23pm

@magicalyak Re:

Stratis is not a filesystem but a set of management tools to help with easier management of LVM (I think) and mdraid, and it uses XFS as its actual file system. XFS does not have data checksumming, but it has recently, in filesystem time, picked up (by default) metadata checksumming. So just a rework / wrapper around existing technologies really. Which is more than welcome, as we have had all these disparate technologies / layers around for ages and had to work them together ‘by hand’ for far too long now. But Rockstor is focused very much on btrfs from the ground up, and as such we have a single upstream project that does both disk and file system management. Also Stratis is very early days, although granted it stands on the shoulders of far more established technologies. I’m at least of the opinion that once we get our openSUSE move sorted we should be in a better position to benefit from the now maturing nature of btrfs.

Thanks for the pointer, and yes, Stratis looks like a fantastic project, but it is not for Rockstor given its use of XFS and the consequent limitations therein. I like the idea of a single ‘entity’ to do both disk and file management, and so far there is only ZFS and btrfs to fit that bill, with btrfs being the newcomer but preserving the block pointer re-write capability that was originally planned for ZFS but has since been abandoned.

Also, I think we have enough on our plate as is.

Yes, Red Hat only ever had btrfs as a technology preview, so it was never actually supported. It was never much of an investment for them, and once they no longer had a staff member who worked on btrfs it made sense for them to not see it through to support. However, openSUSE / SUSE are very much invested in btrfs, at least for their root filesystem, so we should be in better shape once we inherit their massively backported kernels (Leap variants) and plain modern Tumbleweed kernels. As such it’s no surprise that the btrfs developers themselves, a few employed by SUSE, are considering using the Tumbleweed kernels for their CI system: