Running the older CentOS-based Rockstor version 3.9.2-57
I had a detached disk showing in the pool and a degraded-pool alert. The stats page for the disk looked OK, so I rebooted the system to see if it would clear. Now I have no access to the pool data in the UI, but I can see the disks in the UI, and I can SSH into the box. The detached removable drive has been there for a while with no problems. Previous attempts to remove it have failed, and I tried again today with no luck: I get an error saying it can't be removed because it's not attached. I have no access to shares from Windows machines on the network, and the OpenVPN Rock-on doesn't appear to be running. Everything was working prior to the reboot, apart from the degraded-pool alert at the top of the page. I should have a replacement hard drive in a few hours.
Pool data is now available in the UI; maybe I wasn't patient enough the first time. Still showing the same GET error at the top of the UI.
Detailed step-by-step instructions to reproduce the problem:
Unknown internal error doing a GET to /api/shares?page=1&format=json&page_size=9000&count=&sortby=name
The tail of rockstor.log provided the following:
[root@jonesville log]# tail rockstor.log
    return func(*args, **kwargs)
  File "/opt/rockstor/src/rockstor/storageadmin/views/command.py", line 348, in post
    import_shares(p, request)
  File "/opt/rockstor/src/rockstor/storageadmin/views/share_helpers.py", line 86, in import_shares
    shares_in_pool = shares_info(pool)
  File "/opt/rockstor/src/rockstor/fs/btrfs.py", line 685, in shares_info
    pool_mnt_pt = mount_root(pool)
  File "/opt/rockstor/src/rockstor/fs/btrfs.py", line 525, in mount_root
    'Command used %s' % (pool.name, mnt_cmd))
Exception: Failed to mount Pool(Removable-Backup) due to an unknown reason. Command used ['/usr/bin/mount', u'/dev/disk/by-label/Removable-Backup', u'/mnt2/Removable-Backup']
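To see why that mount is failing, it may help to re-run the same command Rockstor attempted (device label and mount point taken from the traceback above) and then check the kernel log, which usually records the real reason for a btrfs mount failure. A sketch:

```shell
# Re-run the mount Rockstor attempted (paths from the traceback above):
mount /dev/disk/by-label/Removable-Backup /mnt2/Removable-Backup

# The kernel log normally states the actual btrfs failure reason:
dmesg | tail -n 20

# If the drive really is detached, its by-label symlink will be absent:
ls -l /dev/disk/by-label/
```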
I assume I need to remove the drives via the command line to restore the pool, but I'm not sure where to start.
That would still leave you with the non-functional drive in your jones-pool pool. I think you can mount that pool as degraded using the command line, and then use the replace function via the WebUI.
While it has been updated to account for the Built on openSUSE versions, I believe the process of mounting the pool degraded and then attempting the replace (or adding the new disk and then removing the broken disk) still holds true. Ideally, once you are able to mount it as degraded:
Configuration backup via the WebUI, offloaded to a non-Rockstor location
Ideally, back up your pool data to another device (assuming that's what your Removable-Backup drive is for?)
Then deal with getting that pool clean again … and then
move to the latest testing release, or at least the latest stable.
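As a sketch of the degraded mount and replace from the command line (pool name and mount point assumed from this thread; device names are placeholders):

```shell
# Mount the pool degraded and read-write so a replace is possible.
# Note: older kernels may permit only one degraded read-write mount,
# so have the replacement drive ready before mounting.
mount -o degraded /dev/disk/by-label/jones-pool /mnt2/jones-pool

# Identify the missing/failing member and its devid:
btrfs filesystem show /mnt2/jones-pool

# Replace a still-present failing drive (sdOLD/sdNEW are placeholders):
btrfs replace start /dev/sdOLD /dev/sdNEW /mnt2/jones-pool
# If the old drive is missing entirely, use its devid instead, e.g.:
# btrfs replace start 3 /dev/sdNEW /mnt2/jones-pool
btrfs replace status /mnt2/jones-pool
```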
Luckily I just finished a full backup to 2 removable USB drives connected to a Windows client. The one that shows in the pool is too small these days; it's been living there for a while with no issues. Drive 3 being detached, one of the HGDT drives, appears to be the one that caused the error. I'm running so close to full that I'm going to need to add a drive to the pool before I remove one, or I'll have to move to RAID 0, remove the drive, add the new drive, and then convert back to RAID 1.
If your existing pool is pretty full, I would opt, if you can, to add the new drive before removing the other one "officially". I would not mess with the RAID level until you have a healthy pool again.
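The add-then-remove route from the command line might look like this (mount point assumed from this thread; device names are placeholders):

```shell
# Add the new drive first so the pool has enough free capacity:
btrfs device add /dev/sdNEW /mnt2/jones-pool

# Then remove the failing member; btrfs migrates its data to the
# remaining devices as part of the removal:
btrfs device remove /dev/sdOLD /mnt2/jones-pool
# Or, if the drive has already dropped out of the pool:
# btrfs device remove missing /mnt2/jones-pool
```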
The IT gremlins were kind to me today. I powered down, added the new drive, re-seated the errored drive, and everything came back up. I just need to investigate the error alert and figure out whether it's the missing USB drive or not.
If you're not using that USB drive, you could remove that pool via the WebUI … that would give you an immediate answer.
For your working-again pool, it might be useful to do a scrub and a balance to clean out inconsistencies and get the system to spread the load across the new drive as well …
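Assuming the pool is mounted at /mnt2/jones-pool (a placeholder based on this thread), the scrub and balance can be kicked off like so:

```shell
# Scrub verifies checksums and repairs from the good copy where it can:
btrfs scrub start /mnt2/jones-pool
btrfs scrub status /mnt2/jones-pool   # check progress / error counts

# A balance rewrites chunks, spreading data across all members
# including the newly added drive (can take hours on a full pool):
btrfs balance start /mnt2/jones-pool
btrfs balance status /mnt2/jones-pool
```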
With regard to the repair scenarios and removing pools with all missing drives etc., we have addressed quite a few bugs on that front since the CentOS days and 3.9.2-57 (April 2020). Plus the underlying btrfs is way better, especially on our new preferred target of Leap 15.6. Not that helpful to you right now, however. So, as @Hooverdan said re the scrub etc., once you have a healthy main pool again you can approach (time allowing) the transition to our modern openSUSE variant. Note that we have also recently updated the following how-to:
The errors reported on that flaky drive in the main pool are cumulative, i.e. they show all errors encountered since the drive became part of this pool, or since they were last reset. The indicated command there (text above the table) can be used to reset those stats so you can more easily see whether any new errors are being generated.
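For reference, the per-device error counters can be viewed and zeroed like this (mount point assumed from this thread):

```shell
# Show cumulative read/write/flush/corruption/generation error counts:
btrfs device stats /mnt2/jones-pool

# -z prints the stats and then resets the counters to zero, so any
# non-zero values seen afterwards are new errors:
btrfs device stats -z /mnt2/jones-pool
```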
Assuming the USB drive is not connected, and it's the only member of that pool "Removable-Backup", you may have success with deleting the pool. But again, we had more bugs back then in this area!
Well done for stringing along this old CentOS install for so long, by the way. And once that main pool is healthy again, we can approach any issues with its import. But a good scrub beforehand, with your existing kernel and without any custom mount options, can only help.
New drive added, old drive removed, balance running, fingers crossed! Going to ignore the USB drive for a while; even deleting the pool didn't work. CentOS here I come; it's funny how the threat of data loss can change your priorities. Unfortunately I'm out of town for 10 days before I can get to that.