[SOLVED] Kernel and BTRFS-tools upgrade resulting in all shares unmounted and WebUI inaccessible

I know, I know, bleeding edge is never a good idea. But I just had to upgrade my kernel and btrfs-progs again, since I hadn’t had any issues doing so before on my box (still running the CentOS base with version 3.9.2.57).
Interestingly (or scarily), this time it did not go well.

The kernel upgrade itself (to 5.7) with the newest btrfs-progs 5.6.1 seemed to have gone fine (at least no error messages).
However, upon reboot the trouble started. The WebUI is inaccessible and things like the docker service failed (I assume, because the RockOn root is not mounted).

journalctl -xe | grep 'docker' shows that

rockstorw systemd[3578]: Failed at step EXEC spawning /opt/rockstor/bin/docker-wrapper: No such file or directory

and under
systemctl list-units --state=failed

UNIT LOAD ACTIVE SUB DESCRIPTION
● docker.service loaded failed failed Docker Application Container Engine

lsblk shows
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdd 8:48 0 9.1T 0 disk
sdb 8:16 0 9.1T 0 disk
sde 8:64 0 9.1T 0 disk
sdc 8:32 0 9.1T 0 disk
sda 8:0 0 119.2G 0 disk
├─sda2 8:2 0 11.9G 0 part [SWAP]
├─sda3 8:3 0 106.8G 0 part /home
└─sda1 8:1 0 500M 0 part /boot

so, at least the block devices are there.
ls /mnt2/ gives me my shares, but, since nothing has been mounted, they’re obviously empty

btrfs subvolume list /mnt2/4xRAID5
shows me a ton of subvolumes connected to the RockOn root, but not the other shares

Obviously, I couldn’t leave good enough alone… But before I reinstall and try to remember what additional services at the OS level I have created or installed, I wanted to see whether the community can lend me a hand - even if it means “reinstall, you dummy, and don’t do this again until OpenSUSE is ready” :wink:

Thanks!

:-\ …

How about btrfs fi show /? I remember a few old posts on the forum about failure to mount pools in the past, so I wonder what that will report.

Also, how about the usual rockstor service checks?

systemctl status -l rockstor-pre rockstor-bootstrap rockstor

I’m particularly curious about what rockstor-bootstrap has to say.
Similarly, if the latter failed, maybe systemd-analyze critical-chain rockstor-bootstrap can be useful.

well, it shows only one device … which is obviously not good:

btrfs fi show /
Label: 'rockstor_rockstor' uuid: xxxxxxxxx-xxxxxx-xxx-xxx (the actual uuid is shown here)
Total devices 1 FS bytes used 7.56GiB
devid 1 size 106.83GiB used 60.03GiB path /dev/sda3

systemctl status -l rockstor-pre rockstor-bootstrap rockstor

Unit rockstor-pre.service could not be found.
Unit rockstor-bootstrap.service could not be found.
Unit rockstor.service could not be found.

yay! nothing there anymore …

rockstor-bootstrap

-bash: rockstor-bootstrap: command not found

systemd-analyze critical-chain rockstor-bootstrap

Failed to get ID: Unit name rockstor-bootstrap is not valid.
The time after the unit is active or started is printed after the "@" character.
The time the unit takes to start is printed after the "+" character.

I guess, I somehow killed off everything? That is extremely weird …

That is curious, indeed… it almost looks like Rockstor was uninstalled. Do you still have something in /opt/rockstor/? What does yum info rockstor show?


yum info rockstor still shows this:

Name : rockstor
Arch : x86_64
Version : 3.9.2
Release : 57
Size : 17 M
Repo : Rockstor-Stable
Summary : Btrfs Network Attached Storage (NAS) Appliance.
URL : http://rockstor.com/
License : GPL
Description : Software raid, snapshot capable NAS solution with built-in file
: integrity protection. Allows for file sharing between network
: attached devices.

and /opt/rockstor/ contains these directories still:
drwxr-xr-x 1 root root 92 Nov 14 2016 certs
drwxr-xr-x. 1 root root 10 Jun 13 09:45 etc
drwxr-xr-x 1 root root 140 Jun 7 14:32 rockons-metastore
drwxr-xr-x. 1 root root 16 Apr 19 15:47 src
drwxr-xr-x. 1 root root 28 Jun 13 09:45 static
drwxr-xr-x. 1 root root 6 Jun 13 09:45 var

but, of course I don’t remember what is missing here …
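For anyone in the same spot, here is a minimal sketch of a check for files that should be there. The function name report_missing is made up for illustration, and the only paths used are the ones the docker error above complains about; which other files a healthy install contains is not something this sketch knows.

```shell
# Report which of the given paths are missing.
# Prints one "MISSING: <path>" line per absent path and
# returns non-zero if at least one path is absent.
# (report_missing is a hypothetical helper, not part of Rockstor.)
report_missing() {
  rm_missing=0
  for rm_path in "$@"; do
    if [ ! -e "$rm_path" ]; then
      echo "MISSING: $rm_path"
      rm_missing=1
    fi
  done
  return "$rm_missing"
}

# Paths taken from the journalctl error earlier in this thread.
report_missing /opt/rockstor/bin /opt/rockstor/bin/docker-wrapper || true
```

On a box in the state described above, this would be expected to flag the bin directory, which matches the "No such file or directory" error from systemd.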

Ha! @Flox you put me on the right track …
I spun up a VM where I have the OpenSUSE version installed, looked at /opt/rockstor/ there, and saw that things like bootstrap etc. were missing from that directory on my box. I figured that just kicking off a reinstall might work. Alas, it did not.

yum reinstall rockstor

Loaded plugins: changelog, fastestmirror
No Match for argument: rockstor
Loading mirror speeds from cached hostfile

This indeed points to an uninstallation (though I still have no clue why and how that happened). So, I went and ran
yum install rockstor

Loaded plugins: changelog, fastestmirror
Loading mirror speeds from cached hostfile

Dependencies Resolved

================================================================================
Package Arch Version Repository Size

Installing:
rockstor x86_64 3.9.2-57 Rockstor-Stable 17 M

Transaction Summary

Install 1 Package

Total download size: 17 M
Installed size: 85 M
Is this ok [y/d/N]: y

And the installation went just fine.
Then again your suggestion:
systemctl status -l rockstor-pre rockstor-bootstrap rockstor
which shows:

● rockstor-pre.service - Tasks required prior to starting Rockstor
Loaded: loaded (/etc/systemd/system/rockstor-pre.service; enabled; vendor preset: disabled)
Active: inactive (dead)

● rockstor-bootstrap.service - Rockstor bootstrapping tasks
Loaded: loaded (/etc/systemd/system/rockstor-bootstrap.service; enabled; vendor preset: disabled)
Active: inactive (dead)

● rockstor.service - RockStor startup script
Loaded: loaded (/etc/systemd/system/rockstor.service; enabled; vendor preset: enabled)
Active: inactive (dead)

so … systemctl restart rockstor and then
systemctl status rockstor

● rockstor.service - RockStor startup script
Loaded: loaded (/etc/systemd/system/rockstor.service; enabled; vendor preset: enabled)
Active: active (running) since Sat 2020-06-13 14:44:48 PDT; 15s ago
Main PID: 11504 (supervisord)
CGroup: /system.slice/rockstor.service
├─11504 /usr/bin/python2 /opt/rockstor/bin/supervisord -c /opt/roc…
├─11507 nginx: master process /usr/sbin/nginx -c /opt/rockstor/etc…
├─11508 /usr/bin/python2 /opt/rockstor/bin/gunicorn --bind=127.0.0…
├─11509 /usr/bin/python2 /opt/rockstor/bin/data-collector
├─11510 /usr/bin/python2 /opt/rockstor/bin/django ztaskd --noreloa…
├─11511 nginx: worker process
├─11512 nginx: worker process
├─11533 /usr/bin/python2 /opt/rockstor/bin/gunicorn --bind=127.0.0…
└─11542 /usr/bin/python2 /opt/rockstor/bin/gunicorn --bind=127.0.0…

Jun 13 14:44:48 rockstorw supervisord[11504]: 2020-06-13 14:44:48,255 CRIT S…g
Jun 13 14:44:48 rockstorw supervisord[11504]: 2020-06-13 14:44:48,255 INFO s…4
Jun 13 14:44:49 rockstorw supervisord[11504]: 2020-06-13 14:44:49,259 INFO s…7
Jun 13 14:44:49 rockstorw supervisord[11504]: 2020-06-13 14:44:49,262 INFO s…8
Jun 13 14:44:49 rockstorw supervisord[11504]: 2020-06-13 14:44:49,264 INFO s…9
Jun 13 14:44:49 rockstorw supervisord[11504]: 2020-06-13 14:44:49,265 INFO s…0
Jun 13 14:44:51 rockstorw supervisord[11504]: 2020-06-13 14:44:51,554 INFO s…)
Jun 13 14:44:51 rockstorw supervisord[11504]: 2020-06-13 14:44:51,554 INFO s…)
Jun 13 14:44:54 rockstorw supervisord[11504]: 2020-06-13 14:44:54,559 INFO s…)
Jun 13 14:44:54 rockstorw supervisord[11504]: 2020-06-13 14:44:54,559 INFO s…)
Hint: Some lines were ellipsized, use -l to show in full.

and voila … WebUI is back, all shares are mounted, SMB works, docker/Rockons are running, my HDHomerun DVR service is back up …

and I’m still running: [image] and the corresponding btrfs tools

Thanks again @Flox, I knew if somebody talked me off the ledge it would be ok :slight_smile:

I will continue to look into my logs, but I cannot explain why I somehow uninstalled rockstor during a kernel upgrade. This has certainly never happened to me before. I guess, if one has a running system, the best way to waste your Saturday is to try to muck with it “just for 5 minutes” …


Glad you got it sorted!

Yes, that is very curious indeed! I wonder if the yum logs could indicate what happened. I know of a few dependencies that will uninstall rockstor if you uninstall them (docker or postgresql might fit the bill, for instance), but I thought a proper uninstall would have removed all files and folders in /opt/rockstor/, and not just /bin… unless of course those directories were empty or contained only files that were added to them independently of Rockstor.

Whatever happened, very puzzling.
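A rough sketch of how one could pull removals out of the yum log. It assumes the log lives at /var/log/yum.log (the usual CentOS 7 location) and that removals are written as lines of the form "<date> Erased: <package>"; the function name erased_packages is invented for this example.

```shell
# List packages that yum recorded as erased, one per line.
# Assumption: removal entries look like
#   "Jun 13 09:45:21 Erased: name-version.arch"
# (erased_packages is a hypothetical helper name.)
erased_packages() {
  ep_logfile="${1:-/var/log/yum.log}"
  # Keep only lines containing " Erased: " and strip everything before the package.
  sed -n 's/^.* Erased: //p' "$ep_logfile"
}

# Only run against the default log if it is actually readable on this system.
if [ -r /var/log/yum.log ]; then
  erased_packages
fi
```

If rockstor shows up in that list with a timestamp matching the btrfs-progs upgrade, that would point to the upgrade transaction as the cause.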

I hope everything is going well with the newer btrfs-progs and kernel versions. You will probably see problems with rendering the btrfs scrub status output in the pool pages due to formatting changes in newer btrfs-progs versions, but know that these were fixed recently in the “Built on openSUSE” flavors.


well, I guess, somewhere during the installation of the latest btrfs-progs I also uninstalled the “standard” btrfs-progs that came with the rockstor image… that was the difference compared to my previous updates … apparently I did not read that it would do additional stuff I didn’t want it to… and, boom, rockstor was uninstalled:

Jun 13 09:45:21 Erased: rockstor-3.9.2-57.x86_64
Jun 13 09:45:22 Erased: snapper-libs-0.2.8-4.el7.x86_64
Jun 13 09:45:22 Erased: btrfs-progs-4.12-0.rockstor.x86_64

this still doesn’t explain why not everything related to Rockstor was removed, but it explains what happened and reminds me again to always read ALL messages where yum is involved …
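To make the "read ALL messages" part harder to skip, something like the following could be used as a pre-check. It is only a sketch: list_removals is a made-up helper, and it assumes yum's transaction table marks removals with section headers that start with "Removing" (for example "Removing:" and "Removing for dependencies:") and ends the table at "Transaction Summary".

```shell
# Print the names of packages that a yum transaction would remove,
# reading yum's transaction output on stdin.
# Assumptions: removal sections begin with a line starting with "Removing",
# other sections begin with "Installing" or "Updating", and the table ends
# at "Transaction Summary". (list_removals is a hypothetical helper.)
list_removals() {
  awk '
    /^Removing/ { in_section = 1; next }
    /^(Installing|Updating|Transaction Summary)/ { in_section = 0 }
    in_section && NF > 0 { print $1 }
  '
}

# Possible usage, assuming this yum version supports --assumeno
# (answers "no" at the prompt, so nothing is changed):
#   yum --assumeno remove btrfs-progs 2>&1 | list_removals
```

If rockstor appears in that output, the transaction would take Rockstor with it, and it is time to stop and reconsider before answering "y".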

Oh well, at least I was able to recover and learned yet again something new. Thanks for reading and helping.
