Rockstor 3.9.2-3 shares unmounted after reboot

I’ve been running a Rockstor machine for a month on the development updates. I obtained a key for the stable branch the other day, ran a yum update to install the stable branch, and rebooted. Rockstor is currently 3.9.2-3 stable. I was looking for files under /mnt2 to work on some rockons, and that’s when I noticed that the shares were not mounting after reboot. Earlier, after the update, I had to restart the docker service because the rockon service couldn’t start after reboot; restarting the docker service allowed the rockon service to start from the Web UI.

Any thoughts on fixing the Rockstor install so shares mount automatically? I’d rather not find rockons failing to come back online after a system reboot, or need the CLI that often after setup. Are there cases where turning on some dependent service, repairing the filesystem, running a maintenance script, or some disk pool rebuild utility would fix this? Luckily I was able to use /opt/rockstor/bin/mnt-share to manually mount the shares, although I’ll have to see what happens over the next few days when I try rebooting for a third time since the issue started.

I’m not sure if the following is related; maybe I need to do something other than just mount the volumes, or maybe some other service needs to be on. I have 24.91 GB of free space on the volumes for the mariadb service, yet the following message about storage capacity being fully utilized appears when trying to start the mariadb service manually in the docker container.

```
linuxserver-mariadb:/# service mysql start
df: /config/databases/.: No such file or directory
 * /etc/init.d/mysql: ERROR: The partition with /config/databases is too full!
```

Then, after restarting the mariadb rockon, the service started up fine, so that issue with mariadb may be unrelated.

The same thing has happened to me today as well. I had to temporarily shut the server down (gracefully) to rewire the network, and upon starting it back up all shares showed as unmounted. After a reboot (instead of a shutdown and startup) the shares are still unmounted … which obviously makes Rockstor unusable at this time :slight_smile:

In the interim, can I force a mount via terminal?

Thank you.

Try enabling btrfs quotas. In my case that got all the shares mounted, but the Web UI shows 0 bytes usage.
Perhaps a bug?

Hope that helps

Ulrich,
yes, that helped. Now I just shouldn’t reboot the system until it’s clear what’s missing in the startup scripts :slight_smile: (which could be as simple as the enablement of quotas).

I did:
btrfs quota enable /

and then:
systemctl restart rockstor-pre
systemctl restart rockstor
systemctl restart rockstor-bootstrap

In my case, it shows the correct btrfs usage for the items that are on the RAID pool (hence on btrfs); the OS drive shares show 0 bytes, but that would be expected in my case.

Dan

@Horatio Hello again.
@Hooverdan and @upapi_rockstor A belated welcome to the Rockstor community to you both.

I haven’t gotten to the bottom of this whole issue just yet, but I have managed to reproduce the failure to mount, along with its obvious repercussions, caused by the exception raised during the attempt to establish share usage when quotas are disabled. That behaviour is obviously ‘over the top’ and non-robust, so I have started by submitting the following pr, which is ready for review:

https://github.com/rockstor/rockstor-core/pull/1868

which was created against the issue I raised earlier from all of your most detailed reports:

https://github.com/rockstor/rockstor-core/issues/1867

Too right. As above, there is a pending pr to address at least this element by making Rockstor more robust in the ‘quota not enabled’ state. My initial suspicion is that we have a compound issue here, whereby recent changes have combined to surface this problem via what appears to be an unrelated update, i.e. 3.9.2-3, which simply moved from legacy docker to docker-ce.

Obviously we have more to do here. We are in the midst of changing our update strategies, and I think this has led to these breaks leaking through to the stable channel. We expect to set up a new dev and testing routine which should help guard against such things.

If any of the more ‘brave’ participants (read: happy to edit code and take a risk) would like to try out the changes indicated in the above pr, that may help confirm the fix for at least the unmounted-at-boot element of this issue. Please don’t attempt this if you are not familiar with editing Python files and happy to try a reboot and take a risk.

Sorry for the inconvenience folks and thanks for helping to support Rockstor’s development.

Hope that helps and please be advised that our dev release model is in flux but soon to settle again:

And as always @suman’s (project lead) word on the matter supersedes mine as I may have this all wrong thus far.

I didn’t have this level of luck unfortunately (running the built-in scripts); I had to re-mount everything manually :frowning: and on top of this I won’t be able to restart until this issue is fixed, or I’ll have to do it allllll over again…

I had activated automatic updates, so really it’s my own fault for allowing Rockstor to update itself given the many ongoing issues this project has had with stability. I have since turned this off, but now I will have to remember to yum update manually every few weeks…

I really hope the stable channel one day is actually as described. As someone who contributed financially to gain access to stability, it really concerns me that these issues make it into the stable branch. As a software developer in my day-to-day work, I suggest far better testing should be implemented in this project.

Hi @phillxnet,
thank you for your kind reply and for opening this issue.
Will we get an announcement when this issue is fixed? I’ve seen that we are now at 3.9.2-4, but I assume that update does not include the fix, does it?

Hi Brian
how did you manage to re-mount everything? And do you see the correct btrfs usage in the Web UI on the Shares page? I still see 0 bytes there :face_with_raised_eyebrow:

@phillxnet

No, not as yet. The above-noted pending pull request makes a minor change that fixes Rockstor’s failure to mount shares when quotas are disabled. But the strange thing is that Rockstor doesn’t ever disable quotas itself, and once set they are supposed to be sticky, ie:

btrfs quota enable /mnt2/rock-pool/

where rock-pool is the pool in question on my reproducer here.

With the pending fix, which is only a part-line addition and so easy to try out if you fancy (noting the above caveats), a Rockstor system will no longer fail to mount all its shares on boot, even if quotas on the associated pool are not enabled (i.e. they didn’t stick).

That is, once set by the above command a pool should retain its quota-enabled status, but in my reproducer instance, and yours by the looks of it, the quotas-enabled state is not sticking, and Rockstor depends on quotas being enabled. We enable them upon pool creation or import, and they are then expected to stay enabled as we don’t disable them: which is the current quandary.

ie if one creates a pool outside of Rockstor on the command line and mounts it:

mkfs.btrfs -f -L external-pool /dev/disk/by-id/virtio-serial-3
mount /dev/disk/by-id/virtio-serial-3 /mnt3/test/

then attempts a quota related command:

btrfs quota rescan /mnt3/test/
ERROR: quota rescan failed: Invalid argument

It fails as there is no quota enabled state for that pool.

But once we enable quotas and try the same command again it works:

btrfs quota enable /mnt3/test/
btrfs quota rescan /mnt3/test/
quota rescan started

Now if we reboot and remount this pool as it is not managed by Rockstor:

mount /dev/disk/by-id/virtio-serial-3 /mnt3/test/

we then find that the quota enabled state has stuck:

btrfs quota rescan /mnt3/test/
quota rescan started

as evident from the above quota rescan not complaining. Ergo the quandary.

Hope that helps to identify the nature of the problem, i.e. quotas are not sticking on some installs, and the only thing that was updated on the Rockstor side was docker, so that’s strange.

As per:

we use quotas to discern usage, hence if they are disabled that’s a problem and we default to returning 0. Once you have enabled quotas you can enact a rescan, as in the example commands above, and you should get more informative results. Note that it can take a while to rescan quotas; you can see the progress via:

btrfs quota rescan -s /mnt3/test/
no rescan operation in progress

In this example it’s finished.

Hope that helps.
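Purely to illustrate the ‘default to returning 0’ behaviour described above, here is a minimal Python sketch. It is emphatically not the actual code in the pending pr; the helper names and the `--raw` parsing are assumptions made for this example only.

```python
# Illustrative sketch only: fall back to 0 usage when "btrfs qgroup show"
# fails because quotas are not enabled, rather than raising and aborting.
import subprocess


def run_command(cmd):
    """Run cmd, returning (stdout, stderr, returncode) without raising."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.stdout, proc.stderr, proc.returncode


def share_usage_bytes(mnt_pt):
    """Best-effort usage via qgroups; 0 when quotas are disabled."""
    out, err, rc = run_command(["btrfs", "qgroup", "show", "--raw", mnt_pt])
    if rc != 0:
        if "quotas not enabled" in err:
            # Quotas were switched off behind Rockstor's back (by docker-ce,
            # it seems): degrade gracefully instead of aborting share mounts.
            return 0
        raise RuntimeError("btrfs qgroup show failed: {}".format(err))
    # Sum the referenced-bytes column of the level-0 qgroups as a rough figure.
    total = 0
    for line in out.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith("0/"):
            total += int(fields[1])
    return total


if __name__ == "__main__":
    print(share_usage_bytes("/mnt2/rock-pool"))
```

The point being that a missing quota state degrades to a 0-usage reading rather than an exception that takes the share mounts down with it.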

The mount-on-boot fix for a pool without sticky quotas is detailed in the cited pull request, and if one then manually enables quotas after the fresh boot and does a rescan (as in the above commands) you may be up and running again. This at least is more palatable than doing all the mounts manually.

I may have missed something obvious here but I still have my reproducer system and am working on this and related issues currently.

@suman @Horatio @Hooverdan @upapi_rockstor @absentbri @mluetz
OK, I’ve made a little progress with the following:

We have the following docker-ce related report:

Not exactly broken or unrecoverable but it does look like we have our culprit.

@suman In the above upstream issue the reporter commented out a code line within docker-ce and avoided this problem entirely. A bit heavy-handed, but it does look convincing re our current quota issues.


You need to manually mount from the command line if nothing else works, so what I did is the following (logged in as root, or just sudo each of these):

```
# enable quotas or system isn't happy
btrfs quota enable /

# get your block devices that relate to your btrfs structure
lsblk

# list the subvolumes in a certain pool
btrfs subvolume list /mnt2/[Pool Mount]

# mount each subvolume one-by-one
mount -t btrfs -o subvol=[subvol from above command] [blk device that relates from lsblk] /mnt2/[subvol-related folder from above command]
```

You might need to do this with the /mnt2/home directory that is probably in your rockstor_rockstor filesystem…
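For what it’s worth, if you have many shares, a rough Python sketch along these lines could automate those per-subvolume mounts. The POOL_DEVICE and POOL_MOUNT values are placeholders I’ve made up for the example; check them against your own lsblk output and pool name, and run it as root at your own risk.

```python
# Rough, illustrative helper that automates the manual steps above: list the
# subvolumes of a pool and mount each one under /mnt2. Adjust the two
# placeholder values before use.
import os
import subprocess

POOL_DEVICE = "/dev/sdb"          # any member device of the pool, from lsblk
POOL_MOUNT = "/mnt2/rock-pool"    # the pool's own mount point


def list_subvols(pool_mnt):
    """Return subvolume paths reported by 'btrfs subvolume list'."""
    out = subprocess.run(
        ["btrfs", "subvolume", "list", pool_mnt],
        capture_output=True, text=True, check=True,
    ).stdout
    # Lines look like: "ID 258 gen 42 top level 5 path share-name"
    return [line.split("path ", 1)[1] for line in out.splitlines() if "path " in line]


def mount_subvols():
    for subvol in list_subvols(POOL_MOUNT):
        target = "/mnt2/{}".format(os.path.basename(subvol))
        os.makedirs(target, exist_ok=True)
        result = subprocess.run(
            ["mount", "-t", "btrfs", "-o", "subvol={}".format(subvol),
             POOL_DEVICE, target]
        )
        status = "ok" if result.returncode == 0 else "failed (already mounted?)"
        print("mount {} at {}: {}".format(subvol, target, status))


if __name__ == "__main__":
    mount_subvols()
```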

My btrfs usage column is wrong but I really couldn’t care less as it does not mean near as much to me as being able to access my data and have my dockers running.

@absentbri Hello again.

Not necessarily. An alternative, and hopefully easier, approach (at least for the time being) is available in the following thread, by way of manually applying a potential pending improvement.

@upapi_rockstor in that thread has confirmed this as a work around with the same caveat as you mention, ie the btrfs-usage column.

Hope that helps.

EDIT: As from version 3.9.2-5 the above referenced patch is already applied / included.

I was just pointing out what I did to get my system up and running again as I was asked by another user. My way might not be the best but when you need access to your data, you get creative with your workarounds.

Any word on how this bug made it into the “stable” release?

People like me have manually disabled quotas because they are unstable. A better solution would be to decouple rockstor from quotas so it has no dependence on them. Stability is more important than knowing available disk space (if that’s the main reason this hasn’t been done already).

@grizzly Hello again

Yes, agreed. And that is what I am currently working on. The first step towards this is in the now-released 3.9.2-5, which includes the fix from the issue and pull request first referenced in this thread; i.e. Rockstor no longer fails to at least mount shares when quotas are disabled, such as we found was happening inadvertently, and most unexpectedly, with that new version of docker-ce. My current sub-goals should lead towards the resolution of the following issue:

https://github.com/rockstor/rockstor-core/issues/1592


@Horatio @Hooverdan @upapi_rockstor @absentbri @mluetz @grizzly

Just an update: the pending patch referenced earlier in this thread, re shares unmounted on boot when quotas are disabled, is now included and released in version 3.9.2-5, so things are just that little bit easier now.

Bit by Bit.

I noticed that when trying to add a share to a volume that existed before the issue started, the following error is displayed.

```
Traceback (most recent call last):
  File "/opt/rockstor/src/rockstor/rest_framework_custom/generic_view.py", line 41, in _handle_exception
    yield
  File "/opt/rockstor/src/rockstor/storageadmin/views/share.py", line 170, in post
    pqid = qgroup_create(pool)
  File "/opt/rockstor/src/rockstor/fs/btrfs.py", line 700, in qgroup_create
    qid = ('%s/%d' % (QID, qgroup_max(mnt_pt) + 1))
  File "/opt/rockstor/src/rockstor/fs/btrfs.py", line 687, in qgroup_max
    o, e, rc = run_command([BTRFS, 'qgroup', 'show', mnt_pt], log=True)
  File "/opt/rockstor/src/rockstor/system/osi.py", line 121, in run_command
    raise CommandException(cmd, out, err, rc)
CommandException: Error running a command. cmd = /sbin/btrfs qgroup show /mnt2/RAID1-A. rc = 1. stdout = ['']. stderr = ["ERROR: can't list qgroups: quotas not enabled", '']
```

I’m going to attempt that fix as soon as I can because, while I can restart the machine and the rockons restart in their last state and are live, I can’t seem to add additional shares to my data pool ATM.

@Horatio

Glad you are sorted thus far at least.

Already included in 3.9.2-5.

This is a known limitation when quotas are not enabled:

The mount fix now included in 3.9.2-5 does not address this; it only addresses all shares being left unmounted on boot when quotas are disabled.

If you execute the following as root:

btrfs quota enable /mnt2/RAID1-A

where “RAID1-A” is your affected pool name, you should again be able to create shares in that pool.

I am currently working on better behaviour re quotas disabled;

but the core issue here is that quotas should never have been disabled in the first place (by docker-ce, it seems), which has thus exercised our existing poor behaviour when quotas are disabled.

Hope that helps.
