Seems that during install something got unsynchronized.
If I enable the NFS trigger in the WebGui, the nfs daemon stops. If I disable it, it starts.
To get it back in sync, I disabled it in both the WebGui and the CLI, and now it’s working fine.
I do think it’s a bug: the status should be read from whether the pid is actually running, rather than by issuing blind commands.
By open do you mean enabled? We generally use systemd commands to start / stop services, and a variety of mechanisms, depending on the service, to assess the current state. Again, it’s mostly systemd commands, but from your report some of these look to be switching whatever service state is found, which is obviously not correct. Given we haven’t had this reported before, it’s a little strange, as there is little difference in these regards between our CentOS base and the ‘Built on openSUSE’ endeavour.
From memory the restore service state mechanism is meant to only enable a service if that service was previously enabled when the config was saved. The service state restore mechanism was last updated on 3.9.2-51:
So taking a look at that issue and its linked pull request should lead to the related code. And a glance at the following file:
should give an indication of the triggers used to inform the Web-UI switch reports of service status. The intention is that the system is the single point of truth for service state, and we update the db whenever the page is refreshed with the state as reported at the lowest level by the above file’s various mechanisms.
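To make the "system as the single point of truth" idea concrete, here is a minimal Python sketch. It is not Rockstor's actual code: the names `systemd_is_active` and `refresh_service_states`, and the plain dict standing in for the db, are assumptions made purely for illustration. The only real interface used is `systemctl is-active`.

```python
import subprocess
from typing import Callable, Dict


def systemd_is_active(unit: str) -> bool:
    """Return True if systemd reports the unit as active.

    `systemctl is-active --quiet UNIT` exits with status 0 only when the
    unit is active, so the exit code is all we need.
    """
    result = subprocess.run(
        ["systemctl", "is-active", "--quiet", unit], check=False
    )
    return result.returncode == 0


def refresh_service_states(
    db: Dict[str, bool],
    probe: Callable[[str], bool] = systemd_is_active,
) -> Dict[str, bool]:
    """Overwrite every recorded state with what the system reports.

    `db` is a hypothetical stand-in for the stored service table: it maps
    a unit name to the last known on/off state.
    """
    for unit in list(db):
        db[unit] = probe(unit)
    return db
```

The point of the design is that the stored value is never trusted on its own: every refresh replaces it with the live answer, so a stale record can only survive until the next page load.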
Definitely worth taking a look if you fancy, as it’s all pretty readable and we do have some switching functions in there that may be the root of this behaviour. I.e. we may be inadvertently switching something when we should be explicitly enabling / disabling. But again, we haven’t had reports of this, at least in more recent years. So this out-of-sync state may be, as you say, an artefact of the more recent restore mechanism.
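As an illustration of why switching is riskier than explicitly enabling / disabling, here is a small Python sketch. Both functions are hypothetical models written for this example, not anything taken from services.py; they only track a boolean "is the service on" value.

```python
def apply_by_toggle(actual_on: bool, recorded_on: bool, want_on: bool) -> bool:
    """Flip the service whenever the recorded state differs from the request.

    Returns the new actual state. This is only correct while recorded_on
    matches actual_on.
    """
    if recorded_on != want_on:
        return not actual_on
    return actual_on


def apply_explicitly(actual_on: bool, want_on: bool) -> bool:
    """Start or stop outright: the result never depends on the prior state."""
    return want_on
```

With a record that is out of sync (service really running, but recorded as off), `apply_by_toggle(True, False, True)` returns `False`: asking to enable the service stops it, which is the inverted behaviour described in the report. `apply_explicitly` cannot invert in this way, whatever the record says.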
Thanks again for the reports. Can’t look into these issues just yet as am deep in some other stuff but we now have these reports which is great. Keep us informed of any progress and do take a look at that services.py file if you suspect stuff originates at that level.
@shocker Also do take a look at @Flox’s pull request linked to from the issue I’ve highlighted re the service state restoration as it is, as per the @Flox norm, excellently presented and tested. So there is a ton of pointers there for just this kind of down-the-road investigation. I.e. an explanation of how this feature works and the like. It may be we missed something that’s causing your inverted service reporting that just didn’t register with either of us at the time and didn’t show up in our testing prior to me merging it.
Likely to be useful to take a read if you end up looking into this.
The Samba toggle in the WebGui under Services was showing that it was enabled. Checking the status on openSUSE with service smb status, I saw that it was disabled. I manually enabled smb from the CLI, then played with the web toggle on/off, and everything went back to normal.
The problem is that I cannot reproduce this now as it was the first state after migration.
I have checked the code above, and the behaviour I saw is really strange. Is something imported into the db together with the service state stored in the backup config?
I will also try to play with this later on, as I’m not using smb and I can test it.
Thanks for this pointer, @phillxnet… I quickly looked at my recent PR on the matter (for the config restore bit), and I am actually using the current db info to check whether the service is currently ON or OFF. I don’t know whether that helps explain this report, but we may want to switch to using service_status() from system.services instead. See relevant snippet below:
I unfortunately won’t have time to look into that further for the next few days as I’m still out of any free time from work, but I wanted to point that out before forgetting. If you agree on switching to using systemctl status instead of the db info, I’ll work on it as soon as I can have some free time, unless somebody else beats me to it.
I finally could find some time to look into that one and think I found the origin of the problem. If I’m correct, it seems to be yet another illustration of small differences between distributions that make or break things for us. In this case, it all seems to relate to how systemd files are dealt with.
TL;DR: CentOS uses a symlink nfs.service --> nfs-server.service, whereas Leap 15.2 uses an alias service nfs.service to refer to the nfs-client. As rockstor uses systemctl status nfs to check the status of the NFS service, it was seen as OFF and surfaced as such in the webUI (which is correct as the nfs client was indeed off). Checking for the status of nfs-server instead should fix that.
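A minimal Python sketch of the kind of fix suggested here: translate the name used by the UI into the unit that should actually be queried before building the systemctl command. The `UNIT_FOR_SERVICE` table and both function names are assumptions for illustration only; the single mapping it contains (nfs to nfs-server) is the one proposed above.

```python
from typing import List

# Hypothetical mapping from the service name used in the UI/db to the
# systemd unit to query. Only the nfs entry comes from this discussion.
UNIT_FOR_SERVICE = {"nfs": "nfs-server"}


def unit_for(service: str) -> str:
    """Return the systemd unit to query for a given service name."""
    return UNIT_FOR_SERVICE.get(service, service)


def status_command(service: str) -> List[str]:
    """Build the systemctl status command for a service."""
    return ["systemctl", "status", unit_for(service)]
```

Naming nfs-server directly avoids relying on what nfs.service happens to point at on a given distribution; services without an entry in the table are queried under their own name, as before.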
For reference, here’s the related code that checks for the status of the nfs service:
This, in the end, runs the systemctl status nfs command, which returns the following in CentOS:
[root@rockdev ~]# systemctl status nfs
● nfs-server.service - NFS server and services
Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; disabled; vendor preset: disabled)
Active: inactive (dead) since Mon 2020-05-04 05:19:21 PDT; 32s ago
Process: 3596 ExecStopPost=/usr/sbin/exportfs -f (code=exited, status=0/SUCCESS)
Process: 3593 ExecStopPost=/usr/sbin/exportfs -au (code=exited, status=0/SUCCESS)
Process: 3590 ExecStop=/usr/sbin/rpc.nfsd 0 (code=exited, status=0/SUCCESS)
Main PID: 2116 (code=exited, status=0/SUCCESS)
May 04 05:18:16 rockdev systemd: Starting NFS server and services...
May 04 05:18:16 rockdev systemd: Started NFS server and services.
May 04 05:19:21 rockdev systemd: Stopping NFS server and services...
May 04 05:19:21 rockdev systemd: Stopped NFS server and services.
in Leap 15.2:
rockdev:~ # systemctl status nfs
● nfs.service - Alias for NFS client
Loaded: loaded (/usr/lib/systemd/system/nfs.service; disabled; vendor preset: disabled)
Active: inactive (dead)
This is due to the following symlink in CentOS:
[root@rockdev ~]# ls -lhtr /usr/lib/systemd/system/nfs*
-rw-r--r-- 1 root root 567 Aug 8 2019 /usr/lib/systemd/system/nfs-utils.service
-rw-r--r-- 1 root root 1.1K Aug 8 2019 /usr/lib/systemd/system/nfs-server.service
-rw-r--r-- 1 root root 395 Aug 8 2019 /usr/lib/systemd/system/nfs-mountd.service
-rw-r--r-- 1 root root 330 Aug 8 2019 /usr/lib/systemd/system/nfs-idmapd.service
-rw-r--r-- 1 root root 375 Aug 8 2019 /usr/lib/systemd/system/nfs-config.service
-rw-r--r-- 1 root root 413 Aug 8 2019 /usr/lib/systemd/system/nfs-client.target
-rw-r--r-- 1 root root 350 Aug 8 2019 /usr/lib/systemd/system/nfs-blkmap.service
lrwxrwxrwx 1 root root 19 Mar 7 16:43 /usr/lib/systemd/system/nfs-rquotad.service -> rpc-rquotad.service
lrwxrwxrwx 1 root root 18 Mar 7 16:43 /usr/lib/systemd/system/nfs-idmap.service -> nfs-idmapd.service
lrwxrwxrwx 1 root root 17 Mar 7 16:43 /usr/lib/systemd/system/nfs-lock.service -> rpc-statd.service
lrwxrwxrwx 1 root root 16 Mar 7 16:43 /usr/lib/systemd/system/nfs-secure.service -> rpc-gssd.service
lrwxrwxrwx 1 root root 18 Mar 7 16:43 /usr/lib/systemd/system/nfs.service -> nfs-server.service
lrwxrwxrwx 1 root root 17 Mar 7 16:43 /usr/lib/systemd/system/nfslock.service -> rpc-statd.service
In Leap 15.2, by contrast, nfs.service is a regular unit file that acts as an alias for the NFS client:
rockdev:~ # cat /usr/lib/systemd/system/nfs.service
Description=Alias for NFS client
# The systemd alias mechanism (using symlinks) isn't rich enough.
# If you "systemctl enable" an alias, it doesn't enable the
# This service file creates a sufficiently rich alias for nfs-client
# (which is the canonical upstream name)
# "start", "stop", "restart", "reload" on this will do the same to nfs-client.
# "enable" on this will only enable this service, but when it starts, that
# starts nfs-client, so it is effectively enabled.
# nfs-server.d/nfsserver.conf is part of this service.
Of course, all we care about in our case is the nfs-server, not the client. This is why you observed a discrepancy between the nfs-server status on the machine and what the webUI was reporting.
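One way to spot this kind of mismatch is to compare the unit that was asked for with the unit named in the first line of the systemctl status output. The Python sketch below does that by parsing the header line only; `reported_unit` and `UNIT_SUFFIXES` are hypothetical names, and the parsing assumes the header layout seen in the two outputs quoted above.

```python
from typing import Optional

# Unit types we expect to see in the header line (an assumption; extend as needed).
UNIT_SUFFIXES = (".service", ".target", ".socket")


def reported_unit(status_output: str) -> Optional[str]:
    """Return the unit named on the first non-empty line of `systemctl status`."""
    for line in status_output.splitlines():
        if not line.strip():
            continue
        for token in line.split():
            if token.endswith(UNIT_SUFFIXES):
                return token
        return None  # header line had no recognisable unit name
    return None
```

Fed the CentOS output above, this returns nfs-server.service even though nfs was requested, showing that the symlink resolved to the server. Fed the Leap 15.2 output, it returns nfs.service, so the answer describes the client alias and not the server.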