Rockstor-hdparm.service has failed during/after upgrade

@Hooverdan Well done on getting that sorted.

That’s the peculiarity there, and the bit we need to work out. There is no dependency on that service from within Rockstor’s other services:

  • the first to run (rockstor-pre.service):

https://github.com/rockstor/rockstor-core/blob/master/conf/rockstor-pre.service.in

  • the middle service (rockstor.service):

https://github.com/rockstor/rockstor-core/blob/master/conf/rockstor.service.in

  • the last to run (rockstor-bootstrap.service):

https://github.com/rockstor/rockstor-core/blob/master/conf/rockstor-bootstrap.service.in
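For anyone wanting to double-check the ‘no dependency’ claim against the three files above, here is a rough sketch that scans unit file text for ordering / requirement directives naming a given unit. The function name, the set of directives checked, and the sample text are my own illustration (the sample is not the content of the real rockstor.service.in), so treat it as a starting point only:

```python
# Directives that express a relationship to another unit.
DEPENDENCY_KEYS = {
    "Requires", "Requisite", "Wants", "BindsTo",
    "PartOf", "After", "Before", "Conflicts",
}


def find_unit_references(unit_text, target):
    """Return (directive, value) pairs in unit_text that name target."""
    hits = []
    for raw in unit_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and section headers such as [Unit].
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key in DEPENDENCY_KEYS and target in value.split():
            hits.append((key, value.strip()))
    return hits


# Hypothetical sample text, used only to exercise the function.
sample = """\
[Unit]
Description=Example middle service
After=rockstor-pre.service
Requires=rockstor-pre.service

[Service]
ExecStart=/bin/true
"""

print(find_unit_references(sample, "rockstor-hdparm.service"))
print(find_unit_references(sample, "rockstor-pre.service"))
```

This only looks at directives written in the file itself; it will not see drop-in overrides or relationships declared from the other side. On a live system, `systemctl list-dependencies --reverse rockstor-hdparm.service` is probably the more direct check.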

So it’s strange that, in the exact same state, this hdparm service file’s outdated entries are fine during boot but end up upsetting / blocking the update process.

Yes, this service file was my way to instantiate hdparm settings (spin-down and APM settings) that were power cycle safe. And I had purposefully kept it independent of the other services as these changes can be made live so they are nothing to do with the other services actually.

Yes, that file is not only a systemd service to instantiate the settings requested (post filter / sanity check of sorts via the Web-UI) but it also serves as the database of entered settings, assuming those settings were deemed ‘doable’ by said filters. Hence the strict formatting warning within it. It turns out it’s pretty much impossible to read some of these settings from a drive once they have been requested. So I just stashed the requested settings in that file and read them back into the Web-UI to ‘know’ what they were requested to be. I think, from memory, Synology used a custom kernel module to ‘stash’ some of this info; I used this file.
The original pull request for those wanting to dig into this anomaly is as follows:

And contains an exposition of how it works. And @Hooverdan, your linked issue was a minor addition to the hdparm service file to have it wait for the udev system, so its by-id names would have been created by the time it runs.

Exactly, and I’ve just found the following:

which looks to be the ‘tidy up after yourself’ hdparm issue, so we just need to work out why we have this update failure when our hdparm service has rogue entries. I’m still not convinced we have an absolute causal relation here but something is happening.
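To help with a reproducer, here is a minimal, read-only sketch for spotting entries in the hdparm service file whose by-id device no longer exists. It assumes each setting is an `ExecStart=` line that names a `/dev/disk/by-id/...` path; that layout, the function names, and the sample content are assumptions on my part, so check them against your own file first:

```python
import os


def by_id_targets(service_text):
    """Return by-id device paths named on ExecStart lines, in file order."""
    targets = []
    for raw in service_text.splitlines():
        line = raw.strip()
        if not line.startswith("ExecStart="):
            continue
        for token in line.split():
            if token.startswith("/dev/disk/by-id/"):
                targets.append(token)
    return targets


def stale_targets(service_text, exists=os.path.exists):
    """Return by-id paths that the exists() check reports as missing."""
    return [path for path in by_id_targets(service_text) if not exists(path)]


# Hypothetical content: the real file's layout is an assumption here.
sample = """\
[Service]
Type=oneshot
ExecStart=/usr/sbin/hdparm -q -S240 /dev/disk/by-id/ata-EXAMPLE_PRESENT
ExecStart=/usr/sbin/hdparm -q -B127 /dev/disk/by-id/ata-EXAMPLE_REMOVED
"""

# Pretend only one of the two devices is attached.
present = {"/dev/disk/by-id/ata-EXAMPLE_PRESENT"}
print(stale_targets(sample, exists=lambda path: path in present))
```

The `exists` parameter is injectable only so this can be tried without real drives; on a real machine the default `os.path.exists` does the check. It deliberately never writes to the file, given the strict formatting warning within it.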

A failure in this hdparm service should not affect any others really. So we have a fragility here and the cause is a likely candidate for another issue, once we know what that cause is of course.

Agreed, I think this was waiting in the wings until the next update came along. There is a slight time difference between an update that has a database migration and one that does not. But from the pull request for that database change we have the normal sequence that is used for these Web-UI initiated rockstor updates:

systemctl stop rockstor
systemctl stop rockstor-pre
/usr/bin/find /opt/rockstor -name "*.pyc" -not -path "*/eggs/*" -type f -delete
zypper --non-interactive refresh
zypper --non-interactive install rockstor
systemctl start rockstor

with the fairly obvious yum counterparts if we find we are running on a ‘rockstor’ linux (CentOS) base.
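As a rough illustration of that sequence, and not the actual code in system/pkg_mgmt.py, the following sketch builds the command list for either package manager and prints it as a dry run. The function name is hypothetical, and the yum branch in particular is my guess at the ‘obvious counterpart’ rather than what Rockstor really runs:

```python
import shlex

ROCKSTOR_ROOT = "/opt/rockstor"


def update_commands(pkg_manager="zypper"):
    """Return the update sequence as a list of argv lists (nothing is run)."""
    cmds = [
        ["systemctl", "stop", "rockstor"],
        ["systemctl", "stop", "rockstor-pre"],
        # Remove stale bytecode, leaving anything under eggs/ alone.
        ["/usr/bin/find", ROCKSTOR_ROOT, "-name", "*.pyc",
         "-not", "-path", "*/eggs/*", "-type", "f", "-delete"],
    ]
    if pkg_manager == "zypper":
        cmds.append(["zypper", "--non-interactive", "refresh"])
        cmds.append(["zypper", "--non-interactive", "install", "rockstor"])
    elif pkg_manager == "yum":
        # Assumption: a plain non-interactive install; the real command may differ.
        cmds.append(["yum", "-y", "install", "rockstor"])
    else:
        raise ValueError("unsupported package manager: %s" % pkg_manager)
    cmds.append(["systemctl", "start", "rockstor"])
    return cmds


# Dry run: print each command instead of executing it.
for cmd in update_commands("zypper"):
    print(shlex.join(cmd))
```

Printing rather than executing keeps this safe to try anywhere. For a reproducer, one could run the printed commands by hand on a test install with rogue hdparm entries in place and note which step fails.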

That code is in system/pkg_mgmt.py:

and the distro specific commands are defined earlier:

So from that I don’t yet see why this failure in an essentially unrelated service is tripping up our main rockstor services.

Good.

Thanks for another great report by the way. But bar the above, I don’t have any more light to shed than yourself or @Flox, and you are both probably more aware of systemd issues than me anyway. So:

So that would be great, but we need a reproducer for this. There should be enough info here hopefully to generate one though. And once we have asserted the root cause we can get a focused issue opened and a resolution sorted.

Hope that helps and well done all.

As an aside, I’m pretty sure one can simply delete this service file and then, ideally:

systemctl daemon-reload

and all that will be lost is the spin-down and APM settings for all current and past drives. Resetting these within the Web-UI should then re-assert the hdparm service this file configures. The file is the ‘source of truth’ from Rockstor’s perspective. And this workaround may be safer than editing the file anyway.
