Network UPS Tools (NUT) - Rockstor's implementation

This is a wikified post documenting how NUT is implemented in Rockstor and is intended to serve as a live document on the same; user documentation is available at UPS / NUT Setup. Please update this developer document when you make any changes to the NUT instantiation within Rockstor’s code. Suggestions on how this might be done better are also very welcome, but please make sure that they are clearly indicated as such; this helps to maintain the relationship between this document and the code. The idea is to aid in on-boarding and to provide a place where those with domain specific knowledge can contribute ideas on how things might be done better. Please see the About the Wiki category for overall guidance.

Reason to add nut

Nut is a collection of services (akin to smbd & nmbd in samba) that manages communication with one or more UPS (Uninterruptible Power Supply) devices and typically shuts a machine down when a UPS is in both the “on battery” and “low battery” states. This ensures that the Rockstor system and its managed shares will not suffer a sudden power outage, hence reducing the exposure to risk of the associated file systems.

Influential factors in the implementation

As NUT runs as a system / systemd service it has been added to the list of services and the existing service class hierarchy in code. As such it appears along with the likes of snmp etc and employs the same config model as each of the existing services: a spanner icon that opens the set-up page. It was considered to treat it like tasks, ie for it to have its own area / section, but the original author thought this would be unnecessary and would complicate the options presented to the user. Nut can communicate with ups devices directly via usb/serial, over the network via snmp / ipmi, or via another nut server / service on another machine. Hence it has several modes of operation and these modes greatly affect the required number of configuration entries. The selected mode is therefore used to try and minimize the number of fields presented.

User story and implementation summary

Nut appears as just another service (JAS) so does not complicate the existing web interface. It is off by default and will prompt the user for configuration if they attempt to switch it on without an existing config. The configuration screen is available via the spanner icon next to the service name on the Services page; the service is named UPS-NUT as not everybody is familiar with NUT’s ubiquitous use in linux for UPS monitoring. Once a valid configuration has been submitted the service on/off switch should either enable the service or, if it finds the configuration invalid, fail to activate and invite the user to configure / re-configure. Note that this process can take up to 25 seconds, especially with serial port attached units. This can lead to issues with the GUI timing out and consequently misreporting the current state (a browser refresh sorts this). Direct USB connected UPSs don’t appear to suffer from this.

The GUI entered config is instantiated in NUT’s config files upon leaving the config screen via the “Submit” button. However the existing state of nut is unaffected, beyond NUT’s own re-reading of the files, until the service is turned off and on again. This state switch serves to both disable and stop the nut systemd services or alternatively enable and start the same services. This is relevant to those configuring NUT outside of the GUI.
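The state switch described above can be sketched roughly as follows. This is only an illustration, not Rockstor's actual services.py code: the function name, the mode to unit mapping and the nut-server unit name are assumptions (only nut-monitor is named in this document), and the function merely builds the systemctl command lists rather than running them.

```python
# Sketch only: the nut-server unit name and this mapping are assumptions,
# not taken from Rockstor's services.py.
UNITS_BY_MODE = {
    "standalone": ["nut-server", "nut-monitor"],
    "netserver": ["nut-server", "nut-monitor"],
    "netclient": ["nut-monitor"],  # a netclient only needs the monitor side
}


def systemctl_commands(mode, switch_on):
    """Return the systemctl invocations for one flip of the service switch.

    Switching on enables and then starts each unit; switching off stops and
    then disables each unit.
    """
    units = UNITS_BY_MODE[mode]
    verbs = ["enable", "start"] if switch_on else ["stop", "disable"]
    return [["systemctl", verb, unit] for verb in verbs for unit in units]
```

Each returned list could be handed to subprocess.run() by the caller; keeping command construction separate from execution makes the on/off logic easy to test without touching systemd.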

Implementation detail

In CentOS 7 nut packages are not available, so the buildout build system originally added epel-release as a dependency and added the epel nut packages in the “rpm-deps-nut” task. This proved problematic with upgrades, so the required packages nut and nut-client (and in the future nut-xml) were copied over from epel to Rockstor’s repository and declared as dependencies of the Rockstor package. This may facilitate Rockstor maintaining a newer version of NUT than that currently found in epel (2.7.2-3.el7); obviously any changes will be made public and hopefully upstreamed in compliance with the GPL.

System level tools used

All interaction with the NUT subsystem is via its configuration files and systemd. Healthy status is inferred from systemd feedback and its success in starting the nut-monitor service. This is the lowest common denominator of all modes and so should be revised, especially in light of changes to the systemd service files, as NUT is still in active transition from initd to systemd; see services.py.

The Configuration Process

The Bootstrap interface initially provides some degree of sanity checking and establishes some defaults; on submission the config values are available as key value pairs in a Python dictionary. Where possible these key value elements are a direct drop in for the nut config files. The main configuration process (nut.py) involves comparing this config dictionary to multiple other dictionaries to effect on the fly translation of entered or default config components to required configuration file entries. The result is a dictionary of ordered dictionaries where the top level index is the config file that the top level values (which are ordered dictionaries of config entries) relate to. The standard Rockstor method of adding all configuration to the end of system config files is maintained, along with an identifiable header warning of their ‘machine written’ nature. The exception to this is that, to avoid multiple entries of a config item, if an existing entry in a config file is found to conflict with post processed config items it will be remarked out, wherever it may be in the config file. This ensures that existing config items do not interfere with the applied configuration where they share a key. Otherwise all NUT defaults are observed as the standard upstream config files are used, with the exception of enhanced schedule config files that are dropped wholesale over the existing default schedule config files. The source of the schedule configs can be found in the /opt/rockstor/conf directory. Every config submission will result in a fresh overwrite of these schedule config files.
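As a rough illustration of the “remark out conflicts, then append under a header” approach described above, here is a minimal sketch. It is not the code in nut.py: the function names, the header text and the example entries are all hypothetical, and it works on a list of lines rather than on real files.

```python
import re
from collections import OrderedDict

# Placeholder header text; the real Rockstor header wording is not reproduced here.
HEADER = "# Rockstor NUT config, machine written (illustrative header)"


def key_of(line):
    """Return the leading directive name of a config line, or None for blanks and remarks."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    # nut.conf uses KEY=value, most other NUT files use "KEY value"
    return re.split(r"[\s=]+", stripped, maxsplit=1)[0]


def apply_entries(existing_lines, entries, sep=" "):
    """Remark out conflicting lines, then append the header and the new entries."""
    out = []
    for line in existing_lines:
        if key_of(line) in entries:
            out.append("# " + line)  # conflict: remark out wherever it sits
        else:
            out.append(line)
    out.append(HEADER)
    out.extend(f"{key}{sep}{value}" for key, value in entries.items())
    return out


# Top level keys are config files; values are ordered config entries.
config = {
    "/etc/ups/nut.conf": OrderedDict([("MODE", "standalone")]),
    "/etc/ups/upsd.conf": OrderedDict([("LISTEN", "localhost")]),
}
```

For example, apply_entries(["MODE=none"], config["/etc/ups/nut.conf"], sep="=") remarks out the existing MODE line and appends the header followed by MODE=standalone. A real implementation would also have to cope with entries left over from a previous submission, which this sketch does not attempt.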

Configuration beyond the limits of the GUI

Since configuration is not re-written until the Submit button is pressed, advanced configuration that lies outside the current scope of Rockstor’s NUT config component can be effected by simply hand editing after the Rockstor header in the file concerned and not re-submitting the GUI entered config. This will allow for service management and feedback via the GUI to continue. As usual a full state cycle will be required for this config to be adopted.

Configuration saved state

All nut config is established from code default or previously selected values held in Bootstrap’s database. The nut configuration is simply re-written from these values, changed or otherwise, on each config submission. The existing nut files are not referenced for content, except where that content conflicts; they are only written to.

Redundant config

  • The nut-scanner program returns the driver as “nutclient” when returning a scan of a nutserver attached remote ups. Current config reproduces this config when in nutclient mode, even though this mode needs no entry in ups.conf for the UPS it addresses and only requires the install of nut-monitor (not always broken out as a separate package; depends on distro).
  • In situations where no mode is selected (should not be possible via GUI config) we set the mode to none. The original author is unsure if this is honoured by the current systemd managed services. This serves mainly as an indication of failed config submission, as it should not be possible to enter this via the GUI. Further API entered configuration enhancements are due.

Nut modes

  • None (default) - nut is disabled - this mode is not offered as disabling the service makes this selection redundant.
  • standalone - locally attached ups serving only Rockstor - only required fields will be shown
  • netserver - like standalone + nut server for other nut netclient machines - only required fields will be shown
  • netclient - contacts a nut server on the network (only needs upsmon) - only required fields will be shown

As indicated above once the overall mode has been selected only the appropriate other fields are displayed.

Missing config fields

NUT is very versatile and so can be challenging to configure, but in many situations only a minimum of configuration fields are required. The initial implementation has tried to keep the number of fields down to a minimum. Further fields could be added but their value / scope of relevance should be weighed against the inevitable complexity that additional fields bring, both to the configuration and the user. N.B. it is intended to reduce the fields further with auto creation of the nut user and pass in standalone mode: see Future Enhancements.

A selection from the following is offered for configuration depending on the overall mode selected:-

  • monitor mode - master or slave
  • ups name - the name of the ups eg “rockups”, this may relate to a predefined name on a remote nut server (single word requirement)
  • ups description - friendly name for the ups eg “Rockstor UPS”
  • ups driver - the driver required to drive the named ups eg apcsmart or usbhid-ups
  • ups port - often auto with usb but otherwise a specific serial port or
    * ups auxiliary options ie cable type but this is infrequently required.
  • nutserver name - the name or ip of a nut server on the network
  • username - nut has its own users ie “monuser” or “testuser”
    * user capabilities - can be fine grained but I think we should start off with either an admin or upsmon drop down pre-select.
  • password - a password to authenticate the user

Automatically selected config entries include:-

  • those pertaining to the scheduler and notify script.
  • the LISTEN directive. N.B. this is not yet adequately tested on systems with more than one network interface. Consider this directive broken until this issue has been tested and if need be resolved / improved. See the Ports used section in this doc for more details.
  • a “upsmon” class of nut user is created regardless of input, this minimizes the capabilities of the user to those required for monitoring purposes only. A safer option given the vintage of this implementation.

NUT config files used:-

  • /etc/ups/nut.conf - specifies the overall mode of the nut system ie netclient, standalone.
  • /etc/ups/ups.conf - specifics of the ups ie driver, port, friendly description of unit
  • /etc/ups/upsd.users - users and their capabilities and passwords
  • /etc/ups/upsd.conf - the config for the daemon ie the ip to listen on eg 127.0.0.1 for no remote access, the maxage of driver data prior to it being considered stale, some certificate settings, max number of connections etc.
  • /etc/ups/upsmon.conf - config of the monitor side of ups ie deadtime, minimum number of ups considered safe, but mainly just a bunch of defaults and our all important monitor line, which encompasses the ups name and location, a nut user and pass, and the mode of this monitor: slave (we are talking to a server) or master (we are a server for this ups, or are standalone with this ups directly connected). Also contains the command to use to shutdown the system.
  • /etc/ups/upssched.conf - config of NUT’s built-in scheduler that is enabled by default in Rockstor, but not by default in NUT.
  • /usr/bin/upssched-cmd - bash script to manage the response to upssched.conf defined events, ie log messages and email alerts.
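To make the “all important monitor line” in upsmon.conf a little more concrete, the following sketch assembles one from the GUI fields listed earlier. The helper name and the example values are hypothetical; the field order follows NUT's documented MONITOR syntax (system, power value, username, password, master or slave), and the power value default of 1 is an assumption suited to a single UPS.

```python
def monitor_line(upsname, host, username, password, mode, power_value=1):
    """Build an upsmon.conf MONITOR directive from the individual config fields."""
    if mode not in ("master", "slave"):
        raise ValueError("monitor mode must be 'master' or 'slave'")
    return f"MONITOR {upsname}@{host} {power_value} {username} {password} {mode}"
```

For a standalone set-up, monitor_line("rockups", "localhost", "monuser", "secret", "master") yields a line addressing the local upsd, whereas a netclient would pass the nutserver name as the host and “slave” as the mode.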

Scheduler basics

“upsmon —> calls upssched —> calls your CMDSCRIPT” see: “7.2 The advanced approach, using upssched” in the NUT documentation.

Ports used

The default nut port is 3493 for upsc, upscmd, upsrw, nut-monitor etc.
http://www.networkupstools.org/docs/user-manual.chunked/ar01s09.html
In Netserver mode the LISTEN 0.0.0.0 directive is used, otherwise LISTEN localhost is used.
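The mode dependent choice of LISTEN directive amounts to something like the sketch below. The function name and the optional port argument are illustrative only and are not taken from nut.py.

```python
NUT_DEFAULT_PORT = 3493  # default port stated above


def listen_directive(mode, port=None):
    """Return the upsd.conf LISTEN line for the given nut mode.

    Only netserver mode listens on all interfaces; every other mode
    keeps upsd reachable from the local machine only.
    """
    address = "0.0.0.0" if mode == "netserver" else "localhost"
    if port is None:
        return f"LISTEN {address}"
    return f"LISTEN {address} {port}"
```

Given the multi-interface caveat noted earlier, a finer grained choice of address (for example one specific interface) would be a natural place to extend this.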

Package dependencies

Additional package dependencies required:-

  • nut (brings in nut-client)
    * nut-xml. planned for later inclusion

Testing provision

NUT has the capability to read from a text file as if it were the output from a UPS. It will repeatedly re-read this file in the same way that it polls information on the UPS status. The facility is provided via the dummy-ups driver, and a sample UPS output (gained via the “upsc ups” command where “ups” is the configured UPS name) is available as “/opt/rockstor/conf/dummy-ups.dev”. Provide this path/file name as the port entry to a dummy-ups driver selection and the NUT system should behave as if a local ups was attached and giving out the readings indicated by the contents of that file.
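For reference, the ups.conf section that such a dummy-ups test set-up results in would look roughly like the output of this sketch. The helper name and the “rockups” / description values are only examples; the driver name and the path are the ones given above.

```python
def ups_conf_section(name, driver, port, desc):
    """Render one ups.conf section for a single UPS definition."""
    return "\n".join(
        [
            f"[{name}]",
            f"driver = {driver}",
            f"port = {port}",
            f'desc = "{desc}"',
        ]
    )


# Example: a simulated UPS fed from the sample file shipped with Rockstor.
dummy_section = ups_conf_section(
    "rockups", "dummy-ups", "/opt/rockstor/conf/dummy-ups.dev", "Rockstor test UPS"
)
```

Editing the status values in the .dev file while the driver is running is then a convenient way to exercise the notify and shutdown paths without a physical UPS.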

Shutdown test

Also note that a test of Forced Shut Down (fsd) can be initiated by root executing a

upsmon -c fsd

command. This is what will happen when a UPS is found to be in both an On Battery (OB) and a Low Battery (LB) state at the same time. Rockstor’s implementation goes with the NUT default of executing a “/sbin/shutdown -h +0”. Also note that if the last state of a UPS was OB and communications are lost then after a set period NUT will assume the worst and shutdown the system regardless. Please note that there is no provision made to shut down the UPS itself. This is not a facility that all UPSs have and it can create some tricky race conditions. An attached UPS will most likely continue until its internal variables deem it time to power off. Further function regarding returning to power on state varies from UPS to UPS. The default Rockstor implementation is to keep services running for as long as is maintainable via the UPS battery and to shutdown safely when the UPS informs Rockstor’s nut that its end of power time is nigh. This is the most common and broadest configuration.
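The OB plus LB condition can be expressed as a small check on a UPS status string (as reported in the ups.status variable, eg “OL”, “OB” or “OB LB”). This is purely illustrative of the rule described above; upsmon makes this decision itself and Rockstor does not implement it, and the function name is hypothetical.

```python
def should_force_shutdown(ups_status):
    """True only when the UPS reports On Battery and Low Battery together."""
    flags = set(ups_status.split())
    return {"OB", "LB"} <= flags
```

Note that either flag on its own is not enough: a UPS on battery with plenty of charge, or one on line power reporting a low battery, does not trigger the shutdown under this rule.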

Shutdown after a set period on battery

A remarked out configuration entry is available in /opt/rockstor/conf/upssched.conf to effect a non standard shut down after # number of seconds; unremark the 2 self documented lines to enable this as yet unexposed feature.
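The exact two lines shipped in /opt/rockstor/conf/upssched.conf are not reproduced here, but a timed shut down in upssched generally takes the shape of a START-TIMER on the ONBATT event paired with a CANCEL-TIMER on ONLINE. The sketch below generates such a pair; the timer name is hypothetical, and the CMDSCRIPT (upssched-cmd) would still need to act on that timer name when it fires.

```python
def timed_shutdown_lines(seconds, timer_name="onbatt-shutdown"):
    """Return upssched.conf lines that arm a timer on battery and cancel it on line power."""
    if seconds <= 0:
        raise ValueError("seconds must be a positive number")
    return [
        f"AT ONBATT * START-TIMER {timer_name} {seconds}",
        f"AT ONLINE * CANCEL-TIMER {timer_name}",
    ]
```

For instance, timed_shutdown_lines(300) arms a five minute timer whenever any monitored UPS goes on battery and cancels it if power returns first.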

Rockstor source file changes / additions list

The files changed in the addition of this feature:-
Initial merge of the GUI config element

Future enhancements

  • the original author would also like an auto-config button on the configuration form that will do an appropriate scan using the nut-scanner program. This will in turn fill out what it can in the currently exposed form. This button will only be visible once the overall mode has been selected; this way we can avoid network wide scans of multiple protocols if the user just needs help / hints with a local UPS set-up (most common scenario).
  • More user feedback on status of UPS communication is required beyond just that of the enable switch staying on. This is a planned feature and required for a full and friendly implementation, ie just displaying the current UPS model and maybe a load reading would be a start.
  • A test button on the config page would be nice also so we don’t have the config entered and then the on off switch not work in services and cause the user to bounce back and forth trying different config / user / pass entries. This avoids ping pong user interaction and it is nice to have settings confirmed as working. Also, for many more modern configs the original author was aiming at a simple auto-config/scan button press, then a test button press, and we are away. In a scenario such as this, if a user is not entered (ie only for local access to upsd anyway) we should just create a default user and a random password.
  • Akin to the test button an additional widget on the dashboard could provide feedback of the UPS load. At a later stage we can add a details section that will show the various data available from the UPS, which can sometimes include temperature and power drawn. This would be like the smart data sections for each drive, only for the UPS data instead. Python bindings are available as part of the nut package and there are plans to use these in reading data / testing the config, ie does the provided config allow us to get a connection and read a UPS’s model? If so we are good to go; see test button above.
  • Perhaps expose the shutdown regardless after a period of time feature in the GUI config; see Shutdown after a set period on battery above.
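As a suggestion only (not existing Rockstor code), the “can we read a model and a load” test could start from something as simple as parsing the text output of the upsc command, which prints one “name: value” pair per line. The sketch below avoids the NUT Python bindings, whose API is not covered here; the function names are hypothetical, while ups.model and ups.load are standard NUT variable names.

```python
def parse_upsc(output):
    """Parse 'name: value' lines, as printed by upsc, into a dictionary."""
    values = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip any line that is not a name/value pair
            values[key.strip()] = value.strip()
    return values


def config_looks_good(output):
    """A config is considered working if the UPS model can be read."""
    return bool(parse_upsc(output).get("ups.model"))
```

The output string could come from running upsc against the configured UPS name via subprocess; the same parsed dictionary would also serve a dashboard widget showing ups.load.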

Unresolved issues:-

  • No multiple nut user config options - but this is quite an advanced use case anyway and provides a challenge particularly to the nut.py instantiated configuration method.

  • The original author has yet to establish the best way to declare the service as enabled, ie should it be considered enabled / active only if it can actually read data from the configured ups? We can’t declare the service as disabled as soon as we have a potentially momentary comm error, such as is known to happen on some usb connected UPSs that have very slow response times, ie just under the USB default of 5 seconds, especially on some older versions of nut where a 4 second ups timeout was in effect for many years. The current approach of relying on systemd feedback seems appropriate at present.
