SMART service won't turn on

Brief description of the problem

Under the Services tab, trying to turn on the SMART service repeatedly results in an error, even after a reboot. All updates were applied on a clean install of 4.1.0.

Detailed step by step instructions to reproduce the problem

Open the Services tab and tap ON beside the SMART service.

Web-UI screenshot

[Drag and drop the image here]

Error Traceback provided on the Web-UI

Traceback (most recent call last):
  File "/opt/rockstor/src/rockstor/rest_framework_custom/generic_view.py", line 41, in _handle_exception
    yield
  File "/opt/rockstor/src/rockstor/smart_manager/views/smartd_service.py", line 60, in post
    self._switch(command)
  File "/opt/rockstor/src/rockstor/smart_manager/views/smartd_service.py", line 67, in _switch
    systemctl(cls.service_name, "start")
  File "/opt/rockstor/src/rockstor/system/services.py", line 81, in systemctl
    return run_command(arg_list, log=True)
  File "/opt/rockstor/src/rockstor/system/osi.py", line 224, in run_command
    raise CommandException(cmd, out, err, rc)
CommandException: Error running a command. cmd = /usr/bin/systemctl start smartd. rc = 1. stdout = ['']. stderr = ['Job for smartd.service failed because the control process exited with error code.', 'See "systemctl status smartd.service" and "journalctl -xe" for details.', '']
![smart|690x482](upload://9bzYP5zjB0qqbUbTZ2ABQZnq7L3.gif) 

@dcbailey Hello again. Nice that you are now on v4.1.0. We haven’t had any reports of issues with the SMART service to date.

Re: the “systemctl status smartd.service” and “journalctl -xe” commands named in the error above.

If you could paste the output of those commands into this thread, it may help folks here chip in with more info. The SMART service should be enabled by default, so you are likely seeing a similar start failure logged on each boot-up.
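
For anyone gathering the same output, here is a minimal sketch of how the relevant lines might be pulled out of a long journal paste. The function name smartd_journal_filter is made up for illustration; it only assumes that journal lines from the daemon look like "smartd[PID]:" as in the logs in this thread.

```shell
# Keep only journal lines that come from smartd or mention its unit.
# Reads journal text on stdin, so it also works on a saved or pasted log.
smartd_journal_filter() {
    # grep exits 1 when nothing matches; '|| true' keeps that from looking like a failure.
    grep -E ' smartd\[[0-9]+\]: |smartd\.service' || true
}

# Usage on the affected machine (not run here):
#   systemctl status smartd.service --no-pager
#   journalctl -b --no-pager | smartd_journal_filter
# Asking the journal for the unit directly is an alternative:
#   journalctl -u smartd.service -b --no-pager
```

The unfiltered "journalctl -xe" view shows only the most recent entries, so an earlier smartd failure can scroll out of sight; filtering by unit avoids that.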

Could you also tell us about your hardware, as there may be something there that is throwing the SMART service and causing it to fail to start. It may be that you will have to add a custom config to exclude a certain drive or similar.

Hope that helps.


This reminds me of an issue I had a few years back where a 3rd party PCI SATA card caused SMART to not start at all. Unfortunately I cannot remember the hardware or driver specifics, and IIRC I never got it to work and ended up swapping the PCI card for a different one, which worked fine with SMART. I wasn’t using Rockstor or openSUSE back then, so this could be completely unrelated.
Just a thought, though :slight_smile:


● smartd.service - Self Monitoring and Reporting Technology (SMART) Daemon
     Loaded: loaded (/usr/lib/systemd/system/smartd.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Mon 2022-05-02 18:39:59 EDT; 14h ago
       Docs: man:smartd(8)
             man:smartd.conf(5)
    Process: 15127 ExecStart=/usr/sbin/smartd -n $smartd_opts (code=exited, status=17)
   Main PID: 15127 (code=exited, status=17)
     Status: "No devices to monitor"

May 02 18:39:59 rockstornas systemd[1]: Starting Self Monitoring and Reporting Technology (SMART) Daemon...
May 02 18:39:59 rockstornas smartd[15127]: smartd 7.2 2021-09-14 r5237 [x86_64-linux-5.3.18-150300.59.63-default] (SUSE RPM)
May 02 18:39:59 rockstornas smartd[15127]: Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
May 02 18:39:59 rockstornas smartd[15127]: Opened configuration file /etc/smartd.conf
May 02 18:39:59 rockstornas smartd[15127]: Configuration file /etc/smartd.conf parsed but has no entries
May 02 18:39:59 rockstornas smartd[15127]: Unable to monitor any SMART enabled devices. Try debug (-d) option. Exiting...
May 02 18:39:59 rockstornas systemd[1]: smartd.service: Main process exited, code=exited, status=17/n/a
May 02 18:39:59 rockstornas systemd[1]: smartd.service: Failed with result 'exit-code'.
May 02 18:39:59 rockstornas systemd[1]: Failed to start Self Monitoring and Reporting Technology (SMART) Daemon.

Subject: A start job for unit UNIT has finished successfully
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A start job for unit UNIT has finished successfully.
░░
░░ The job identifier is 1.
May 03 09:22:41 rockstornas systemd[20747]: Startup finished in 251ms.
░░ Subject: User manager start-up is now complete
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ The user manager instance for user 1000 has been started. All services queued
░░ for starting have been started. Note that other services might still be starting
░░ up or be started at any later time.
░░
░░ Startup of the manager took 251840 microseconds.
May 03 09:22:41 rockstornas systemd[1]: Started User Manager for UID 1000.
░░ Subject: A start job for unit user@1000.service has finished successfully
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A start job for unit user@1000.service has finished successfully.
░░
░░ The job identifier is 5196.
May 03 09:22:41 rockstornas systemd[1]: Started Session 1 of user sobadmin.
░░ Subject: A start job for unit session-1.scope has finished successfully
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A start job for unit session-1.scope has finished successfully.
░░
░░ The job identifier is 5310.
May 03 09:22:41 rockstornas login[20682]: pam_unix(remote:session): session opened for user sobadmin by SHELLINABOX(uid=0)
May 03 09:22:41 rockstornas login[20682]: LOGIN ON pts/0 BY sobadmin FROM 127.0.0.1
May 03 09:22:58 rockstornas systemd[1]: systemd-hostnamed.service: Succeeded.
░░ Subject: Unit succeeded
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ The unit systemd-hostnamed.service has successfully entered the ‘dead’ state.
May 03 09:24:29 rockstornas sudo[20927]: sobadmin : TTY=pts/0 ; PWD=/home/sobadmin ; USER=root ; COMMAND=/usr/bin/journalctl -xe
May 03 09:24:29 rockstornas sudo[20927]: pam_unix(sudo:session): session opened for user root by sobadmin(uid=1000)
lines 3383-3425/3425 (END)

@dcbailey

The status and log output you posted would suggest that the service didn’t start because no devices were found. Could you give us an indication of the nature of your system? For instance, are you running fully virtualized? Virtio devices, for one, don’t support SMART capability. In that case it is better to pass the devices, or better again the entire controller, through to the Rockstor instance if you want it to have SMART access.

The “Configuration file /etc/smartd.conf parsed but has no entries” bit is a little strange though. But this could have the same root cause as “No devices to monitor”.

Here on a virtualized instance I have just now reproduced your issue with:

rleap15-3:~ # /usr/sbin/smartd -d
smartd 7.0 2019-05-21 r4917 [x86_64-linux-5.3.18-150300.59.54-default] (SUSE RPM)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

Opened configuration file /etc/smartd.conf
Configuration file /etc/smartd.conf parsed but has no entries
Unable to monitor any SMART enabled devices. Exiting...
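
A quick way to check for the “has no entries” case before starting the daemon is to count the lines in the config that are neither blank nor comments. The sketch below assumes the usual smartd.conf convention that lines starting with # are comments; the function name count_smartd_entries is made up for illustration, and it does not handle backslash line continuations.

```shell
# Count directive lines in a smartd.conf-style file, ignoring blank lines and '#' comments.
count_smartd_entries() {
    conf="${1:-/etc/smartd.conf}"
    # -v inverts the match and -c counts lines; grep exits 1 on a zero count, hence '|| true'.
    grep -Evc '^[[:space:]]*(#|$)' "$conf" || true
}

# Usage (not run here):
#   count_smartd_entries                  # checks /etc/smartd.conf
#   count_smartd_entries /tmp/test.conf   # checks another file
```

A result of 0 would match the situation in the debug run above, where the file exists but gives smartd nothing to monitor.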

Also note that our newer smartd unit looks to auto-disable when it finds itself under virtualization. Even if there are SATA emulations in play, which emulate SMART output but are of course fake, we still get the following result:

rleap15-3:~ # systemctl status smartd.service
● smartd.service - Self Monitoring and Reporting Technology (SMART) Daemon
     Loaded: loaded (/usr/lib/systemd/system/smartd.service; enabled; vendor preset: enabled)
     Active: inactive (dead)
  Condition: start condition failed at Wed 2022-05-04 11:57:22 BST; 43s ago
             └─ ConditionVirtualization=false was not met
       Docs: man:smartd(8)
             man:smartd.conf(5)

May 04 11:13:36 rleap15-3 systemd[1]: Condition check resulted in Self Monitoring and Reporting Technology (SMART) Daemon being skipped.
May 04 11:57:22 rleap15-3 systemd[1]: Condition check resulted in Self Monitoring and Reporting Technology (SMART) Daemon being skipped.

In this case the system was a kvm hosted instance:

rleap15-3:~ # systemd-detect-virt
kvm

So in the case above the smartd service is skipped due to the virtualisation conditional in the associated systemd config file:

rleap15-3:~ # cat /usr/lib/systemd/system/smartd.service      
[Unit]
Description=Self Monitoring and Reporting Technology (SMART) Daemon
Documentation=man:smartd(8) man:smartd.conf(5)
ConditionVirtualization=false

[Service]
Type=notify
EnvironmentFile=-/var/lib/smartmontools/smartd_opts
ExecStart=/usr/sbin/smartd -n $smartd_opts
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target
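
The effect of that ConditionVirtualization=false line can be approximated from a shell, as a rough sketch. It relies on systemd-detect-virt printing the detected technology (for example kvm) or “none”; the function name smartd_skipped_for_virt is made up for illustration, and an empty result (tool missing) is treated as “not virtualized”, which is an assumption.

```shell
# Succeeds (exit 0) when a unit carrying ConditionVirtualization=false would be skipped,
# judged from the output of systemd-detect-virt, which may be passed in as $1.
smartd_skipped_for_virt() {
    virt="${1:-$(systemd-detect-virt 2>/dev/null || true)}"
    [ -n "$virt" ] && [ "$virt" != "none" ]
}

# Usage (not run here):
#   if smartd_skipped_for_virt; then echo "smartd will be skipped on this machine"; fi
```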

So if I configure the SMART service in this VM via the spanner icon beside S.M.A.R.T on the services page of the Rockstor Web-UI, using the example from the mouse-over tooltip to monitor all devices:
(image: monitor-all-devices)
a debug run looks to have responded accordingly:

rleap15-3:~ #  /usr/sbin/smartd -d
smartd 7.0 2019-05-21 r4917 [x86_64-linux-5.3.18-150300.59.54-default] (SUSE RPM)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

Opened configuration file /etc/smartd.conf
Configuration file /etc/smartd.conf was parsed, found DEVICESCAN, scanning devices
Device: /dev/sda, type changed from 'scsi' to 'sat'
Device: /dev/sda [SAT], opened
Device: /dev/sda [SAT], QEMU HARDDISK, S/N:QM00001, FW:2.5+, 25.7 GB
Device: /dev/sda [SAT], not found in smartd database.
Device: /dev/sda [SAT], can't monitor Current_Pending_Sector count - no Attribute 197
Device: /dev/sda [SAT], can't monitor Offline_Uncorrectable count - no Attribute 198
Device: /dev/sda [SAT], is SMART capable. Adding to "monitor" list.
Device: /dev/sdb, type changed from 'scsi' to 'sat'
Device: /dev/sdb [SAT], opened
Device: /dev/sdb [SAT], QEMU HARDDISK, S/N:QM00003, FW:2.5+, 4.29 GB
Device: /dev/sdb [SAT], not found in smartd database.
Device: /dev/sdb [SAT], can't monitor Current_Pending_Sector count - no Attribute 197
Device: /dev/sdb [SAT], can't monitor Offline_Uncorrectable count - no Attribute 198
Device: /dev/sdb [SAT], is SMART capable. Adding to "monitor" list.
Monitoring 2 ATA/SATA, 0 SCSI/SAS and 0 NVMe devices
Device: /dev/sda [SAT], opened ATA device
Device: /dev/sda [SAT], previous self-test completed without error
Device: /dev/sdb [SAT], opened ATA device
Device: /dev/sdb [SAT], previous self-test completed without error
Device: /dev/sda [SAT], state written to /var/lib/smartmontools/smartd.QEMU_HARDDISK-QM00001.ata.state
Device: /dev/sdb [SAT], state written to /var/lib/smartmontools/smartd.QEMU_HARDDISK-QM00003.ata.state

Incidentally, the usual CONTROL (key) and c simply reloads the configuration in debug mode. To exit you need a:

CONTROL (key) and “\” (key).

So in my test case here I also had to comment out the virtualization test in the systemd file, i.e. via:

nano /usr/lib/systemd/system/smartd.service

so the line was, for example:

# ConditionVirtualization=false

as well as configure smartd to monitor all devices as per the above picture and tooltip popup.
I then get a successful enablement of the SMART service via the Web-UI:
(image: smart-enabled-on-virtual-instance)
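
Editing the file under /usr/lib/systemd/system directly works for a test, but a package update may overwrite it. A possible alternative, sketched here and not something tested on Rockstor, is a systemd drop-in that assigns an empty value to the condition, which systemd documents as resetting the unit's condition list. The function name write_smartd_dropin and the file name override.conf are illustrative choices; the directory argument exists only so the function can be tried against a scratch location.

```shell
# Write a drop-in that clears the vendor unit's Condition* list for smartd.service.
# The default directory is the standard drop-in location for this unit.
write_smartd_dropin() {
    dir="${1:-/etc/systemd/system/smartd.service.d}"
    mkdir -p "$dir"
    printf '[Unit]\nConditionVirtualization=\n' > "$dir/override.conf"
}

# As root on the real system (not run here):
#   write_smartd_dropin && systemctl daemon-reload && systemctl start smartd.service
```

Running "systemctl edit smartd.service" and entering the same two lines should produce an equivalent drop-in interactively. This only makes sense on a virtual machine used for testing, since emulated drives report fake SMART data.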

Hope this helps.

@Flox @Hooverdan @GeoffA This behaviour, both in virtualisation sensitivity and defaulting to no monitoring, is functionally different from our prior CentOS base where, I believe, we had a pre-populated DEVICESCAN -a option. This is good to know and we should maybe add it to the docs. I’m not keen on diverging from our openSUSE Leap 15.3 default smartd config, but we should consider amending our “S.M.A.R.T” howto (https://rockstor.com/docs/howtos/smart.html)
with what we already prompt in the tooltip popup. I’ve created the following doc issue for this:


This isn’t a virtualized box. Just an Asus MB with an 80 GB WD system drive and 2 WD Red NAS drives. I was running 3.9.1-16 on this same hardware for years but decided to move to 4.1.0. As per the instructions, I installed on a clean system drive with the data drives disconnected, shut down after the install completed, attached the data drives, and imported them after booting the system.

I went into the settings for SMART just now, added DEVICESCAN -a, and submitted; the service then started. I then dropped to the system shell and ran smartd -d, which returned the following…
Opened configuration file /etc/smartd.conf
Configuration file /etc/smartd.conf was parsed, found DEVICESCAN, scanning devices
Device: /dev/sda, type changed from ‘scsi’ to ‘sat’
Device: /dev/sda [SAT], opened
Device: /dev/sda [SAT], WDC WD800AAJS-00L7A0, S/N:WD-WCAV31886116, WWN:5-0014ee-1ac679070, FW:01.03E01, 80.0 GB
Device: /dev/sda [SAT], found in smartd database: Western Digital Caviar Blue Serial ATA
Device: /dev/sda [SAT], is SMART capable. Adding to “monitor” list.
Device: /dev/sda [SAT], state read from /var/lib/smartmontools/smartd.WDC_WD800AAJS_00L7A0-WD_WCAV31886116.ata.state
Device: /dev/sdb, type changed from ‘scsi’ to ‘sat’
Device: /dev/sdb [SAT], opened
Device: /dev/sdb [SAT], WDC WD40EFRX-68N32N0, S/N:WD-WCC7K7XDC8AU, WWN:5-0014ee-265143ec7, FW:82.00A82, 4.00 TB
Device: /dev/sdb [SAT], found in smartd database: Western Digital Red
Device: /dev/sdb [SAT], is SMART capable. Adding to “monitor” list.
Device: /dev/sdb [SAT], state read from /var/lib/smartmontools/smartd.WDC_WD40EFRX_68N32N0-WD_WCC7K7XDC8AU.ata.state
Device: /dev/sdc, type changed from ‘scsi’ to ‘sat’
Device: /dev/sdc [SAT], opened
Device: /dev/sdc [SAT], WDC WD40EFRX-68N32N0, S/N:WD-WCC7K7ZRN1U9, WWN:5-0014ee-265143ba9, FW:82.00A82, 4.00 TB
Device: /dev/sdc [SAT], found in smartd database: Western Digital Red
Device: /dev/sdc [SAT], is SMART capable. Adding to “monitor” list.
Device: /dev/sdc [SAT], state read from /var/lib/smartmontools/smartd.WDC_WD40EFRX_68N32N0-WD_WCC7K7ZRN1U9.ata.state
Monitoring 3 ATA/SATA, 0 SCSI/SAS and 0 NVMe devices
Device: /dev/sda [SAT], opened ATA device
Device: /dev/sda [SAT], previous self-test completed without error
Device: /dev/sdb [SAT], opened ATA device
Device: /dev/sdb [SAT], previous self-test completed without error
Device: /dev/sdc [SAT], opened ATA device
Device: /dev/sdc [SAT], previous self-test completed without error
Device: /dev/sda [SAT], state written to /var/lib/smartmontools/smartd.WDC_WD800AAJS_00L7A0-WD_WCAV31886116.ata.state
Device: /dev/sdb [SAT], state written to /var/lib/smartmontools/smartd.WDC_WD40EFRX_68N32N0-WD_WCC7K7XDC8AU.ata.state
Device: /dev/sdc [SAT], state written to /var/lib/smartmontools/smartd.WDC_WD40EFRX_68N32N0-WD_WCC7K7ZRN1U9.ata.state

I then rebooted the system; the service now auto-starts on boot and smartd -d shows the same info, so this appears to be working properly now.
Thanks for the assistance.


@dcbailey
Re:

That’s good news. Yes, this is a change in behaviour from our prior v3, but we are super keen to stick as closely as possible to our new upstream’s defaults wherever we can. And now that we know this new default behaviour, we can update the docs accordingly. The other SMART functions were still functional in my test setup, so this only really concerns the periodic SMART monitoring. You may also like to enable:

Email alerts: https://rockstor.com/docs/interface/system/email_setup.html

as this effectively forwards root’s emails to whatever you set up in the email alerts config. That way you should end up receiving the SMART reports by email. I’m not sure if you will have to configure email notifications to root@localhost in the SMART config section for this to work, however.
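
If an explicit address does turn out to be needed, smartd.conf has a -m directive for the warning-mail recipient. The sketch below only prints a candidate config line for the SMART config section; the function name smartd_conf_line is made up for illustration, and whether Rockstor needs this line at all is the open question above.

```shell
# Print a candidate smartd.conf line: monitor all devices and mail warnings to one address.
smartd_conf_line() {
    addr="${1:-root@localhost}"
    printf 'DEVICESCAN -a -m %s\n' "$addr"
}

# Usage (not run here):
#   smartd_conf_line                    # DEVICESCAN line addressed to root@localhost
#   smartd_conf_line admin@example.org  # hypothetical address
# Appending the smartd.conf directive '-M test' to the line asks smartd to send
# one test email at startup, which helps confirm that delivery works.
```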

Hope that helps, and glad you’re now sorted.
