SMART log issues

I’m trying to use the built in SMART service to monitor my drives, but I’m not sure how to configure it properly.

Today, my hard drives are sda, sdc, sdd, sde, sdf, sdg. My USB OS drive is sdb. This is the first time I’ve seen my USB drive mount as sdb, but I’m aware that could always happen.

Because my drives’ device names may change, and because my OS drive doesn’t support SMART, I don’t think I can use DEVICESCAN. So I’ve used another suggestion that I expected would work.

Under Services --> Smart, I just have the text:
/dev/disk/by-id/ata-Hitachi_HUA723020ALA641_YFG4B0HA -a -m email.address@gmail.com
/dev/disk/by-id/ata-Hitachi_HUA723020ALA641_YFG6D8NA -a -m email.address@gmail.com
/dev/disk/by-id/ata-Hitachi_HUA723020ALA641_YFG770TA -a -m email.address@gmail.com
/dev/disk/by-id/ata-Hitachi_HUA723020ALA641_YFHV3XGA -a -m email.address@gmail.com
/dev/disk/by-id/ata-Hitachi_HUA723020ALA641_YGG3KD3A -a -m email.address@gmail.com
/dev/disk/by-id/ata-Hitachi_HUA723020ALA641_YGGLE0AB -a -m email.address@gmail.com
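
For anyone wanting to produce entries like the ones above without typing each by-id name, here is a minimal sketch. The function name smartd_conf_lines and the filtering rules (whole-disk `ata-` links only) are my own assumptions, not anything Rockstor or smartmontools ships; the `-a -m` directives are simply copied from the lines above.

```python
import re


def smartd_conf_lines(by_id_names, email):
    """Build smartd.conf entries from names found under /dev/disk/by-id.

    Hypothetical helper: keeps whole-disk ATA links only. Partition links
    (ending in -partN) and non-ATA links (usb-, wwn-, ...) are skipped.
    """
    lines = []
    for name in sorted(by_id_names):
        if not name.startswith("ata-"):
            continue
        if re.search(r"-part\d+$", name):
            continue
        lines.append("/dev/disk/by-id/%s -a -m %s" % (name, email))
    return lines


# Example with names in the same style as the thread; on a real system the
# list would typically come from os.listdir("/dev/disk/by-id").
sample = [
    "ata-Hitachi_HUA723020ALA641_YFG4B0HA",
    "ata-Hitachi_HUA723020ALA641_YFG4B0HA-part1",
    "usb-SanDisk_Ultra_Fit-0:0",
]
for line in smartd_conf_lines(sample, "email.address@gmail.com"):
    print(line)
```

Because the by-id name embeds the model and serial, the resulting lines stay valid even when the sdX letters shuffle between boots.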

Under Storage --> Disks, all 6 of my drives are shown as SMART capable, and my USB drive as not SMART capable. This seems perfectly valid to me.

The issue I’m having is that /opt/rockstor/var/log/rockstor.log is getting slammed with SMART scanning issues. It appears that there are two of them that repeat over and over.

File "/opt/rockstor/src/rockstor/storageadmin/views/disk.py", line 226, in _update_disk_state
do.name, do.smart_options)
File "/opt/rockstor/src/rockstor/system/smart.py", line 311, in available
[SMART, '--info'] + get_dev_options(device, custom_options))
File "/opt/rockstor/src/rockstor/system/osi.py", line 98, in run_command
raise CommandException(cmd, out, err, rc)
CommandException: Error running a command. cmd = ['/usr/sbin/smartctl', '--info', '/dev/sdb']. rc = 1. stdout = ['smartctl 6.2 2013-07-26 r3841 [x86_64-linux-4.6.0-1.el7.elrepo.x86_64] (local build)', 'Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org', '', '/dev/sdb: Unknown USB bridge [0x0781:0x5583 (0x100)]', 'Please specify device type with the -d option.', '', 'Use smartctl -h to get a usage summary', '', '']. stderr = ['']
[22/Jun/2016 08:20:28] ERROR [storageadmin.views.disk:228] Error running a command. cmd = ['/usr/sbin/smartctl', '--info', '']. rc = 1. stdout = ['smartctl 6.2 2013-07-26 r3841 [x86_64-linux-4.6.0-1.el7.elrepo.x86_64] (local build)', 'Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org', '', ': Unable to detect device type', 'Please specify device type with the -d option.', '', 'Use smartctl -h to get a usage summary', '', '']. stderr = ['']
Traceback (most recent call last):
File "/opt/rockstor/src/rockstor/storageadmin/views/disk.py", line 226, in _update_disk_state
do.name, do.smart_options)
File "/opt/rockstor/src/rockstor/system/smart.py", line 311, in available
[SMART, '--info'] + get_dev_options(device, custom_options))
File "/opt/rockstor/src/rockstor/system/osi.py", line 98, in run_command
raise CommandException(cmd, out, err, rc)
CommandException: Error running a command. cmd = ['/usr/sbin/smartctl', '--info', '']. rc = 1. stdout = ['smartctl 6.2 2013-07-26 r3841 [x86_64-linux-4.6.0-1.el7.elrepo.x86_64] (local build)', 'Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org', '', ': Unable to detect device type', 'Please specify device type with the -d option.', '', 'Use smartctl -h to get a usage summary', '', '']. stderr = ['']

It LOOKS like smartctl is being run on sdb (my USB drive) as well as without a drive specified. How do I correct this?
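
To check which devices the failing smartctl calls were actually aimed at, something like the following could be run over the log. It is only a sketch: it assumes the `cmd = [...]. rc = N.` layout seen in the entries above and that the device is the last element of the command list, and the function name is made up for illustration.

```python
import ast
import re
from collections import Counter

# Matches the "cmd = [...]. rc = N" fragment of a CommandException log line.
CMD_RE = re.compile(r"cmd = (\[.*?\])\. rc = (\d+)")


def smartctl_targets(log_lines):
    """Count (device, rc) pairs for failed smartctl calls found in log lines."""
    counts = Counter()
    for line in log_lines:
        match = CMD_RE.search(line)
        if not match:
            continue
        cmd = ast.literal_eval(match.group(1))
        if not cmd or not cmd[0].endswith("smartctl"):
            continue
        # Assumption: the device argument is the last item in the command.
        counts[(cmd[-1], int(match.group(2)))] += 1
    return counts


example = [
    "CommandException: Error running a command. cmd = "
    "['/usr/sbin/smartctl', '--info', '/dev/sdb']. rc = 1. stdout = ['']",
    "CommandException: Error running a command. cmd = "
    "['/usr/sbin/smartctl', '--info', '']. rc = 1. stdout = ['']",
]
for (device, rc), n in sorted(smartctl_targets(example).items()):
    print(repr(device), rc, n)
```

An empty string showing up as a device confirms that a call was made with no device name at all.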

I’ve verified that in smartd.conf, the only non-comment lines are those I pasted above. I don’t know why it is scanning for a non-specified drive or for sdb.

@Noggin Firstly a belated welcome to the Rockstor community.

Yes, this is a little confusing from the user perspective. Essentially, I think what is happening here is unrelated to your smartd.conf settings and is instead down to a background polling mechanism used to establish the SMART support status and capability of all attached drives. Hence the _update_disk_state in your traceback, which updates the db on which drives are attached and various aspects of their nature, ie whether they are partitioned etc, but also whether they support SMART. That way the UI can appropriately populate the S.M.A.R.T column on the Disks page.

In your case this test of SMART support / status is causing a report back to indicate an issue with SMART support on this usb-to-storage bridge. This in turn is interpreted correctly as Not Supported.

This is harmless but a pain and inelegant on the logs. But if you click on the pen icon in the S.M.A.R.T column for the USB device you may well be able to add custom S.M.A.R.T options that will enable reading SMART info from the drive over the USB bridge. In fact on that page there are links to pages that may help with what settings to try.

From the returned error ‘/dev/sdb: Unknown USB bridge [0x0781:0x5583 (0x100)]’ in your logs as posted above, and the USB Device Support link on the custom SMART options page, the first number (0x0781) indicates a SanDisk device of sorts, but unfortunately there is no mention there of the second number (0x5583). A search suggests this is a SanDisk Ultra Fit USB 3.0, in which case it may be worth trying the addition of:

-d sat

as that may reduce the verbosity of the logging. In the case of, for example, the SanDisk Extreme USB 3.0 it actually enables full SMART info for the underlying ssd device, though in the case of your device it may only shorten the messages returned by the ‘smartctl --info devname’ probe carried out by _update_disk_state. Worth a try though.

and the consequent info on this device is then available:

Note that the above screen capture also shows a by-id type drive name change that is currently pending review; I thought that might interest you given you have done the same in your smartd.conf posting.
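
As an aside, the vendor and product numbers quoted from the ‘Unknown USB bridge’ message earlier can be pulled out of the smartctl output mechanically. This is just a sketch that assumes the message format shown in the log above; the helper name is hypothetical.

```python
import re

# Format assumed from the log line quoted in this thread:
#   /dev/sdb: Unknown USB bridge [0x0781:0x5583 (0x100)]
USB_BRIDGE_RE = re.compile(
    r"Unknown USB bridge \[0x([0-9a-fA-F]{4}):0x([0-9a-fA-F]{4})")


def usb_bridge_ids(smartctl_stdout_lines):
    """Return (vendor_id, product_id) as ints, or None if no bridge message."""
    for line in smartctl_stdout_lines:
        match = USB_BRIDGE_RE.search(line)
        if match:
            return int(match.group(1), 16), int(match.group(2), 16)
    return None


ids = usb_bridge_ids(["/dev/sdb: Unknown USB bridge [0x0781:0x5583 (0x100)]"])
if ids is not None:
    print("vendor 0x%04x product 0x%04x" % ids)
```

The vendor part (0x0781) is the one that identifies SanDisk; the product part is what you would search for to find the exact model.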

On an older SanDisk Cruzer Fit I have here the following custom options work to silence the exact same (bar usb id of course) message in the logs.
ie before:


and then after adding the following option:

-d scsi

Although the device will still yield errors from the other smart commands executed when clicking on the device name followed by a Refresh button click in that area. Probably due to no proper smartmontools support / SMART support ie the following message when trying suggests this:-

'Device does not support Self Test logging'

which is assumed to be supported by all SMART capable devices in Rockstor.
Still at least it doesn’t spam the logs with this setting: inelegant but effective.
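
To illustrate how a custom option such as ‘-d scsi’ ends up in the command that gets run, here is a rough sketch. It is not Rockstor’s actual get_dev_options() code; build_smartctl_cmd is a made-up stand-in, and the ordering (action arguments, then custom options, then device) is only inferred from the commands visible in the logs in this thread.

```python
import shlex

SMARTCTL = "/usr/sbin/smartctl"


def build_smartctl_cmd(action_args, device, custom_options=""):
    """Assemble a smartctl argument list (hypothetical helper).

    custom_options is the free-text string a user might enter in the UI,
    eg "-d scsi" or "-d sat"; it is split shell-style into arguments.
    """
    options = shlex.split(custom_options or "")
    return [SMARTCTL] + list(action_args) + options + [device]


# Without custom options: the probe that produced the USB bridge error.
print(build_smartctl_cmd(["--info"], "/dev/sdb"))
# With the custom option added via the pen icon.
print(build_smartctl_cmd(["--info"], "/dev/sdb", "-d scsi"))
```

The point is simply that the option is stored against the disk and injected into every smartctl call for it, which is why one setting quietens the polled --info probe but cannot make unsupported commands succeed.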

By the way, after each custom config change you need to do a Rescan for the change to take effect, at least from the UI perspective. The Rescan button does the same thing the background poll will be doing anyway, so it is a good way to avoid waiting to see the effect on the logs.

There are potential plans to move to an event-driven update rather than a poll-driven one, which may in future help with this kind of log spamming, ie such a refresh, and the consequent message about failing to probe a device’s SMART capability and status, would only be required on a drive change. Though this may be a way off just yet.

Sorry not to comment on your smartd.conf config but you seem to have that under way or in progress.

Hope that helps and let us know how you get on.

Thank you for the detailed response. Your research has proven to be correct, it is an Ultra Fit USB flash drive. I’m a bit worried about the drive due to the frequency of logging long term, as the logs appear to be updated several times per minute. In reality, it might not be that big of a deal, however.

That said, ‘-d scsi’ appears to work on the flash drive. The output is now:

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-4.6.0-1.el7.elrepo.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               SanDisk
Product:              Ultra Fit
Revision:             1.00
User Capacity:        62,109,253,632 bytes [62.1 GB]
Logical block size:   512 bytes
Serial number:        4C530001251226102193
Device type:          disk
Local Time is:        Wed Jun 22 19:07:06 2016 CDT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Disabled or Not Supported

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Error Counter logging not supported

Device does not support Self Test logging
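
For reference, the two ‘SMART support is:’ lines in that output are the part that answers the supported / enabled question. A minimal sketch of reading them follows; this is my own illustration, not the parser Rockstor uses, and it assumes the wording shown above.

```python
def smart_support(info_lines):
    """Return (available, enabled) from 'smartctl --info' style output lines."""
    available = False
    enabled = False
    for line in info_lines:
        if not line.startswith("SMART support is:"):
            continue
        value = line.split(":", 1)[1].strip()
        if value.startswith("Available"):
            available = True
        elif value.startswith("Enabled"):
            enabled = True
    return available, enabled


output = [
    "Device type:          disk",
    "SMART support is:     Available - device has SMART capability.",
    "SMART support is:     Enabled",
    "Temperature Warning:  Disabled or Not Supported",
]
print(smart_support(output))
```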

The good news is that the logging for the /dev/sdb device is no longer appearing. However, the device ‘’ is still appearing in the logs. My flash drive is encrypted at boot, so I have to enter a passphrase to unlock it. I THINK this makes the drive show up with the name “luks-bunch-o-numbers” on the drives page and I’m thinking that this might be the ‘’ device showing up in the logs.

Am I correct in assuming that on a reboot, if my flash drive is mounted as a different letter I’ll get those errors in the logs again?

I clicked on the /dev/sdb link to view the smart data, clicked on Refresh, and received this in a “Houston we’ve had a problem window”:

Traceback (most recent call last):
File "/opt/rockstor/src/rockstor/rest_framework_custom/generic_view.py", line 40, in _handle_exception
yield
File "/opt/rockstor/src/rockstor/storageadmin/views/disk_smart.py", line 123, in post
return self._info(disk)
File "/opt/rockstor/eggs/Django-1.6.11-py2.7.egg/django/db/transaction.py", line 371, in inner
return func(*args, **kwargs)
File "/opt/rockstor/src/rockstor/storageadmin/views/disk_smart.py", line 73, in _info
test_d, log_lines = test_logs(disk.name, disk.smart_options)
File "/opt/rockstor/src/rockstor/system/smart.py", line 255, in test_logs
screen_return_codes(e_msg, overide_rc, o, e, rc, smart_command)
File "/opt/rockstor/src/rockstor/system/smart.py", line 378, in screen_return_codes
raise CommandException(('%s' % command), o, e, rc)
CommandException: Error running a command. cmd = ['/usr/sbin/smartctl', '-l', 'selftest', '-l', 'selective', '-d', 'scsi', '/dev/sdb']. rc = 4. stdout = ['smartctl 6.2 2013-07-26 r3841 [x86_64-linux-4.6.0-1.el7.elrepo.x86_64] (local build)', 'Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org', '', '=== START OF READ SMART DATA SECTION ===', 'Device does not support Self Test logging', '']. stderr = ['']

Is this because the flash drive doesn’t support it? I’m not too worried about it, just want the logs to stop really :slight_smile:
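
On the rc values in these errors: smartctl’s exit status is a bit mask, so rc = 1 and rc = 4 mean different things. Below is a small decoder as a sketch; the bit descriptions are paraphrased from my recollection of the smartctl man page, so please treat the wording as approximate and check the man page for the authoritative text.

```python
# Bit meanings paraphrased from the smartctl man page (EXIT STATUS section);
# the wording here is approximate, not a verbatim copy.
SMARTCTL_EXIT_BITS = {
    0: "command line did not parse",
    1: "device open failed, or device did not identify itself",
    2: "a SMART or other ATA command to the disk failed, or a checksum error",
    3: "SMART status check returned DISK FAILING",
    4: "prefail attributes found at or below threshold",
    5: "status OK, but attributes were at or below threshold in the past",
    6: "the device error log contains records of errors",
    7: "the device self-test log contains records of errors",
}


def decode_smartctl_rc(rc):
    """Return the descriptions for every bit set in a smartctl exit status."""
    return [text for bit, text in sorted(SMARTCTL_EXIT_BITS.items())
            if rc & (1 << bit)]


for rc in (1, 4):
    print(rc, decode_smartctl_rc(rc))
```

So the rc = 4 above fits a command the device could not service, as opposed to the earlier rc = 1 from the probe that could not work out the device type.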

I’ve rebooted the system. My flash drive changed to /dev/sdc and the -d scsi option followed it, which is good news. The bad news, however, is that the Rock-On service won’t start. Rock-On issues are what prompted me to look at the logs in the first place.

So at this time, as far as this thread is concerned (SMART logging issues), the only problem is that the SMART daemon is polling a device with no name. That may be an expected behavior, but I’m curious to know if it is because I’ve encrypted my OS drive.

I’ll need to spend some time trying to debug the docker stuff before I open a new thread.

@Noggin Thanks for the update

Yes, pretty much: the custom option enabled some smartctl commands, such as the basic --info one, so the disk updater could then read the SMART capability / enabled status without logging the error that otherwise results from just asking, ie the USB bridge one.

Yes this seems to be the nub of it, currently Rockstor doesn’t support (read doesn’t yet understand) LUKS encryption but we do have an open issue on the same:

But this may initially only pertain to the data drives.

Yes, all drive settings / configurations are tied to the device’s serial number so if all is well they should stick with the device irrespective of where it appears from one boot to the next (that’s the hope anyway).
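
As a rough illustration of why keying on the serial makes settings follow the device, consider the sketch below. It is not how Rockstor’s db models are written; the function and data shapes are assumptions made purely for the example, using the serial from your smartctl output.

```python
def options_for_devices(current_disks, options_by_serial):
    """Map this boot's device names to stored options via their serials.

    current_disks: iterable of (devname, serial) pairs as seen this boot.
    options_by_serial: stored custom SMART options keyed by serial number.
    """
    return {name: options_by_serial.get(serial, "")
            for name, serial in current_disks}


stored = {"4C530001251226102193": "-d scsi"}

# Same flash drive, different kernel name on two different boots.
boot_one = [("sdb", "4C530001251226102193")]
boot_two = [("sdc", "4C530001251226102193")]
print(options_for_devices(boot_one, stored))
print(options_for_devices(boot_two, stored))
```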

Yes I very much suspect the same as you do here. And at a guess I think the cause is that given there is (currently anyway) no accounting for LUKS device mapping the method that looks for the base device on which to perform the smartctl calls is defaulting to an empty string ie the ‘’. This I think is happening in the get_base_device function in:
/opt/rockstor/src/rockstor/system/osi.py file where it initialises a base_dev variable prior to attempting to work out what the base device is ie:

    base_dev = ['', ]

And so if none can be found (because we don’t cater for LUKS devices) then it remains ‘’, as that made sense for a value that can’t be established. This results in the log entry you are seeing, indicating the name can’t be found. The get_dev_options() function in smart.py in the same directory then uses this returned device name.
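
To make that guess concrete, here is a deliberately simplified stand-in showing how an empty-string default can travel all the way into the smartctl call. It is not the real get_base_device(); find_base_device and its prefix matching rule are invented for illustration only.

```python
def find_base_device(dev_name, known_disks):
    """Hypothetical stand-in for a base device lookup.

    Returns '/dev/<disk>' for the first known disk that dev_name starts
    with, or '' when no base device can be established.
    """
    base_dev = ""  # sentinel meaning "could not be established"
    for disk in known_disks:
        if dev_name.startswith(disk):
            base_dev = "/dev/" + disk
            break
    return base_dev


known = ["sda", "sdb"]
for name in ("sdb3", "luks-bunch-o-numbers"):
    cmd = ["/usr/sbin/smartctl", "--info", find_base_device(name, known)]
    print(name, "->", cmd)
```

The second command ends in an empty string, which is the same shape as the `cmd = ['/usr/sbin/smartctl', '--info', '']` entries in your log.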

Of course a horrible, horrible hack that I would never recommend is for the ‘’ to be replaced by a device name that reports without error and without need for custom options (although if the luks-bunch-o-numbers device in question can have them added successfully this may also work). But note that the actual device will then have these options applied, and given we can’t know what that device will be (hence the horrible), care should be taken. This way, when the get_base_device() function is called and can’t find the base device for the passed device name (as we both suspect), it will instead pass back a device that won’t cause your logging irritation. Name changes from boot to boot obviously hamper this approach, and a reboot will be needed for it to take effect by the way.

As I say it’s a horrible hack but this system has recently been re-written a tad in the by-id pending changes so I would rather address this issue properly after that and when we get to look at the pending LUKS issue proper. Although if the by-id changes pass review that will also help with the name change issue of this horrible hack as well.

Of course the sensible and ‘correct’ answer is to re-install without using disk encryption, as per the defaults; then there is no risk of breaking things. However, if you were game you could always try the horrible hack for the time being. Please note though that I can’t really spend any time on diagnosing any issue with the horrible hack as of course no one would ever really do that :innocent:. Plus it’s also only based on a guess.:slight_smile:

Thanks for bringing this shortcoming to light and I have added a note to our LUKS issue to watch out for this and update this thread when significant progress in LUKS support is achieved.

A request on my part: I would be interested in seeing a screen grab of what your current Disks page looks like with the LUKS encrypted system disk. No worries otherwise, but it may help to see what needs to be done to support this. And could you detail the procedure you used to enable this encryption so that it might be re-created when considering the LUKS support?

Hope that helps and you should probably ignore the horrible hack idea and go with a default install unless of course there is entertainment value therein and you are prepared to potentially break your current install. I was just sketching out how it appears to have happened so that it might inform the LUKS issue in the future. And note that updates will overwrite any changes you make to these files.

Good luck on diagnosing the docker issue, I see you have opened another thread on this. Thanks.

I’m not a python guy, but I’d feel comfortable doing what you don’t suggest. However, it seems like a better fix would be not to run smartctl against something without a device name. On the other hand, maybe you should get an error message if a device that can’t be named does appear. I’m not sure which is really the better option in the long run. In my position, however, it seems like smartctl just shouldn’t be run if the device name is ‘’.

For the time being, I’ve opted to not modify the python scripts. I at least feel like I understand the issue, but I don’t want to create other headaches. As for reinstalling without encryption, I’m considering it… but that’ll come down to whether or not I can get the dockers up and running again.

Thank you for your help, and here’s my screen grab:

@Noggin

Yes, I agree, but I couldn’t come up with a trivial single word change to address your issue that was also a proper fix. The behaviour in the code you are running is expected, as it indicates the shortcoming by way of a managed command exception. But then so does the big red warning on that same drive :slight_smile:. However, your particular edge case should be addressed by ‘pending review’ code changes that disable all SMART calls, including the polled --info one in your logs, for all drives that have serial number issues. Due to Rockstor’s (current) lack of support/understanding for LUKS there is also a serial issue, which in the pending code should then auto disable all SMART calls, hence addressing your circumstance. Also note that this whole ‘find base device’ mechanism has been reworked and for a while, in the pending code development, even had an “if (bus_based_devname is None)” variation, but it has since moved on to dealing only in by-id type names, removing entirely the need to flag via empty string dev names (‘’), which in turn led to a simpler arrangement all in.

The addition of LUKS support is non trivial and will require enhancing a few of Rockstor’s subsystems, hence your system drive appearing twice, once with a repeat serial. So I was attempting to provide a hack workaround to address your log query. There are no devices without names, it’s just an internal way to flag such issues, which is I suspect what happened: a lack of LUKS knowledge in the subsystems is causing a failure to find a base device on which to execute the SMART calls, surfaced as an empty string base device name as one couldn’t be found (due to LUKS lack).
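
For the sake of discussion, the ‘just don’t call smartctl when no base device was found’ idea raised in this thread could be sketched as below. This is not the pending code, which as noted takes a different route via serial checks and by-id names; probe_smart and its arguments are hypothetical.

```python
import logging

logger = logging.getLogger(__name__)


def probe_smart(disks, probe):
    """Run a SMART probe per disk, skipping any with no base device.

    disks: iterable of (display_name, base_dev) pairs; base_dev is '' or
    None when it could not be established.
    probe: callable taking a base device path and returning a result.
    """
    results = {}
    for display_name, base_dev in disks:
        if not base_dev:
            # Quiet note instead of a full traceback on every poll.
            logger.debug("Skipping SMART probe for %s: no base device",
                         display_name)
            results[display_name] = None
            continue
        results[display_name] = probe(base_dev)
    return results


def fake_probe(dev):
    # Stand-in for the real smartctl call so the sketch runs anywhere.
    return "probed %s" % dev


print(probe_smart([("sda", "/dev/sda"), ("luks-bunch-o-numbers", "")],
                  fake_probe))
```

The trade-off is the one already mentioned: skipping keeps the logs clean, but it also hides the fact that something about the device could not be worked out, so some other visible flag is still wanted.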

I think dropping the LUKS for the time being is going to give you a much smoother ride all-in as the knock-ons could be quite extensive, which is far more significant on the headaches front of course. Especially given help on the forum will be far harder to come by given its unsupported status. However the LUKS support issue is currently fairly high on the list but there is further unit testing to be put in place prior to that issue being addressed, so it may be some time.

I think many users of Open Source are so due to the ability it affords to ‘hack’ their own installs, and I like to encourage this as it can be an effective bridge to becoming a contributor by way of ‘I see how this works’. We do have a growing number of people who review the pull requests on GitHub and anyone is free to comment on how they think things could be done better. Do please consider reviewing any changes pending in pr’s as they are often left open for a review period for this purpose.

Thanks for the screen grab, oh and did you enable LUKS via our relatively unmodified upstream anaconda installer?

You may also be interested in the Reinstalling Rockstor section of the official docs and the Configuration Backup and Restore it links to.

Hope that helps, at least from a peace of mind point of view, and thanks for bringing up this issue in the forum.

Honestly, this is very tempting to me whether it is only my own installation, or possibly even being a contributor. Unfortunately, my time currently does not allow it.

Yes, I did.

Another note that may not be related to LUKS/Rockstor is that since the latest stable update, I’ve only had about a 50% success rate getting the LUKS decryption prompt on startup. I’m using a Supermicro motherboard and entering the passphrase over IPMI. During POST, I see the normal screens and progression. When the system gets to where it should provide me with a prompt for the decryption key, there’s been about a 50% chance I see the prompt and a 50% chance that my IPMI window reports no source data is coming in (I forget the wording that it uses).

I’ve only been a Rockstor user for about 6 weeks and only paid for the subscription a few days ago. I only had 3 or 4 reboots under my belt until a day or two ago. So the 50% failure rate on LUKS prompt may not have anything to do with the update at all.

[quote=“phillxnet, post:7, topic:1653”]
You may also be interested in the Reinstalling Rockstor section of the official docs and the Configuration Backup and Restore it links to.
[/quote]

I am, thank you. I’ve not decided which of the three paths I’m going to take. Reinstalling without LUKS sure would make life easier if I ever need to get my wife to reboot the server. I have my 23 character randomly generated password memorized (upper case, lower case, numbers, special symbols, and spaces!) but I think the universe will end in heat death before I could successfully communicate it over the phone to someone else, much less get them logged into the IPMI interface.

It does, thank you for all of your help!