Hello,
I’ve started receiving these messages today:
Aug 8 19:18:10 storage smartd[6280]: Device: /dev/sdf [SAT], CHECK POWER STATUS spins up disk (0x81 -> 0xff)
Aug 8 19:48:15 storage smartd[6280]: Device: /dev/sdg [SAT], CHECK POWER STATUS spins up disk (0x81 -> 0xff)
Aug 8 20:18:10 storage smartd[6280]: Device: /dev/sdo [SAT], CHECK POWER STATUS spins up disk (0x81 -> 0xff)
Aug 8 22:18:10 storage smartd[6280]: Device: /dev/sdf [SAT], CHECK POWER STATUS spins up disk (0x81 -> 0xff)
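To see which drives are affected and how often, the messages above can be grouped per device. The following is only a sketch: the regular expression assumes the syslog prefix layout shown in the quoted lines, and the function name `spinups_by_device` is made up for illustration.

```python
import re
from collections import defaultdict

# Matches the smartd message quoted above. The prefix layout
# (month, day, time, host, "smartd[pid]:") is an assumption based on those lines.
SPINUP_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+smartd\[\d+\]:\s+"
    r"Device:\s+(?P<dev>\S+)\s+\[SAT\],\s+CHECK POWER STATUS spins up disk\s+"
    r"\((?P<old>0x[0-9a-fA-F]{2})\s+->\s+(?P<new>0x[0-9a-fA-F]{2})\)"
)

def spinups_by_device(lines):
    """Group spin-up messages by device as (timestamp, old_code, new_code) tuples."""
    events = defaultdict(list)
    for line in lines:
        m = SPINUP_RE.match(line.strip())
        if m:
            events[m["dev"]].append((m["stamp"], m["old"], m["new"]))
    return dict(events)

sample = [
    "Aug 8 19:18:10 storage smartd[6280]: Device: /dev/sdf [SAT], CHECK POWER STATUS spins up disk (0x81 -> 0xff)",
    "Aug 8 19:48:15 storage smartd[6280]: Device: /dev/sdg [SAT], CHECK POWER STATUS spins up disk (0x81 -> 0xff)",
    "Aug 8 20:18:10 storage smartd[6280]: Device: /dev/sdo [SAT], CHECK POWER STATUS spins up disk (0x81 -> 0xff)",
    "Aug 8 22:18:10 storage smartd[6280]: Device: /dev/sdf [SAT], CHECK POWER STATUS spins up disk (0x81 -> 0xff)",
]

result = spinups_by_device(sample)
for dev, hits in sorted(result.items()):
    print(dev, len(hits), [stamp for stamp, _, _ in hits])
```

In practice the lines would come from /var/log/messages rather than a hard-coded list.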
The Rockstor-specific code for drive power status uses hdparm throughout, and since your messages are from smartd I don’t think this is a Rockstor-specific issue. I.e. I’d look to smartmontools and its settings for the cause.
Strange that it’s only recently been happening. First thought on this is that the upstream smartmontools package has been updated and now includes this warning. I chose the hdparm settings / drive power probe as it was stated to ‘mostly’ not wake drives. Check your:
/etc/smartmontools/smartd.conf
Also, Rockstor can, at the user’s request, edit this file. The relevant code is:
So if you have used this facility you should also see the Rockstor header within that file.
Thank you @phillxnet for your feedback.
I have checked my smartd.conf file and this is the only line that is uncommented: DEVICESCAN -H -m root -M exec /usr/libexec/smartmontools/smartdnotify -n standby,10,q
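For reference, the `-n` part of that line can be pulled apart as below. This is a rough sketch based on my reading of the smartd.conf man page, where the directive has the form `-n POWERMODE[,N][,q]`: checks are skipped while the disk is in the given power mode, `N` caps the number of consecutive skipped checks, and `q` suppresses the log message about skipped checks. The helper name `parse_powermode` is made up for illustration.

```python
import shlex

def parse_powermode(directive_line):
    """Return the -n setting of one smartd.conf line as a dict, or None if absent."""
    # comments=True drops anything after a '#', as in a commented config line.
    tokens = shlex.split(directive_line, comments=True)
    if "-n" not in tokens:
        return None
    idx = tokens.index("-n")
    if idx + 1 >= len(tokens):
        return None
    parts = tokens[idx + 1].split(",")
    setting = {"powermode": parts[0], "max_skipped_checks": None, "quiet": False}
    for extra in parts[1:]:
        if extra == "q":
            setting["quiet"] = True
        elif extra.isdigit():
            setting["max_skipped_checks"] = int(extra)
    return setting

line = ("DEVICESCAN -H -m root -M exec "
        "/usr/libexec/smartmontools/smartdnotify -n standby,10,q")
setting = parse_powermode(line)
print(setting)
```

So with this config smartd is asked not to wake drives that are in standby, to force a check after at most 10 skipped ones, and to stay quiet about the skips.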
To be honest, I have had this server for two years and /var/log/messages was error free until now. I only noticed these messages because my NFS server was hanging.
On another server, where the Rockstor server is mounted via NFS, I get this:
[4174881.921475] nfs: server 10.10.2.2 not responding, still trying
[4174881.921508] nfs: server 10.10.2.2 not responding, timed out
[4174886.433518] nfs: server 10.10.2.2 not responding, still trying
[4174966.123268] nfs: server 10.10.2.2 not responding, timed out
[4174987.179735] nfs: server 10.10.2.2 not responding, timed out
[4175002.220039] nfs: server 10.10.2.2 not responding, timed out
[4175002.220052] nfs: server 10.10.2.2 not responding, still trying
[4175282.562638] nfs: server 10.10.2.2 OK
[4175282.562666] nfs: server 10.10.2.2 OK
[4175282.562730] nfs: server 10.10.2.2 OK
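To put a number on the hang, the kernel timestamps (seconds since boot) in that output can be subtracted. A minimal sketch, assuming the dmesg line format shown above; `outage_seconds` is a made-up helper name.

```python
import re

# Kernel log line: "[seconds.micros] nfs: server HOST not responding..." or "... OK".
NFS_RE = re.compile(
    r"^\[?\s*(?P<secs>\d+\.\d+)\]\s+nfs: server (?P<host>\S+) "
    r"(?P<state>not responding|OK)"
)

def outage_seconds(lines):
    """Seconds between the first 'not responding' and the next 'OK', or None."""
    first_bad = None
    for line in lines:
        m = NFS_RE.match(line.strip())
        if not m:
            continue
        t = float(m["secs"])
        if m["state"] == "not responding" and first_bad is None:
            first_bad = t
        elif m["state"] == "OK" and first_bad is not None:
            return t - first_bad
    return None

sample = [
    "[4174881.921475] nfs: server 10.10.2.2 not responding, still trying",
    "[4175002.220052] nfs: server 10.10.2.2 not responding, still trying",
    "[4175282.562638] nfs: server 10.10.2.2 OK",
]

duration = outage_seconds(sample)
print(f"NFS outage lasted about {duration:.0f} s")
```

For the lines quoted above this works out to roughly 400 seconds between the first "not responding" and the first "OK".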
In the Rockstor management interface all my HDDs show: active/idle, with no spin down option set.
So that’s the default contents of that file. I.e. you haven’t applied any custom config.
At a guess, if these are related, it could be that the drives are powering down and only powering up again on NFS access, and the resulting wait shows up as the NFS server not responding. This might tie in with the smartd message indicating that its check on power status will spin up the drive. It may be that an update in smartmontools has changed the behaviour so that this message is now produced rather than actually spinning up the drive. Just a guess really.
Your best bet is to research this smartd / smartmontools message and with a quick look it seems related to the ‘-n’ element of that default config.
Let us know how this goes.
The relevant code in the upstream smartd project has some comments on what this message means:
There have been no recent changes in Rockstor code (assuming you have been updating regularly) that account for your observed changes so it does rather point to an upstream update.
It might help others on the forum if you give a description and details of the hardware you are seeing this issue with.
Could you indicate where you got the following from:
I have now changed the APM setting on all drives from 255 (off) to 254 to see if there is any improvement, as I don’t need my disks to idle.
Some virtual machines use mounted folders as virtual disks, which should keep my NAS busy all the time, so I have no idea why the drives are idling.
For the last 4 hours I haven’t seen any further messages with APM set to 254, but I’ll keep an eye on /var/log/messages.
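For anyone following along, the APM values mentioned above map onto the ranges described in the hdparm man page for `-B`: 1 to 127 permit spin-down, 128 to 254 do not, and 255 disables APM altogether. A small sketch of that mapping follows; the function names and the example device are illustrative only, and the command is merely built here, not executed.

```python
def describe_apm(level):
    """Describe an hdparm -B level according to the ranges in the hdparm man page."""
    if not isinstance(level, int) or not 1 <= level <= 255:
        raise ValueError("APM level must be an integer from 1 to 255")
    if level == 255:
        return "APM disabled"
    if level >= 128:
        return "APM enabled, spin-down not permitted"
    return "APM enabled, spin-down permitted"

def hdparm_set_apm_command(device, level):
    """Build (but do not run) the hdparm command that would set the APM level."""
    describe_apm(level)  # validates the level
    return ["hdparm", "-B", str(level), device]

print(describe_apm(255))
print(describe_apm(254))
print(hdparm_set_apm_command("/dev/sdf", 254))
```

Whether a given drive honours these levels is down to its firmware, so the actual state is still worth confirming on the drive itself.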