Disk Activity Plot Not Working With SAS Drives

It looks like the homepage “Disk Activity” display isn’t plotting any data for SAS disks. I have two HGST SAS drives connected to a SAS 9211-8i along with 6 x SATA drives. Even though data is being written to the drives, the plot is never updated (see screenshot). Currently running Rockstor 3.9.2-41.

@juchong Hello again and thanks for the report.

I think this is probably down to either:

  1. The stats not actually surfacing in /proc/diskstats (as is the case with md devices) as that is where we grab them from.
  2. We are not usefully resolving the correct by-id name for the device. /proc/diskstats uses the temporary canonical names (sda, sdb, etc), but internally Rockstor uses the boot-safe by-id names. A device can have multiple by-id names, and for speed we take a shortcut for these stats that may be coming back to bite us in this case (if I remember correctly).

The following code is the nub/core of the retrieval and includes the potential by-id reference:

with the byid_disk_map referencing the output of:

To help discern which is the cause you could give us the output of:

cat /proc/diskstats

while there is confirmed activity on the problematic reporting devices (two consecutive readings would be nice as well) and also a screen shot of your disks page to see what by-id names Rockstor is using, so we can check for (2) above.
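For anyone wanting to check two consecutive readings themselves, here is a minimal sketch in Python. It is not Rockstor's own code. It assumes the standard /proc/diskstats layout (major, minor, name, then the I/O counters, with sectors read and sectors written as the 3rd and 7th counters). The helper names are made up for illustration, and the second reading is invented; only the first sdd line comes from output posted later in this thread.

```python
def parse_diskstats(text):
    """Return {device_name: (sectors_read, sectors_written)} from diskstats text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # major, minor, name, then at least 11 counters (newer kernels add more).
        if len(fields) < 14:
            continue
        stats[fields[2]] = (int(fields[5]), int(fields[9]))
    return stats


def activity_delta(first, second):
    """Sectors read/written per device between two parsed readings."""
    return {
        name: (second[name][0] - first[name][0], second[name][1] - first[name][1])
        for name in second
        if name in first
    }


# sdd line as posted in this thread.
READING_1 = "8 48 sdd 15138 1 659992 148535 16039990 49042660 8180539064 1805170354 0 19318054 1805306656"
# Hypothetical later reading, invented here purely for illustration.
READING_2 = "8 48 sdd 15138 1 659992 148535 16040990 49042660 8180547256 1805171354 0 19319054 1805307656"

delta = activity_delta(parse_diskstats(READING_1), parse_diskstats(READING_2))
print(delta)
```

A non-zero delta for a device means the kernel is surfacing its activity in /proc/diskstats, which would rule out cause (1) for that device.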

It will also be necessary to report the full output of the following command:

ls -la /dev/disk/by-id

Note that all of the above information will be needed to pin this down. If these drives are like the md case, they simply don’t report their activity in that way. But I’m guessing we have a (2) type cause on this one. That is just a quick guess for the time being, until we get more info.

Hope that helps and thanks again for the report.

I’m assuming your last report ended up resolving itself successfully in time as there was no further update after my subsequent reply in that thread?


Best to reply contextually there I would say on that one.

Hi,

Attached is the output of cat /proc/diskstats

[root@rockstor ~]# cat /proc/diskstats
8 0 sda 205435 4144 17008972 242441 999051 255011 59026092 2717219 0 195614 2956225
8 1 sda1 323 1 16988 229 10 2 36 19 0 136 248
8 2 sda2 362 0 13240 166 0 0 0 0 0 79 166
8 3 sda3 204720 4143 16976616 242039 999041 255009 59026056 2717200 0 195440 2955773
8 16 sdb 7680 2039 520336 161752 8640534 56436210 8180529096 1450388380 0 21611664 1450560327
8 32 sdc 17688 23 563416 186630 8251575 56806538 8180491248 1931871653 0 27635338 1932072120
8 48 sdd 15138 1 659992 148535 16039990 49042660 8180539064 1805170354 0 19318054 1805306656
8 64 sde 25576 7 996192 115587 15720742 49359295 8180544592 1950796054 0 20446533 1950893634
8 80 sdf 7871 2071 525776 236148 8263610 56794545 8180490720 1951927490 0 27911870 1952175632
8 96 sdg 7647 2066 522568 202611 8183724 56904252 8180464456 1915875214 0 27426124 1916089701
11 0 sr0 0 0 0 0 0 0 0 0 0 0 0
8 112 sdh 17713 18 568736 181839 8188066 56899924 8180464456 1921337124 0 27458104 1921534725
8 128 sdi 17725 27 565768 181488 8250861 56828192 8180523016 1932743394 0 27609997 1932940382

Output of ls -la /dev/disk/by-id:

[root@rockstor ~]# ls -la /dev/disk/by-id
total 0
drwxr-xr-x 2 root root 540 Oct 22 23:39 .
drwxr-xr-x 6 root root 120 Oct 22 23:39 ..
lrwxrwxrwx 1 root root 9 Oct 22 23:40 ata-HGST_HUH728080ALE600_2EKXANGP -> ../../sdb
lrwxrwxrwx 1 root root 9 Oct 22 23:40 ata-VMware_Virtual_SATA_CDRW_Drive_00000000000000000001 -> ../../sr0
lrwxrwxrwx 1 root root 9 Oct 22 23:40 ata-WDC_WD60EFRX-68L0BN1_WD-WX11D6651995 -> ../../sdg
lrwxrwxrwx 1 root root 9 Oct 22 23:40 ata-WDC_WD60EFRX-68L0BN1_WD-WX31DB58YJF0 -> ../../sdh
lrwxrwxrwx 1 root root 9 Oct 22 23:40 ata-WDC_WD60EFRX-68MYMN1_WD-WX11DB4H8VJJ -> ../../sdf
lrwxrwxrwx 1 root root 9 Oct 22 23:40 ata-WDC_WD60EFRX-68MYMN1_WD-WX21DC42ELAT -> ../../sdc
lrwxrwxrwx 1 root root 9 Oct 22 23:40 ata-WDC_WD60EFRX-68MYMN1_WD-WXM1H84CXAUJ -> ../../sdi
lrwxrwxrwx 1 root root 9 Oct 22 23:40 scsi-35000cca252017870 -> ../../sdd
lrwxrwxrwx 1 root root 9 Oct 22 23:40 scsi-35000cca252017ef4 -> ../../sde
lrwxrwxrwx 1 root root 9 Oct 22 23:40 scsi-36000c29df1181dd53db36b41b3582636 -> ../../sda
lrwxrwxrwx 1 root root 10 Oct 22 23:40 scsi-36000c29df1181dd53db36b41b3582636-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Oct 22 23:40 scsi-36000c29df1181dd53db36b41b3582636-part2 -> ../../sda2
lrwxrwxrwx 1 root root 10 Oct 22 23:40 scsi-36000c29df1181dd53db36b41b3582636-part3 -> ../../sda3
lrwxrwxrwx 1 root root 9 Oct 22 23:40 wwn-0x5000cca23bf728eb -> ../../sdb
lrwxrwxrwx 1 root root 9 Oct 22 23:40 wwn-0x5000cca252017870 -> ../../sdd
lrwxrwxrwx 1 root root 9 Oct 22 23:40 wwn-0x5000cca252017ef4 -> ../../sde
lrwxrwxrwx 1 root root 9 Oct 22 23:40 wwn-0x50014ee20b77be35 -> ../../sdf
lrwxrwxrwx 1 root root 9 Oct 22 23:40 wwn-0x50014ee20dd22144 -> ../../sdg
lrwxrwxrwx 1 root root 9 Oct 22 23:40 wwn-0x50014ee260e5c696 -> ../../sdi
lrwxrwxrwx 1 root root 9 Oct 22 23:40 wwn-0x50014ee26114d4cb -> ../../sdc
lrwxrwxrwx 1 root root 9 Oct 22 23:40 wwn-0x50014ee26293bbea -> ../../sdh
lrwxrwxrwx 1 root root 9 Oct 22 23:40 wwn-0x6000c29df1181dd53db36b41b3582636 -> ../../sda
lrwxrwxrwx 1 root root 10 Oct 22 23:40 wwn-0x6000c29df1181dd53db36b41b3582636-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Oct 22 23:40 wwn-0x6000c29df1181dd53db36b41b3582636-part2 -> ../../sda2
lrwxrwxrwx 1 root root 10 Oct 22 23:40 wwn-0x6000c29df1181dd53db36b41b3582636-part3 -> ../../sda3

@juchong Thanks for the info.

Bit strange that one.

Can you confirm that the only drives that are not successfully showing activity are the following:

lrwxrwxrwx 1 root root 9 Oct 22 23:40 wwn-0x5000cca252017870 -> ../../sdd
lrwxrwxrwx 1 root root 9 Oct 22 23:40 wwn-0x5000cca252017ef4 -> ../../sde

ie the only 2 that are using the wwn by-id name?

It’s also odd that they don’t have the usual longer multi-part name that all the others have (bar the system disk).

Could it be that they are mapped differently (ie in the hypervisor), or are they the only 2 connected in a certain way? I ask as the others with the more normal by-id name, ie ‘ata-WDC_model-serial-etc’, are also showing as SAS.

Given this looks to be a VMware install of sorts, are you sure the drives are passed through in the same way as those that are showing stats correctly?

Whatever the cause, it looks like it could still be some kind of internal Rockstor naming anomaly with regard to the stats, as the given name, and that used by Rockstor, ‘wwn-0x5000cca252017870’, still resolves to /dev/sdd, which in turn is still showing stats.
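To make the naming picture easier to see, here is a rough Python sketch that groups the by-id names per canonical device from `ls -la /dev/disk/by-id` style output, and flags devices whose candidate names are all the same length. The function names are hypothetical and this is not how Rockstor does it internally. The sample lines are taken from the listing in this thread, with the symlink targets written as ../../sdX.

```python
from collections import defaultdict


def byid_names_by_device(ls_output):
    """Group by-id symlink names by the canonical device they point to."""
    names = defaultdict(list)
    for line in ls_output.splitlines():
        fields = line.split()
        # Only symlink lines look like: ... <by-id name> -> ../../<dev>
        if "->" not in fields[1:-1]:
            continue
        arrow = fields.index("->")
        names[fields[arrow + 1].split("/")[-1]].append(fields[arrow - 1])
    return dict(names)


def tied_devices(names):
    """Devices with several by-id names that all share one length."""
    return sorted(
        dev for dev, ids in names.items()
        if len(ids) > 1 and len({len(name) for name in ids}) == 1
    )


SAMPLE = """\
total 0
lrwxrwxrwx 1 root root 9 Oct 22 23:40 ata-HGST_HUH728080ALE600_2EKXANGP -> ../../sdb
lrwxrwxrwx 1 root root 9 Oct 22 23:40 scsi-35000cca252017870 -> ../../sdd
lrwxrwxrwx 1 root root 9 Oct 22 23:40 wwn-0x5000cca23bf728eb -> ../../sdb
lrwxrwxrwx 1 root root 9 Oct 22 23:40 wwn-0x5000cca252017870 -> ../../sdd
"""

names = byid_names_by_device(SAMPLE)
print(tied_devices(names))
```

With this sample only sdd is flagged: sdb has a longer ‘ata-’ name available, whereas both of sdd’s names are the same length.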

I’ll circle back around to this one as it’s non critical, unless someone beats me to it of course, and thanks for any more info you can give as to what is different about the drives that don’t show activity in comparison to the ones that do show activity in the dashboard widget.

Also I meant to say before that we do have a pending pull request by @Flox in the rockon-registry:

for the very impressive netdata which may serve your needs for the time being. But given it’s currently pending merge you could follow the following wiki guide to try it out:

It’s a little ‘contained’ due to docker’s container type nature but seems to work fully for drive activity.

Nice setup that by the way.

Hi Phillip,

Yes, the two drives with the ID beginning with wwn are the only drives not showing in the dashboard. All drives are connected to a SAS 9211-8i card which is passed through to the VM directly. The hypervisor has no idea that there are drives connected to the PCI device and all 8 drives are connected the same way.

@juchong Thanks for the confirmation.

Pretty sure we have a (2) scenario here, but we wouldn’t if those 2 drives had their expected ‘long’ by-id names, so I was trying to fathom just why that might be. Have they been added ‘live’, ie has the VM running Rockstor been rebooted since they were attached? Bit of a long shot that one. I’m just perplexed as to why they don’t have the expected names such as the others have. The by-id names are dished out by udev, but I don’t see why these 2 disks should be assigned different by-id names than the others.

No worries though, would just be nice to fathom why just those 2 drives are not being assigned the usual multi-part by-id.

They were not added live. The VM (and host) have been rebooted multiple times. These drives are the latest and greatest from HGST, so maybe that has something to do with it? The drives are so new that even this hardware change applies: https://www.hgst.com/sites/default/files/resources/HGST-Power-Disable-Pin-TB.pdf

@juchong Thanks for the heads up on that new connector and power disable pin, I didn’t know about that so cheers.

OK, managed to peek in on this issue again and have something for you to try if you are game.

Obvious caveat is don’t do this if you are not confident about returning the python source file to its original form (ie take a copy first, or be happy enough with locally editing a python source file: you just have to watch the spacing really).

In a recent enhancement to our main ‘proper’ by-id name selection we made an alteration that was not reflected in the quick and lightweight by-id name map created currently for use only on the dashboard.

Anyway I think if you add an ‘r’ character to the line indicated here:

So that it reads:

    out, err, rc = run_command([LS, '-lr', '/dev/disk/by-id'],

ie we are simply asking for a reverse lexicographical list ordering for our by-id names.

We usually take the longest name, ie the one made up from device bus + model + serial etc. Given that one is for some reason missing with these new drives (although I suspect it will appear once udev is updated), we can hopefully at least address this bug, where I suspect the quick and dirty map doesn’t line up with our proper by-id name selection when the only options available are the exact same length, as in your case:

lrwxrwxrwx 1 root root 9 Oct 22 23:40 scsi-35000cca252017870 -> ../../sdd

lrwxrwxrwx 1 root root 9 Oct 22 23:40 wwn-0x5000cca252017870 -> ../../sdd

ie both 22 chars.
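To illustrate why the listing order matters when lengths tie, here is a small Python model. It assumes a “longest name wins, first seen wins ties” rule for the quick map, which is my reading of the behaviour described above rather than a copy of the actual function. `pick_byid_names` is a hypothetical name, and Python’s `sorted` only approximates the ordering `ls` would give, which can depend on locale.

```python
def pick_byid_names(pairs):
    """Pick one by-id name per device from (byid_name, dev_name) pairs.

    A later name only replaces the current pick if it is strictly
    longer, so among equal-length names the first one seen is kept.
    """
    chosen = {}
    for byid_name, dev_name in pairs:
        current = chosen.get(dev_name)
        if current is None or len(byid_name) > len(current):
            chosen[dev_name] = byid_name
    return chosen


PAIRS = [
    ("ata-HGST_HUH728080ALE600_2EKXANGP", "sdb"),
    ("scsi-35000cca252017870", "sdd"),
    ("wwn-0x5000cca23bf728eb", "sdb"),
    ("wwn-0x5000cca252017870", "sdd"),
]

forward = pick_byid_names(sorted(PAIRS))                # like plain 'ls -l'
reverse = pick_byid_names(sorted(PAIRS, reverse=True))  # like 'ls -lr'

print(forward["sdd"], reverse["sdd"])
print(forward["sdb"], reverse["sdb"])
```

Under this model sdb gets its long ‘ata-’ name in either order, while sdd flips from the ‘scsi-’ name to the ‘wwn-’ name once the order is reversed, and the ‘wwn-’ name is the one Rockstor is using for that disk.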

Your system’s local copy of that file should be at:

/opt/rockstor/src/rockstor/system/osi.py

so something like:

nano -c /opt/rockstor/src/rockstor/system/osi.py

should give you an easy editor and show the line number.

Let me know if you are game to try this change out, and for it to take effect you will need to restart your rockstor service thereafter:

systemctl restart rockstor

or reboot, and then hopefully the activity of your ‘rogue’ short-name-only devices should show up as intended.

If this works then I’ll prepare an explanation by way of an issue and put this change in a pull request against that issue, as there is good reason to apply it anyway.

No worries if this is all a bit messy or inappropriate for you to try on that setup, but it would help as a proof of fix to justify the time getting it documented and put in ahead of something else.

Thanks and let me know if you are game to try that change and the results either way. But do bear in mind that the change must be applied carefully, with a back-out plan in readiness, if you are to try it. It should be pretty safe as that procedure is only used by the activity widget, but of course if you break the file all bets are off. Don’t copy the file in the same place by another name as that might upset something; but you could pop it in /root or something by way of backup if needed.
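If it helps, the backup step could be sketched in Python along these lines. The helper name and the ‘.orig’ suffix are just illustrative choices, not anything Rockstor provides. It copies the file into a separate directory and refuses to overwrite an earlier backup.

```python
import shutil
from pathlib import Path


def backup_file(source, backup_dir):
    """Copy source into backup_dir as '<name>.orig' and return the new path."""
    source = Path(source)
    target = Path(backup_dir) / (source.name + ".orig")
    if target.exists():
        # Keep the first (pristine) copy rather than overwrite it.
        raise FileExistsError("backup already exists: {}".format(target))
    shutil.copy2(str(source), str(target))  # copy2 also preserves timestamps
    return target


# Example call (run as root on the Rockstor machine, before editing):
# backup_file("/opt/rockstor/src/rockstor/system/osi.py", "/root")
```

To back out, copy /root/osi.py.orig over the edited file and restart the rockstor service again.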

Thanks and hope that helps.

@juchong Quick update as per:

As of Rockstor stable channel release version 3.9.2-42 your reported Dashboard disk activity anomaly re 2 of your disks should now be sorted via issue:

and its associated pull request which passed review and has now been merged:

It’s essentially that same single character change, but I developed a couple of unit tests (based on your supplied command output) to go along with it. These prove the before and after behaviour of the associated dev name mapping function used only by the dashboard, and from them it looks like you were also getting no activity indication for your system drive. In short, if a dev had multiple by-id names (usual) that were also all the same length, then a recent regression led to your reported lack of disk activity in the dashboard widget for those devices.

Thanks again for your report on this and do let us know if you end up working out why just those 2 devices, and as it goes your system disk, had no associated longer by-id type name as your other similarly attached devices had.

Hope that helps and let us know if I’ve missed the mark on this particular fix.

Hello. I’m a total newbie to Rockstor. I’m currently testing the AARCH64 distribution via KVM running on a 64-bit Arm system (macchiatobin). I only have a single 1 TB volume attached to Rockstor. It’s been working quite well so far with the exception of the Disk Activity plot on the dashboard which is blank.

I’m providing here the following information:

Rockstor 4.0.5-0 (AARCH64)

ls -la /dev/disk/by-id
total 0
drwxr-xr-x 2 root root 140 Feb 27 11:15 .
drwxr-xr-x 7 root root 140 Feb 27 11:15 ..
lrwxrwxrwx 1 root root   9 Feb 27 11:15 ata-QEMU_HARDDISK_QM00003 -> ../../sda
lrwxrwxrwx 1 root root   9 Feb 27 11:15 scsi-0ATA_QEMU_HARDDISK_QM00003 -> ../../sda
lrwxrwxrwx 1 root root   9 Feb 27 11:15 scsi-0QEMU_QEMU_CD-ROM_drive-scsi0-0-0-0 -> ../../sr0
lrwxrwxrwx 1 root root   9 Feb 27 11:15 scsi-1ATA_QEMU_HARDDISK_QM00003 -> ../../sda
lrwxrwxrwx 1 root root   9 Feb 27 11:15 scsi-SATA_QEMU_HARDDISK_QM00003 -> ../../sda

cat /proc/diskstats
 253       0 vda 8217 6 848733 17195 24117 11922 5014721 46856 0 72360 16840 0 0 0 0
 253       1 vda1 441 0 12672 289 2 0 9 0 0 410 150 0 0 0 0
 253       2 vda2 7586 6 824608 16727 22463 11922 5014712 39218 0 71680 15880 0 0 0 0
 253       3 vda3 135 0 8901 132 0 0 0 0 0 250 30 0 0 0 0
  11       0 sr0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
   8       0 sda 422 0 18345 792 102148 258299 5153480 184797 0 153470 107370 0 0 0 0

Screenshot of disk activity follows

disk_activity

And I see the following in the UI log:

CommandException: Error running a command. cmd = /usr/bin/mount -t btrfs -o subvolid=264 /dev/disk/by-id/vda2 /mnt2/home. rc = 32. stdout = ['']. stderr = ['mount: /mnt2/home: special device /dev/disk/by-id/vda2 does not exist.', '']
[27/Feb/2021 20:53:55] INFO [storageadmin.views.disk:139] Deleting duplicate or fake (by serial) disk db entry. Serial = (fake-serial-d40916b4-86ca-4c52-b917-38596779774e).
[27/Feb/2021 20:53:56] ERROR [storageadmin.views.command:113] Skipping pool (ROOT) mount as attached disk (vda2) has no by-id name (no serial # ?)
[27/Feb/2021 20:53:57] ERROR [system.osi:196] non-zero code(1) returned by command: ['/usr/sbin/btrfs', 'qgroup', 'show', '/mnt2/ROOT']. output: [''] error: ["ERROR: can't list qgroups: quotas not enabled", '']
[27/Feb/2021 20:53:57] INFO [fs.btrfs:1155] Mount Point: /mnt2/ROOT has Quotas disabled, skipping qgroup show.
[27/Feb/2021 20:53:57] ERROR [system.osi:196] non-zero code(1) returned by command: ['/usr/sbin/btrfs', 'qgroup', 'show', '/mnt2/ROOT/home']. output: [''] error: ["ERROR: can't list qgroups: quotas not enabled", '']
[27/Feb/2021 20:53:57] ERROR [storageadmin.middleware:33] Exception occurred while processing a request. Path: /api/commands/refresh-share-state method: POST
[27/Feb/2021 20:53:57] ERROR [storageadmin.middleware:34] Error running a command. cmd = /usr/bin/mount -t btrfs -o subvolid=264 /dev/disk/by-id/vda2 /mnt2/home. rc = 32. stdout = ['']. stderr = ['mount: /mnt2/home: special device /dev/disk/by-id/vda2 does not exist.', '']
Traceback (most recent call last):
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/core/handlers/base.py", line 132, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/views/decorators/csrf.py", line 58, in wrapped_view
    return view_func(*args, **kwargs)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/views/generic/base.py", line 71, in view
    return self.dispatch(request, *args, **kwargs)
  File "/opt/rockstor/eggs/djangorestframework-3.1.1-py2.7.egg/rest_framework/views.py", line 452, in dispatch
    response = self.handle_exception(exc)
  File "/opt/rockstor/eggs/djangorestframework-3.1.1-py2.7.egg/rest_framework/views.py", line 449, in dispatch
    response = handler(request, *args, **kwargs)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/utils/decorators.py", line 145, in inner
    return func(*args, **kwargs)
  File "/opt/rockstor/src/rockstor/storageadmin/views/command.py", line 410, in post
    import_shares(p, request)
  File "/opt/rockstor/src/rockstor/storageadmin/views/share_helpers.py", line 239, in import_shares
    mount_share(nso, "{}{}".format(settings.MNT_PT, s_in_pool))
  File "/opt/rockstor/src/rockstor/fs/btrfs.py", line 665, in mount_share
    return run_command(mnt_cmd)
  File "/opt/rockstor/src/rockstor/system/osi.py", line 198, in run_command
    raise CommandException(cmd, out, err, rc)
CommandException: Error running a command. cmd = /usr/bin/mount -t btrfs -o subvolid=264 /dev/disk/by-id/vda2 /mnt2/home. rc = 32. stdout = ['']. stderr = ['mount: /mnt2/home: special device /dev/disk/by-id/vda2 does not exist.', '']

Any pointers would be greatly appreciated. I’m really enjoying Rockstor so far!

@gsamu Welcome to the Rockstor community, and I’m glad you are enjoying the testing; much appreciated.

Your issue looks very much like there are no serial numbers assigned to your virtual disk devices. Notice that sda (the emulated ATA hard disk) and sr0 (the CD-ROM) have by-id counterparts, whereas the vda* devices do not. This is due to your virtual machine environment auto-assigning device serials to the emulated ATA devices, but not to the libvirt devices (vda*). That is a common behaviour with VMM/libvirt qemu.

And no serial on devices pretty much writes them off for Rockstor use. See:

And our Minimum System Requirements doc section:
http://rockstor.com/docs/quickstart.html#minimum-system-requirements
in our Quick Start doc section:
http://rockstor.com/docs/quickstart.html#quick-start

Try adding unique serials to all the libvirt devices with the virtual machine powered down. Then on power up the fake-serial issue should go away and you should get a far better impression of device management.

The ‘fake-serial-uuid’ reports are where we are stop-gapping with a temporary serial when we don’t find one.

Do let us know here, or in a dedicated forum post, if this sorts your issue, as it may well be that we have another issue going on here. But first make sure your virtual env is providing unique serials to all devices, and then ensure that udev is assigning by-id names to all of those devices. Thereafter it’s down to us to use those by-id names, and there may still be an issue lurking in that layer that it would be great to resolve with your help/feedback.
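As a quick way to check that every device has a by-id name, here is a minimal Python sketch that cross-references the two outputs you posted. The function name is hypothetical and the parsing is deliberately simple; it just lists the devices in /proc/diskstats that no by-id symlink points to.

```python
def devices_without_byid(diskstats_text, byid_ls_text):
    """List diskstats device names that no by-id symlink resolves to."""
    linked = set()
    for line in byid_ls_text.splitlines():
        fields = line.split()
        # Symlink lines end with: <by-id name> -> ../../<dev>
        if "->" in fields[:-1]:
            linked.add(fields[-1].split("/")[-1])
    missing = []
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] not in linked:
            missing.append(fields[2])
    return missing


# Shortened copies of the outputs posted above.
DISKSTATS = """\
 253       0 vda 8217 6 848733 17195 24117 11922 5014721 46856 0 72360 16840 0 0 0 0
 253       2 vda2 7586 6 824608 16727 22463 11922 5014712 39218 0 71680 15880 0 0 0 0
  11       0 sr0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
   8       0 sda 422 0 18345 792 102148 258299 5153480 184797 0 153470 107370 0 0 0 0
"""
BY_ID = """\
total 0
lrwxrwxrwx 1 root root   9 Feb 27 11:15 ata-QEMU_HARDDISK_QM00003 -> ../../sda
lrwxrwxrwx 1 root root   9 Feb 27 11:15 scsi-0QEMU_QEMU_CD-ROM_drive-scsi0-0-0-0 -> ../../sr0
"""

print(devices_without_byid(DISKSTATS, BY_ID))
```

With these inputs the vda devices are the ones reported, which matches the ‘has no by-id name’ message in the log above.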

Hope that helps and do let us know how you get on. And thanks for the AArch64 testing. Some more details of your VM setup would be good here to enable/inform others here of how folks can test this rather new platform for us.

Phillip,

Thank you for the detailed response. I can confirm that after adding a serial number to the KVM virt disk, the issue is resolved. I’m waiting on 2 x 4TB SATA drives, which I’ve now ordered to have something with greater performance and capacity. I’ll write something more detailed about my experience and share it here on the forum, hopefully by the end of the week.

Well, it’s super high level but I threw this together at the weekend. Just a quick run down on building a NAS with Rockstor and a MACCHIATObin Arm board: https://www.gaborsamu.com/blog/aarch64_nas/

@gsamu Brilliant write-up! Interesting stuff there, I always enjoy an article like this where I am clicking in to links or scooting off into Google investigating the stuff being written about.
Nice one :slight_smile:

@gsamu That’s a really nice read. Well done.

Also, I like your not-a-case. Very swish. :slight_smile:

Philip thanks. Yeah cases just get in the way :slight_smile: