Disk unusable as pool member

Hello all, and sorry for my poor English…
We installed Rockstor on one of our servers (an HP ProLiant DL585 G7) that we would like to use as a NAS. The installation went smoothly, and the system even recognized the network card, which often causes problems with some versions of Linux. But when we try to add disks alongside the one we used for the installation (the server has 8 slots, controlled by an HP Smart Array P410i controller), every disk we tried except the first one gave us this error: "Warning! Disk unusable as pool member - serial number is not legitimate or unique."
We tried to enable the S.M.A.R.T. service in Rockstor, but it failed with errors.

This is the systemctl status output:

  Loaded: loaded (/usr/lib/systemd/system/smartd.service; enabled; vendor preset: enabled)                                                        
 Active: failed (Result: exit-code) since Wed 2022-05-25 16:28:14 CEST; 1min 22s ago
   Docs: man:smartd(8)                                                                                                                           
         man:smartd.conf(5)                                                                                                                      
Process: 8462 ExecStart=/usr/sbin/smartd -n $smartd_opts (code=exited, status=17)

Main PID: 8462 (code=exited, status=17)
Status: “No devices to monitor”

mag 25 16:28:14 rockstorage systemd[1]: Starting Self Monitoring and Reporting Technology (SMART) Daemon…
mag 25 16:28:14 rockstorage smartd[8462]: smartd 7.2 2021-09-14 r5237 [x86_64-linux-5.3.18-59.37-default] (SUSE RPM)
mag 25 16:28:14 rockstorage smartd[8462]: Copyright © 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
mag 25 16:28:14 rockstorage smartd[8462]: Opened configuration file /etc/smartd.conf
mag 25 16:28:14 rockstorage smartd[8462]: Configuration file /etc/smartd.conf parsed but has no entries
mag 25 16:28:14 rockstorage smartd[8462]: Unable to monitor any SMART enabled devices. Try debug (-d) option. Exiting…
mag 25 16:28:14 rockstorage systemd[1]: smartd.service: Main process exited, code=exited, status=17/n/a
mag 25 16:28:14 rockstorage systemd[1]: smartd.service: Failed with result ‘exit-code’.
mag 25 16:28:14 rockstorage systemd[1]: Failed to start Self Monitoring and Reporting Technology (SMART) Daemon.

We have tried to mount the disks in every possible way (single, spare, all disks in a single one…) without result.
Is there something else we can try?

Thank you in advance
Bob

@ITstaff, welcome to the Rockstor community. While I am not the expert, if I remember correctly there are challenges with the SmartArray controllers, as they don’t really support a native RAID pass-through like JBOD.

While this post is based on the previous Linux CentOS distro, I think it might still hold true for the new OpenSUSE based version of Rockstor:

The only workaround I have found was to configure the SmartArray so that each drive is its own single-drive RAID0 setup. Further, I did read that, unless a 256MB Battery Backed Write Cache (BBWC) memory module is installed, you can only create two RAIDs using the controller.

Not sure that this is still the only way (as those posts were a few years old), but possibly @phillxnet can shed more light on this. With the nature of btrfs underlying Rockstor, the intent is to have it manage the disks and not the hardware RAID controller.

Thank you for your prompt response. It seems to me that we had already tried to configure each disk in RAID 0, but I am not sure (we tried lots of things :smile:), so let’s do some more testing. I would really like to use this NAS, which I found very intuitive and easy to configure. Thanks again for your help.

@ITstaff Hello there.
Re:

As @Hooverdan indicated, this controller has some quirks: under one kernel driver it presents drive names that are just not recognised by Rockstor. The other kernel driver presents the drives as regularly named scsi devices. This is what you need. Rockstor uses by-id names.
What is the output of, for example (run as the root user set up during install):

ls -la /dev/disk/by-id/

Also, to see the drivers that are being loaded, could you paste the output of:

lsmod

It may be that the older driver is still being used, and we may have to try and persuade our openSUSE Leap OS kernel to do the right thing with this controller, if that’s possible. As @Hooverdan stated, the hardware has to be set up (via its bios/firmware setup) to present the drives individually, and we need the correct (later) driver to be used so that the drive names presented to the rest of the operating system (and thus to Rockstor) are the regular scsi ones.

Take a look also at:

So partially from memory, on the raid card driver front, we have:
cciss = non scsi compatible, with weird device names that Rockstor knows nothing about.
hpsa = newer scsi driver that presents drives in a way Rockstor does know about and can use.

But the problem may be, especially given the serial number error, that the cciss is the default driver for older P410i cards. But one can coax the newer compatible driver into driving the older raid controllers via an “hpsa_allow_any=1” module option.

The following thread, although pertaining to our now legacy CentOS based v3 may also help with some background on getting the cciss driver out of the way and the hpsa in by default via grub kernel command line options:

Again, this may not be the problem at all, but one first has to have the raid controller itself set up to present the drives individually. Then the OS should see them. Then they have to be of the newer scsi type.

Apologies for a load of info at once here, but I’m just trying to present the potential hiccups with some of these older raid controllers. I really don’t know which driver is now active for this raid controller, hence the lsmod request. If it’s the older cciss that is in charge then there is no compatibility there. But if it’s the newer hpsa then you should be OK.
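
If scanning the full lsmod listing by eye is a pain, the check can be sketched in a few lines of Python. This is only a rough helper, not anything Rockstor ships: the function name and the sample text are made up for illustration, and it only tells you which of the two modules are loaded, not which one has actually claimed the controller.

```python
def loaded_smart_array_drivers(lsmod_text):
    """Return the set of Smart Array drivers (hpsa, cciss) named in lsmod output.

    Hypothetical helper for illustration; it only looks at the first
    column (the module name) of each line.
    """
    loaded = set()
    for line in lsmod_text.splitlines():
        fields = line.split()
        # The header line starts with "Module", so it never matches.
        if fields and fields[0] in ("hpsa", "cciss"):
            loaded.add(fields[0])
    return loaded


# Illustrative sample only; on a real system feed in the text printed by lsmod.
sample = """Module                  Size  Used by
sd_mod                 61440  4
hpsa                  110592  3
scsi_transport_sas     49152  1 hpsa
"""

print(loaded_smart_array_drivers(sample))
```

On the server itself the input could come from `subprocess.run(["lsmod"], capture_output=True, text=True).stdout` (Python 3.7 or later). An empty result would suggest neither driver is loaded at all.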

The smartd failure, at least initially, is because by default we have an empty smartmontools configuration. We follow our default upstream on this front. See the new doc entry by @Hooverdan on this:
https://rockstor.com/docs/howtos/smart.html#configure-monitoring

But smart monitoring through a hardware raid controller can be tricky. See later on in that same doc section for some more info. Plus you need to ensure you have proper scsi drive names first, not the weird cciss names.

I hope I haven’t confused matters more. You are the first forum poster to be engaging with us since the new v4 “Built on openSUSE” (which is openSUSE Leap 15.3 based) who is using one of these older between-drivers raid controllers. And as stated I’m really not sure which driver, by default, grabs this older raid controller. The last forum link above indicates the potentially required module ‘switches’/options: one to have cciss leave well alone, and the other to allow hpsa to reach back to manage these older cards so they can be presented in scsi standard form.

Unfortunately I don’t have one of these controllers here to test with. But we may be able to sort out what is needed here and then add a doc section on some installer tweak to address this module issue if it in fact is down to that.

Hope that helps.

@ITstaff Hello again.

Just posting the following canonical reference for the hpsa driver:

https://docs.kernel.org/scsi/hpsa.html

HPSA - Hewlett Packard Smart Array driver

This file describes the hpsa SCSI driver for HP Smart Array controllers. The hpsa driver is intended to supplant the cciss driver for newer Smart Array controllers. The hpsa driver is a SCSI driver, while the cciss driver is a “block” driver. Actually cciss is both a block driver (for logical drives) AND a SCSI driver (for tape drives). This “split-brained” design of the cciss driver is a source of excess complexity and eliminating that complexity is one of the reasons for hpsa to exist.

Supported devices

-Smart Array P212
-Smart Array P410
-Smart Array P410i
-Smart Array P411
-Smart Array P812
-Smart Array P712m
-Smart Array P711m
-StorageWorks P1210m

Additionally, older Smart Arrays may work with the hpsa driver if the kernel boot parameter “hpsa_allow_any=1” is specified, however these are not tested nor supported by HP with this driver. For older Smart Arrays, the cciss driver should still be used.

So given it lists the P410i as ‘native’ you may already be using this driver. This is good. And note the reference to the hpsa_allow_any=1 is for ‘older’ cards so again hopefully you are already using this via the openSUSE Leap 15.3 defaults we use.

So it looks like my comments on the module options required may relate only to the older hardware, e.g. the P400 Smart Array.

Again, apologies for the potential confusion; as mentioned, this is the first encounter for our v4 with Smart Arrays. We just need to tend to the hardware setup of the raid controller itself via its setup program, I suspect.

Hope that helps.

Hello @phillxnet and thanks for your time…

This is the result of “ls -la /dev/disk/by-id/”:
“drwxr-xr-x 2 root root 700 mag 30 2022 .
drwxr-xr-x 8 root root 160 mag 30 2022 ..
lrwxrwxrwx 1 root root 9 mag 30 2022 ata-DV-28S-W_10100628181955 -> ../../sr0
lrwxrwxrwx 1 root root 9 mag 30 2022 scsi-0HP_LOGICAL_VOLUME_00000000 -> ../../sdb
lrwxrwxrwx 1 root root 10 mag 30 2022 scsi-0HP_LOGICAL_VOLUME_00000000-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 mag 30 2022 scsi-0HP_LOGICAL_VOLUME_00000000-part2 -> ../../sdb2
lrwxrwxrwx 1 root root 10 mag 30 2022 scsi-0HP_LOGICAL_VOLUME_00000000-part3 -> ../../sdb3
lrwxrwxrwx 1 root root 10 mag 30 2022 scsi-0HP_LOGICAL_VOLUME_00000000-part4 -> ../../sdb4
lrwxrwxrwx 1 root root 9 mag 30 2022 scsi-0HP_LOGICAL_VOLUME_01000000 -> ../../sda
lrwxrwxrwx 1 root root 9 mag 30 2022 scsi-0HP_LOGICAL_VOLUME_02000000 -> ../../sdc
lrwxrwxrwx 1 root root 9 mag 30 2022 scsi-0HP_LOGICAL_VOLUME_03000000 -> ../../sdd
lrwxrwxrwx 1 root root 9 mag 30 2022 scsi-0HP_LOGICAL_VOLUME_04000000 -> ../../sde
lrwxrwxrwx 1 root root 9 mag 30 2022 scsi-3600508b1001c407eb88387d13ca29d2f -> ../../sde
lrwxrwxrwx 1 root root 9 mag 30 2022 scsi-3600508b1001c6b17ed447eece1c911e8 -> ../../sdb
lrwxrwxrwx 1 root root 10 mag 30 2022 scsi-3600508b1001c6b17ed447eece1c911e8-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 mag 30 2022 scsi-3600508b1001c6b17ed447eece1c911e8-part2 -> ../../sdb2
lrwxrwxrwx 1 root root 10 mag 30 2022 scsi-3600508b1001c6b17ed447eece1c911e8-part3 -> ../../sdb3
lrwxrwxrwx 1 root root 10 mag 30 2022 scsi-3600508b1001c6b17ed447eece1c911e8-part4 -> ../../sdb4
lrwxrwxrwx 1 root root 9 mag 30 2022 scsi-3600508b1001ca686d9b47b52e6f088e8 -> ../../sda
lrwxrwxrwx 1 root root 9 mag 30 2022 scsi-3600508b1001ca890fe95e5d67356b192 -> ../../sdc
lrwxrwxrwx 1 root root 9 mag 30 2022 scsi-3600508b1001cbe43d8d4d871ae513cd7 -> ../../sdd
lrwxrwxrwx 1 root root 9 mag 30 2022 scsi-SHP_LOGICAL_VOLUME_5001438010043670 -> ../../sda
lrwxrwxrwx 1 root root 10 mag 30 2022 scsi-SHP_LOGICAL_VOLUME_5001438010043670-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 mag 30 2022 scsi-SHP_LOGICAL_VOLUME_5001438010043670-part2 -> ../../sdb2
lrwxrwxrwx 1 root root 10 mag 30 2022 scsi-SHP_LOGICAL_VOLUME_5001438010043670-part3 -> ../../sdb3
lrwxrwxrwx 1 root root 10 mag 30 2022 scsi-SHP_LOGICAL_VOLUME_5001438010043670-part4 -> ../../sdb4
lrwxrwxrwx 1 root root 9 mag 30 2022 wwn-0x600508b1001c407eb88387d13ca29d2f -> ../../sde
lrwxrwxrwx 1 root root 9 mag 30 2022 wwn-0x600508b1001c6b17ed447eece1c911e8 -> ../../sdb
lrwxrwxrwx 1 root root 10 mag 30 2022 wwn-0x600508b1001c6b17ed447eece1c911e8-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 mag 30 2022 wwn-0x600508b1001c6b17ed447eece1c911e8-part2 -> ../../sdb2
lrwxrwxrwx 1 root root 10 mag 30 2022 wwn-0x600508b1001c6b17ed447eece1c911e8-part3 -> ../../sdb3
lrwxrwxrwx 1 root root 10 mag 30 2022 wwn-0x600508b1001c6b17ed447eece1c911e8-part4 -> ../../sdb4
lrwxrwxrwx 1 root root 9 mag 30 2022 wwn-0x600508b1001ca686d9b47b52e6f088e8 -> ../../sda
lrwxrwxrwx 1 root root 9 mag 30 2022 wwn-0x600508b1001ca890fe95e5d67356b192 -> ../../sdc
lrwxrwxrwx 1 root root 9 mag 30 2022 wwn-0x600508b1001cbe43d8d4d871ae513cd7 -> ../../sdd”

and this is from lsmod:
"Module Size Used by
af_packet 53248 4
rfkill 32768 1
radeon 1617920 1
i2c_algo_bit 16384 1 radeon
amd64_edac_mod 40960 0
edac_mce_amd 32768 1 amd64_edac_mod
nls_iso8859_1 16384 1
ttm 118784 1 radeon
nls_cp437 20480 1
kvm_amd 110592 0
vfat 20480 1
fat 90112 1 vfat
drm_kms_helper 262144 1 radeon
ccp 102400 1 kvm_amd
kvm 786432 1 kvm_amd
cec 65536 1 drm_kms_helper
irqbypass 20480 1 kvm
rc_core 57344 1 cec
syscopyarea 16384 1 drm_kms_helper
ipmi_ssif 40960 0
sysfillrect 16384 1 drm_kms_helper
sysimgblt 16384 1 drm_kms_helper
pcspkr 16384 0
fb_sys_fops 16384 1 drm_kms_helper
k10temp 16384 0
i2c_piix4 28672 0
joydev 28672 0
netxen_nic 122880 0
hpilo 24576 0
hpwdt 20480 0
ipmi_si 69632 0
ipmi_devintf 20480 0
ipmi_msghandler 118784 3 ipmi_devintf,ipmi_si,ipmi_ssif
button 24576 0
drm 618496 4 drm_kms_helper,radeon,ttm
fuse 139264 1
configfs 57344 1
btrfs 1515520 1
libcrc32c 16384 1 btrfs
xor 24576 1 btrfs
raid6_pq 122880 1 btrfs
hid_generic 16384 0
sr_mod 28672 0
usbhid 61440 0
cdrom 73728 1 sr_mod
sd_mod 61440 4
ohci_pci 20480 0
t10_pi 16384 1 sd_mod
ohci_hcd 61440 1 ohci_pci
uhci_hcd 53248 0
ahci 40960 0
ehci_pci 20480 0
ata_generic 16384 0
ehci_hcd 98304 1 ehci_pci
libahci 40960 1 ahci
pata_atiixp 16384 0
usbcore 311296 6 ohci_hcd,ehci_pci,usbhid,ehci_hcd,ohci_pci,uhci_hcd
libata 286720 4 libahci,ahci,ata_generic,pata_atiixp
hpsa 110592 3
serio_raw 20480 0
scsi_transport_sas 49152 1 hpsa
sg 45056 0
dm_multipath 45056 0
dm_mod 155648 1 dm_multipath
scsi_dh_rdac 16384 0
scsi_dh_emc 16384 0
scsi_dh_alua 20480 0
scsi_mod 262144 10 scsi_dh_emc,scsi_transport_sas,sd_mod,dm_multipath,scsi_dh_alua,hpsa,libata,sg,scsi_dh_rdac,sr_mod "

Looks like the problem is related to the hard disks’ serial numbers.

This is the result of “lshw -class disk”:
" *-disk:0
description: SCSI Disk
product: LOGICAL VOLUME
vendor: HP
physical id: 1.0.0
bus info: scsi@0:1.0.0
logical name: /dev/sdb
version: 3.52
serial: 5001438010043670
size: 465GiB (500GB)
capabilities: 15000rpm gpt-1.00 partitioned partitioned:gpt
configuration: ansiversion=5 guid=8153e902-98f9-48d7-a3ac-4fe49680f98e logicalsectorsize=512 sectorsize=512
*-disk:1
description: SCSI Disk
product: LOGICAL VOLUME
vendor: HP
physical id: 1.0.1
bus info: scsi@0:1.0.1
logical name: /dev/sda
version: 3.52
serial: 5001438010043670
size: 465GiB (500GB)
capabilities: 15000rpm
configuration: ansiversion=5 logicalsectorsize=512 sectorsize=512
*-disk:2
description: SCSI Disk
product: LOGICAL VOLUME
vendor: HP
physical id: 1.0.2
bus info: scsi@0:1.0.2
logical name: /dev/sdc
version: 3.52
serial: 5001438010043670
size: 136GiB (146GB)
capacity: 355GiB (381GB)
capabilities: 15000rpm
configuration: ansiversion=5 logicalsectorsize=512 sectorsize=512
*-disk:3
description: SCSI Disk
product: LOGICAL VOLUME
vendor: HP
physical id: 1.0.3
bus info: scsi@0:1.0.3
logical name: /dev/sdd
version: 3.52
serial: 5001438010043670
size: 136GiB (146GB)
capacity: 355GiB (381GB)
capabilities: 15000rpm
configuration: ansiversion=5 logicalsectorsize=512 sectorsize=512
*-disk:4
description: SCSI Disk
product: LOGICAL VOLUME
vendor: HP
physical id: 1.0.4
bus info: scsi@0:1.0.4
logical name: /dev/sde
version: 3.52
serial: 5001438010043670
size: 136GiB (146GB)
capabilities: 15000rpm
configuration: ansiversion=5 logicalsectorsize=512 sectorsize=512
*-cdrom
description: DVD reader
product: DV-28S-W
vendor: TEAC
physical id: 0.0.0
bus info: scsi@3:0.0.0
logical name: /dev/cdrom
logical name: /dev/dvd
logical name: /dev/sr0
version: C.2D
capabilities: removable audio dvd
configuration: ansiversion=5 status=nodisc"

All disks (in raid 0) have the same serial…
Unfortunately, the smart array cannot be bypassed on this server model.

Thanks again for your help. Now I’ll look at the other part of your reply.

Bob

Hello again @phillxnet,
unfortunately tomorrow I have some really delicate work to do, but in the next few days I’ll try to:

  • Upgrade firmware for Smart Array
  • Use HPSA driver

I will inform you as soon as possible and thanks again for your time.

Bob

Hello,
small update… I have upgraded firmware for:

  • Bios
  • ILO 3
  • P410i Controller (smart array)
  • EH0146FARWD drive (this was suggested directly in the bios at first boot after all the other upgrades)

Unfortunately without luck, the problems are the same…
S.M.A.R.T. is unable to start (I have tried with different parameters) and the disks all have the same serial number…

This is part of Rockstor log (if it can help I can upload the full version):

"
[01/Jun/2022 15:53:05] INFO [storageadmin.views.disk:141] Deleting duplicate or fake (by serial) disk db entry. Serial = (fake-serial-e91e6af7-023f-496f-8344-787001aaa139).
[01/Jun/2022 15:53:05] INFO [storageadmin.views.disk:141] Deleting duplicate or fake (by serial) disk db entry. Serial = (fake-serial-5224c875-af1f-45ab-b582-8117ff90f0ce).
[01/Jun/2022 15:53:05] INFO [storageadmin.views.disk:141] Deleting duplicate or fake (by serial) disk db entry. Serial = (fake-serial-47a4c9d8-a111-403b-b1b0-94cdcf91c2b8).
[01/Jun/2022 15:53:05] INFO [storageadmin.views.disk:141] Deleting duplicate or fake (by serial) disk db entry. Serial = (fake-serial-db4b2484-6172-4c25-acba-c7d226972188).
[01/Jun/2022 15:53:06] ERROR [storageadmin.util:45] Exception: duplicate key value violates unique constraint “storageadmin_disk_name_key”
DETAIL: Key (name)=(scsi-SHP_LOGICAL_VOLUME_5001438010043670) already exists.
Traceback (most recent call last):
File “/opt/rockstor/src/rockstor/rest_framework_custom/generic_view.py”, line 41, in _handle_exception
yield
File “/opt/rockstor/src/rockstor/storageadmin/views/disk.py”, line 504, in post
return self._update_disk_state()
File “/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/utils/decorators.py”, line 145, in inner
return func(*args, **kwargs)
File “/opt/rockstor/src/rockstor/storageadmin/views/disk.py”, line 426, in _update_disk_state
dob.save()
File “/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/base.py”, line 734, in save
force_update=force_update, update_fields=update_fields)
File “/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/base.py”, line 762, in save_base
updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)
File “/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/base.py”, line 846, in _save_table
result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
File “/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/base.py”, line 885, in _do_insert
using=using, raw=raw)
File “/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/manager.py”, line 127, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File “/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/query.py”, line 920, in _insert
return query.get_compiler(using=using).execute_sql(return_id)
File “/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/sql/compiler.py”, line 974, in execute_sql
cursor.execute(sql, params)
File “/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/backends/utils.py”, line 64, in execute
return self.cursor.execute(sql, params)
File “/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/utils.py”, line 98, in __exit__
six.reraise(dj_exc_type, dj_exc_value, traceback)
File “/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/backends/utils.py”, line 64, in execute
return self.cursor.execute(sql, params)
IntegrityError: duplicate key value violates unique constraint “storageadmin_disk_name_key”
DETAIL: Key (name)=(scsi-SHP_LOGICAL_VOLUME_5001438010043670) already exists.
"

How can I check whether Rockstor is using the HPSA driver? I’m not sure about it…
Thanks in advance.

Bob

I might be wrong, but you could probably use lspci -v (or lspci -k) to list the kernel drivers in use. This utility is not part of the base Rockstor install, but you could add it via:
zypper in pciutils
it uses around 200KB so not very big.

@ITstaff Hello again.
Re:

Agreed, nice find. That’s a pain.

Cheers. This is a tricky one, as real independent disks always have unique serial numbers and we have this as a hard dependency.

And that log snippet is consistent with Rockstor fighting with independent drives having the same serial. We basically have to ‘cover’ each duplicate found with a fake-uuid serial that is temporary and changed on each refresh. See the following doc for some background:
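
For a rough picture of that ‘cover’ step, separate from the doc itself, here is a minimal sketch. It is not Rockstor’s actual code: the function name is made up, and it assumes that the first disk seen keeps its reported serial while empty or repeated serials get a throwaway placeholder in the "fake-serial-<uuid>" style seen in the log.

```python
import uuid


def cover_duplicate_serials(disks):
    """Replace empty or repeated serials with temporary placeholders.

    disks: list of (device_name, serial) tuples, in scan order.
    Hypothetical illustration only; the real logic lives in Rockstor itself.
    """
    seen = set()
    covered = []
    for name, serial in disks:
        if not serial or serial in seen:
            # A new random uuid each call, so the placeholder is not stable.
            serial = "fake-serial-" + str(uuid.uuid4())
        else:
            seen.add(serial)
        covered.append((name, serial))
    return covered


scan = [
    ("sda", "5001438010043670"),
    ("sdb", "5001438010043670"),
    ("sdc", "5001438010043670"),
]
for name, serial in cover_duplicate_serials(scan):
    print(name, serial)
```

Because the placeholder changes on every scan, a disk carrying one cannot be tracked reliably, which is consistent with such disks being refused as pool members.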

The fact that there are sda sdb type names here for the drives concerned pretty much guarantees that hpsa is already being used. Plus we have:

OK, we may be stuffed here with the current raid controller function, as is. But take a look at exactly how we extract the serial using udev here:

Basically it calls

/usr/bin/udevadm info --name=device_name_here

Where the device name is the canonical one such as sdX. It parses the output to gather a serial that looks most like what we used to take exclusively from lsblk; lsblk started to fail a lot in reporting serials, so we moved to falling through to udevadm. See also:

And you could start by simply changing the “False” to “True” (and then restarting Rockstor) as from memory I think we only fall through to udev if lsblk can give us no serials at all.

Let’s see the udevadm output for a couple of different disks on this controller that otherwise show up with the same serial. It may be that a tweak there could make this controller work in its current configuration. I’m assuming there is no other way to configure it to pass drives through.
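
Once that output is to hand for two or more disks, a comparison along the following lines could show which udev properties actually differ between them. This is only a sketch under assumptions: the helper names and the list of candidate keys are my own choice, the sample values are illustrative, and whether Rockstor’s parser would ever pick one of these keys is not confirmed here.

```python
def parse_udev_properties(text):
    """Collect the 'E: KEY=value' lines of udevadm info output into a dict."""
    props = {}
    for line in text.splitlines():
        if line.startswith("E: "):
            key, _, value = line[3:].partition("=")
            props[key] = value
    return props


# Assumed candidates for a serial-like value; not Rockstor's own list.
CANDIDATE_KEYS = ("SCSI_IDENT_SERIAL", "SCSI_IDENT_LUN_VENDOR", "ID_SERIAL_SHORT")


def unique_candidates(per_device):
    """Return candidate keys that are set and distinct on every device."""
    usable = []
    for key in CANDIDATE_KEYS:
        values = [props.get(key) for props in per_device.values()]
        if all(values) and len(set(values)) == len(values):
            usable.append(key)
    return usable


# Illustrative excerpts; paste the real udevadm output here instead.
sda_text = """E: DEVNAME=/dev/sda
E: SCSI_IDENT_SERIAL=5001438010043670
E: SCSI_IDENT_LUN_VENDOR=00000000
E: ID_SERIAL_SHORT=600508b1001c6b17ed447eece1c911e8
"""
sdb_text = """E: DEVNAME=/dev/sdb
E: SCSI_IDENT_SERIAL=5001438010043670
E: SCSI_IDENT_LUN_VENDOR=01000000
E: ID_SERIAL_SHORT=600508b1001ca686d9b47b52e6f088e8
"""

per_device = {
    "sda": parse_udev_properties(sda_text),
    "sdb": parse_udev_properties(sdb_text),
}
print(unique_candidates(per_device))
```

Any key that comes back from `unique_candidates` is at least a plausible stand-in for a per-disk serial on this controller; a key shared by all disks is not.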

The hope I see here is that udev has created by-id names of:

lrwxrwxrwx 1 root root 9 mag 30 2022 scsi-0HP_LOGICAL_VOLUME_00000000 -> ../../sdb
...
lrwxrwxrwx 1 root root 9 mag 30 2022 scsi-0HP_LOGICAL_VOLUME_01000000 -> ../../sda
lrwxrwxrwx 1 root root 9 mag 30 2022 scsi-0HP_LOGICAL_VOLUME_02000000 -> ../../sdc
lrwxrwxrwx 1 root root 9 mag 30 2022 scsi-0HP_LOGICAL_VOLUME_03000000 -> ../../sdd
lrwxrwxrwx 1 root root 9 mag 30 2022 scsi-0HP_LOGICAL_VOLUME_04000000 -> ../../sde

These look like artificially assigned serials of 01000000, 02000000, 03000000, 04000000. If we can see these in the udev output then we may well be able to use them as the serials instead. Long shot, but it may work out. However, before changing anything in the serial finder/parser, try that False to True thing first, as it may just be that we are being initially misled by lsblk’s serial reporting. For more info on this could you post the output of:

lsblk -P -p -o NAME,MODEL,SERIAL,SIZE,TRAN,VENDOR,HCTL,TYPE,FSTYPE,LABEL,UUID

If it’s dishing out those non-unique serials of 5001438010043670 then the False -> True change, to force udevadm serial retrieval on our part, may get us a little further along the way.
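
To check that lsblk output for repeated serials without reading every line, something like the following could be used. It is a minimal sketch, not Rockstor’s own parser: the function names are made up, the sample input is illustrative and trimmed to three columns, and it assumes the KEY="value" pair format that lsblk -P prints.

```python
import shlex
from collections import defaultdict


def parse_lsblk_pairs(text):
    """Parse lsblk -P output (KEY="value" pairs, one device per line)."""
    devices = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = {}
        # shlex strips the double quotes and keeps spaces inside values.
        for token in shlex.split(line):
            key, _, value = token.partition("=")
            fields[key] = value
        devices.append(fields)
    return devices


def duplicate_serials(devices):
    """Map each serial reported by more than one whole disk to its device names."""
    by_serial = defaultdict(list)
    for dev in devices:
        if dev.get("TYPE") == "disk" and dev.get("SERIAL"):
            by_serial[dev["SERIAL"]].append(dev["NAME"])
    return {serial: names for serial, names in by_serial.items() if len(names) > 1}


# Illustrative input; on the server, use the real lsblk -P output instead.
sample = '''NAME="/dev/sda" SERIAL="5001438010043670" TYPE="disk"
NAME="/dev/sdb" SERIAL="5001438010043670" TYPE="disk"
NAME="/dev/sr0" SERIAL="10100628181955" TYPE="rom"
'''

print(duplicate_serials(parse_lsblk_pairs(sample)))
```

A non-empty result means lsblk itself is reporting the shared serial, so the problem sits below Rockstor rather than in how Rockstor reads it.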

Hope that helps, and thanks for working through this with us.

Nice.

Hello again @phillxnet,
I have changed the parameter’s value as suggested, I think…

I have modified the parameter in the file called “osi.py”, which is under “/opt/rockstor/src/rockstor/system”
Now
Is this the right procedure? Sorry, but I’m not really good at this…

Anyway this is the result of
“lsblk -P -p -o NAME,MODEL,SERIAL,SIZE,TRAN,VENDOR,HCTL,TYPE,FSTYPE,LABEL,UUID” after reboot:
"
NAME="/dev/sda" MODEL="LOGICAL_VOLUME" SERIAL="5001438010043670" SIZE="465,7G" TRAN="sas" VENDOR="HP " HCTL="0:1:0:0" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="/dev/sda4" MODEL="" SERIAL="" SIZE="463,7G" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="btrfs" LABEL="ROOT" UUID="4ac51b0f-afeb-4946-aad1-975a2a26c941"
NAME="/dev/sda2" MODEL="" SERIAL="" SIZE="64M" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="vfat" LABEL="EFI" UUID="A48E-EDBF"
NAME="/dev/sda3" MODEL="" SERIAL="" SIZE="2G" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="swap" LABEL="SWAP" UUID="42d2c6da-102d-49f2-948e-817a10e61370"
NAME="/dev/sda1" MODEL="" SERIAL="" SIZE="2M" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="" LABEL="" UUID=""
NAME="/dev/sdb" MODEL="LOGICAL_VOLUME" SERIAL="5001438010043670" SIZE="465,7G" TRAN="sas" VENDOR="HP " HCTL="0:1:0:1" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="/dev/sdc" MODEL="LOGICAL_VOLUME" SERIAL="5001438010043670" SIZE="136,7G" TRAN="sas" VENDOR="HP " HCTL="0:1:0:2" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="/dev/sdd" MODEL="LOGICAL_VOLUME" SERIAL="5001438010043670" SIZE="136,7G" TRAN="sas" VENDOR="HP " HCTL="0:1:0:3" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="/dev/sde" MODEL="LOGICAL_VOLUME" SERIAL="5001438010043670" SIZE="136,7G" TRAN="sas" VENDOR="HP " HCTL="0:1:0:4" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="/dev/sr0" MODEL="DV-28S-W" SERIAL="10100628181955" SIZE="1024M" TRAN="sata" VENDOR="TEAC " HCTL="3:0:0:0" TYPE="rom" FSTYPE="" LABEL="" UUID=""
"

This is the udevadm info for 2 disks (/dev/sdb and /dev/sda):
/usr/bin/udevadm info --name=/dev/sdb
P: /devices/pci0000:00/0000:00:04.0/0000:03:00.0/host0/target0:1:0/0:1:0:1/block/sdb
N: sdb
L: 0
S: disk/by-path/pci-0000:03:00.0-scsi-0:1:0:1
S: disk/by-id/scsi-3600508b1001ca686d9b47b52e6f088e8
S: disk/by-id/scsi-0HP_LOGICAL_VOLUME_01000000
S: disk/by-id/scsi-SHP_LOGICAL_VOLUME_5001438010043670
S: disk/by-id/wwn-0x600508b1001ca686d9b47b52e6f088e8
E: DEVPATH=/devices/pci0000:00/0000:00:04.0/0000:03:00.0/host0/target0:1:0/0:1:0:1/block/sdb
E: DEVNAME=/dev/sdb
E: DEVTYPE=disk
E: MAJOR=8
E: MINOR=16
E: SUBSYSTEM=block
E: USEC_INITIALIZED=12195066
E: DONT_DEL_PART_NODES=1
E: ID_PATH=pci-0000:03:00.0-scsi-0:1:0:1
E: ID_PATH_TAG=pci-0000_03_00_0-scsi-0_1_0_1
E: SCSI_TPGS=0
E: SCSI_TYPE=disk
E: SCSI_VENDOR=HP
E: SCSI_VENDOR_ENC=HP\x20\x20\x20\x20\x20\x20
E: SCSI_MODEL=LOGICAL_VOLUME
E: SCSI_MODEL_ENC=LOGICAL\x20VOLUME\x20\x20
E: SCSI_REVISION=6.64
E: ID_SCSI=1
E: ID_SCSI_INQUIRY=1
E: SCSI_IDENT_SERIAL=5001438010043670
E: SCSI_IDENT_LUN_NAA_REGEXT=600508b1001ca686d9b47b52e6f088e8
E: SCSI_IDENT_LUN_VENDOR=01000000
E: ID_VENDOR=HP
E: ID_VENDOR_ENC=HP\x20\x20\x20\x20\x20\x20
E: ID_MODEL=LOGICAL_VOLUME
E: ID_MODEL_ENC=LOGICAL\x20VOLUME\x20\x20
E: ID_REVISION=6.64
E: ID_TYPE=disk
E: ID_WWN_WITH_EXTENSION=0x600508b1001ca686d9b47b52e6f088e8
E: ID_WWN=0x600508b1001ca686
E: ID_BUS=scsi
E: ID_SERIAL=3600508b1001ca686d9b47b52e6f088e8
E: ID_SERIAL_SHORT=600508b1001ca686d9b47b52e6f088e8
E: MPATH_SBIN_PATH=/sbin
E: DM_MULTIPATH_DEVICE_PATH=0
E: FC_TARGET_LUN=1
E: COMPAT_SYMLINK_GENERATION=2
E: DEVLINKS=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:1:0:1 /dev/disk/by-id/scsi-3600508b1001ca686d9b47b52e6f088e8 /dev/disk/by-id/scsi-0HP_LOGICAL_VOLUME_01000000 /dev/disk/by-id/scsi-SHP_LOGICAL_VOLUME_5001438010043670 /dev/disk/by-id/wwn-0x600508b1001ca686d9b47b52e6f088e8
E: TAGS=:systemd:

"
P: /devices/pci0000:00/0000:00:04.0/0000:03:00.0/host0/target0:1:0/0:1:0:0/block/sda
N: sda
L: 0
S: disk/by-id/scsi-0HP_LOGICAL_VOLUME_00000000
S: disk/by-id/scsi-SHP_LOGICAL_VOLUME_5001438010043670
S: disk/by-id/wwn-0x600508b1001c6b17ed447eece1c911e8
S: disk/by-path/pci-0000:03:00.0-scsi-0:1:0:0
S: disk/by-id/scsi-3600508b1001c6b17ed447eece1c911e8
E: DEVPATH=/devices/pci0000:00/0000:00:04.0/0000:03:00.0/host0/target0:1:0/0:1:0:0/block/sda
E: DEVNAME=/dev/sda
E: DEVTYPE=disk
E: MAJOR=8
E: MINOR=0
E: SUBSYSTEM=block
E: USEC_INITIALIZED=11664722
E: DONT_DEL_PART_NODES=1
E: ID_PATH=pci-0000:03:00.0-scsi-0:1:0:0
E: ID_PATH_TAG=pci-0000_03_00_0-scsi-0_1_0_0
E: SCSI_TPGS=0
E: SCSI_TYPE=disk
E: SCSI_VENDOR=HP
E: SCSI_VENDOR_ENC=HP\x20\x20\x20\x20\x20\x20
E: SCSI_MODEL=LOGICAL_VOLUME
E: SCSI_MODEL_ENC=LOGICAL\x20VOLUME\x20\x20
E: SCSI_REVISION=6.64
E: ID_SCSI=1
E: ID_SCSI_INQUIRY=1
E: SCSI_IDENT_SERIAL=5001438010043670
E: SCSI_IDENT_LUN_NAA_REGEXT=600508b1001c6b17ed447eece1c911e8
E: SCSI_IDENT_LUN_VENDOR=00000000
E: ID_VENDOR=HP
E: ID_VENDOR_ENC=HP\x20\x20\x20\x20\x20\x20
E: ID_MODEL=LOGICAL_VOLUME
E: ID_MODEL_ENC=LOGICAL\x20VOLUME\x20\x20
E: ID_REVISION=6.64
E: ID_TYPE=disk
E: ID_WWN_WITH_EXTENSION=0x600508b1001c6b17ed447eece1c911e8
E: ID_WWN=0x600508b1001c6b17
E: ID_BUS=scsi
E: ID_SERIAL=3600508b1001c6b17ed447eece1c911e8
E: ID_SERIAL_SHORT=600508b1001c6b17ed447eece1c911e8
E: MPATH_SBIN_PATH=/sbin
E: DM_MULTIPATH_DEVICE_PATH=0
E: FC_TARGET_LUN=0
E: ID_PART_TABLE_UUID=8153e902-98f9-48d7-a3ac-4fe49680f98e
E: ID_PART_TABLE_TYPE=gpt
E: COMPAT_SYMLINK_GENERATION=2
E: DEVLINKS=/dev/disk/by-id/scsi-0HP_LOGICAL_VOLUME_00000000 /dev/disk/by-id/scsi-SHP_LOGICAL_VOLUME_5001438010043670 /dev/disk/by-id/wwn-0x600508b1001c6b17ed447eece1c911e8 /dev/disk/by-path/pci-0000:03:00.0-scsi-0:1:0:0 /dev/disk/by-id/scsi-3600508b1001c6b17ed447eece1c911e8
E: TAGS=:systemd:
"

And this is part of the Rockstor log (I don’t know if it could be useful; it looks like a failed database insert due to a duplicate key):

"

[06/Jun/2022 12:41:22] INFO [storageadmin.views.disk:141] Deleting duplicate or fake (by serial) disk db entry. Serial = (fake-serial-e91e6af7-023f-496f-8344-787001aaa139).
[06/Jun/2022 12:41:22] INFO [storageadmin.views.disk:141] Deleting duplicate or fake (by serial) disk db entry. Serial = (fake-serial-5224c875-af1f-45ab-b582-8117ff90f0ce).
[06/Jun/2022 12:41:22] INFO [storageadmin.views.disk:141] Deleting duplicate or fake (by serial) disk db entry. Serial = (fake-serial-47a4c9d8-a111-403b-b1b0-94cdcf91c2b8).
[06/Jun/2022 12:41:22] INFO [storageadmin.views.disk:141] Deleting duplicate or fake (by serial) disk db entry. Serial = (fake-serial-db4b2484-6172-4c25-acba-c7d226972188).
[06/Jun/2022 12:41:23] ERROR [storageadmin.util:45] Exception: duplicate key value violates unique constraint "storageadmin_disk_name_key"
DETAIL:  Key (name)=(scsi-SHP_LOGICAL_VOLUME_5001438010043670) already exists.
Traceback (most recent call last):
  File "/opt/rockstor/src/rockstor/rest_framework_custom/generic_view.py", line 41, in _handle_exception
    yield
  File "/opt/rockstor/src/rockstor/storageadmin/views/disk.py", line 504, in post
    return self._update_disk_state()
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/utils/decorators.py", line 145, in inner
    return func(*args, **kwargs)
  File "/opt/rockstor/src/rockstor/storageadmin/views/disk.py", line 426, in _update_disk_state
    dob.save()
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/base.py", line 734, in save
    force_update=force_update, update_fields=update_fields)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/base.py", line 762, in save_base
    updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/base.py", line 846, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/base.py", line 885, in _do_insert
    using=using, raw=raw)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/manager.py", line 127, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/query.py", line 920, in _insert
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/sql/compiler.py", line 974, in execute_sql
    cursor.execute(sql, params)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/utils.py", line 98, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
IntegrityError: duplicate key value violates unique constraint "storageadmin_disk_name_key"
DETAIL:  Key (name)=(scsi-SHP_LOGICAL_VOLUME_5001438010043670) already exists.

[06/Jun/2022 12:42:26] INFO [storageadmin.views.disk:141] Deleting duplicate or fake (by serial) disk db entry. Serial = (fake-serial-e91e6af7-023f-496f-8344-787001aaa139).
[06/Jun/2022 12:42:26] INFO [storageadmin.views.disk:141] Deleting duplicate or fake (by serial) disk db entry. Serial = (fake-serial-5224c875-af1f-45ab-b582-8117ff90f0ce).
[06/Jun/2022 12:42:26] INFO [storageadmin.views.disk:141] Deleting duplicate or fake (by serial) disk db entry. Serial = (fake-serial-47a4c9d8-a111-403b-b1b0-94cdcf91c2b8).
[06/Jun/2022 12:42:26] INFO [storageadmin.views.disk:141] Deleting duplicate or fake (by serial) disk db entry. Serial = (fake-serial-db4b2484-6172-4c25-acba-c7d226972188).
[06/Jun/2022 12:42:27] INFO [storageadmin.views.disk:141] Deleting duplicate or fake (by serial) disk db entry. Serial = (fake-serial-e91e6af7-023f-496f-8344-787001aaa139).
[06/Jun/2022 12:42:27] INFO [storageadmin.views.disk:141] Deleting duplicate or fake (by serial) disk db entry. Serial = (fake-serial-5224c875-af1f-45ab-b582-8117ff90f0ce).
[06/Jun/2022 12:42:27] ERROR [storageadmin.util:45] Exception: duplicate key value violates unique constraint "storageadmin_disk_name_key"
DETAIL:  Key (name)=(scsi-SHP_LOGICAL_VOLUME_5001438010043670) already exists.
Traceback (most recent call last):
  File "/opt/rockstor/src/rockstor/rest_framework_custom/generic_view.py", line 41, in _handle_exception
    yield
  File "/opt/rockstor/src/rockstor/storageadmin/views/disk.py", line 504, in post
    return self._update_disk_state()
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/utils/decorators.py", line 145, in inner
    return func(*args, **kwargs)
  File "/opt/rockstor/src/rockstor/storageadmin/views/disk.py", line 426, in _update_disk_state
    dob.save()
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/base.py", line 734, in save
    force_update=force_update, update_fields=update_fields)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/base.py", line 762, in save_base
    updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/base.py", line 846, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/base.py", line 885, in _do_insert
    using=using, raw=raw)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/manager.py", line 127, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/query.py", line 920, in _insert
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/sql/compiler.py", line 974, in execute_sql
    cursor.execute(sql, params)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/utils.py", line 98, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
IntegrityError: duplicate key value violates unique constraint "storageadmin_disk_name_key"
DETAIL:  Key (name)=(scsi-SHP_LOGICAL_VOLUME_5001438010043670) already exists.

[06/Jun/2022 12:42:27] INFO [storageadmin.views.disk:141] Deleting duplicate or fake (by serial) disk db entry. Serial = (fake-serial-47a4c9d8-a111-403b-b1b0-94cdcf91c2b8).
[06/Jun/2022 12:42:27] INFO [storageadmin.views.disk:141] Deleting duplicate or fake (by serial) disk db entry. Serial = (fake-serial-db4b2484-6172-4c25-acba-c7d226972188).
[06/Jun/2022 12:42:28] ERROR [storageadmin.util:45] Exception: duplicate key value violates unique constraint "storageadmin_disk_name_key"
DETAIL:  Key (name)=(scsi-SHP_LOGICAL_VOLUME_5001438010043670) already exists.
Traceback (most recent call last):
  File "/opt/rockstor/src/rockstor/rest_framework_custom/generic_view.py", line 41, in _handle_exception
    yield
  File "/opt/rockstor/src/rockstor/storageadmin/views/disk.py", line 504, in post
    return self._update_disk_state()
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/utils/decorators.py", line 145, in inner
    return func(*args, **kwargs)
  File "/opt/rockstor/src/rockstor/storageadmin/views/disk.py", line 426, in _update_disk_state
    dob.save()
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/base.py", line 734, in save
    force_update=force_update, update_fields=update_fields)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/base.py", line 762, in save_base
    updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/base.py", line 846, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/base.py", line 885, in _do_insert
    using=using, raw=raw)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/manager.py", line 127, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/query.py", line 920, in _insert
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/models/sql/compiler.py", line 974, in execute_sql
    cursor.execute(sql, params)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/utils.py", line 98, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/opt/rockstor/eggs/Django-1.8.16-py2.7.egg/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
IntegrityError: duplicate key value violates unique constraint "storageadmin_disk_name_key"
DETAIL:  Key (name)=(scsi-SHP_LOGICAL_VOLUME_5001438010043670) already exists.
"   

Thanks again.

@ITstaff Hello again and thanks for the additional info:

From:
NAME="/dev/sda" MODEL="LOGICAL_VOLUME" SERIAL="5001438010043670"
NAME="/dev/sdb" MODEL="LOGICAL_VOLUME" SERIAL="5001438010043670"
NAME="/dev/sdc" MODEL="LOGICAL_VOLUME" SERIAL="5001438010043670"
NAME="/dev/sdd" MODEL="LOGICAL_VOLUME" SERIAL="5001438010043670"
etc.
You will definitely need to force the use of udev via the False-to-True change, because if we find a serial via lsblk it takes precedence. Hence the fake-serials.

And from:

Agreed. But in this case it’s due to the drive name selected by Rockstor being the same for every drive, because in a normal setup the drive name is expected to contain the serial as an element. But in this light we have:

And likewise for the other example you provided:

There is hope for this controller arrangement regarding sourcing the serial from udevadm and using the wwn by-id name. But it’s a little bit of a long shot.

If you are game to have a play around, given you likely (hopefully) have no current settings to lose, then it may help with supporting this controller’s effect on by-id names and side-stepping its lsblk serial nonsense.

First make sure you have that change I addressed earlier in place:

changed to True. Then you are going to have to reset the database and lose all Rockstor settings in the process.

# stop all relevant rockstor processes prior to full settings reset:
systemctl stop rockstor rockstor-pre rockstor-bootstrap
# delete the user you created during the initial Web-UI setup ("admin" in this example)
userdel admin
# wipe the 'flag' file that tells Rockstor it has been setup before:
rm /opt/rockstor/.initrock
# restart all rockstor services (or just reboot)
systemctl start rockstor

The Rockstor setup will then wipe the old database and begin afresh. You will have to go through the initial Web-UI setup again.

See what that does now you have changed to forcing udev for serial. It may find its way to a solution, but let us know either way as there are likely a few other bits and bobs to adapting to this unusual presentation of by-id, and we usually avoid the wwn by-id names on purpose here:

and far more importantly here:
get_dev_byid_name() & get_devname()
respectively:


and

There’s a lot of interplay in there and we have full test coverage for all of these critical low level functions and haven’t needed to change them for a number of years now:

https://github.com/rockstor/rockstor-core/blob/master/src/rockstor/system/tests/test_osi.py

So with your provided output I may be able to take a look and reproduce the incompatibility you are seeing with this controller under the appropriate tests by using the provided command outputs. But I’m a little short on time currently, with some background stuff I have to see to. First try forcing udev for serial along with the db wipe mechanism and see what happens. I’m pretty sure I’ve seen Rockstor use wwn by-id names successfully in the past. But that may have been a long time ago.

The main problem I see is how udevadm is bending its normal by-id name generation pattern here and using the odd
/dev/disk/by-id/scsi-0HP_LOGICAL_VOLUME_00000000
and
/dev/disk/by-id/scsi-0HP_LOGICAL_VOLUME_01000000

where the serial is likely faked for a similar reason to why we have to generate our own serials. That serial doesn’t match any other serial we can retrieve. But I’ve also just noticed we could possibly retrieve these odd serials (or scsi luns actually) here; respectively:

E: SCSI_IDENT_LUN_VENDOR=00000000

E: SCSI_IDENT_LUN_VENDOR=01000000

So that’s a little more hope actually.

So provided by-id names from the DEVLINKS udev entry are (for lun 00000000):

/dev/disk/by-id/scsi-0HP_LOGICAL_VOLUME_00000000
/dev/disk/by-id/scsi-SHP_LOGICAL_VOLUME_5001438010043670
/dev/disk/by-id/wwn-0x600508b1001c6b17ed447eece1c911e8
/dev/disk/by-id/scsi-3600508b1001c6b17ed447eece1c911e8

So given we look for the longest, in this corner case a quick hack may be to use the shortest, along with a modified serial lookup that uses “SCSI_IDENT_LUN_VENDOR” to adapt similarly.
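To make the longest-versus-shortest idea concrete, here is a minimal sketch using the four DEVLINKS entries listed above. The function name `pick_byid_name` and its `prefer_shortest` parameter are made up for illustration; this is not the code in get_dev_byid_name(), only the selection rule being discussed.

```python
# Hypothetical sketch: choose a by-id name by length from a DEVLINKS list.
# Not Rockstor's implementation; names below are illustrative only.
DEVLINKS = [
    "/dev/disk/by-id/scsi-0HP_LOGICAL_VOLUME_00000000",
    "/dev/disk/by-id/scsi-SHP_LOGICAL_VOLUME_5001438010043670",
    "/dev/disk/by-id/wwn-0x600508b1001c6b17ed447eece1c911e8",
    "/dev/disk/by-id/scsi-3600508b1001c6b17ed447eece1c911e8",
]


def pick_byid_name(devlinks, prefer_shortest=False):
    """Return the longest (default) or shortest by-id name, or None."""
    prefix = "/dev/disk/by-id/"
    names = [link[len(prefix):] for link in devlinks if link.startswith(prefix)]
    if not names:
        return None
    chooser = min if prefer_shortest else max
    # On a tie, min/max keep the first candidate in list order.
    return chooser(names, key=len)


print(pick_byid_name(DEVLINKS))
print(pick_byid_name(DEVLINKS, prefer_shortest=True))
```

With this data the longest name is the scsi-SHP_... one that clashes across all drives, while the shortest is the per-lun scsi-0HP_... name.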

Crazy, and likely a strange corner case, but let’s see what happens if you initially just make the use-udev-for-serial change. I suspect we will get the same result due to this name clash, where udevadm is reporting
“/dev/disk/by-id/scsi-SHP_LOGICAL_VOLUME_5001438010043670”

for all your drives!! A direct obfuscation: the hardware raid is not presenting the drives to us as they are. Are you sure there is not another passthrough option on this controller? I.e. as per @Hooverdan’s comments earlier.

Let’s see what happens with a db reset, to at least give the existing code a chance to have another go at things. But the longest names are identical, and I suspect that’s not going to work in this unusual instance as presented.

Apologies for the thinking aloud here, but it’s then easier to pick up later or for others to chip in with some work-around. I’d like to see you sorted here and I’m thinking there is a way, but it’s difficult to account for properly without having the command outputs in the test data and then making the modifications. We have made no real modifications in this area for quite some time, but we will again as we plan to support multi-path. But again, all in good time.

Let us know how the initial change you may already have made works out after the db reset. And we can take it from there if you are interested in putting some time in and have a bit of patience, as I’m unfortunately a bit waylaid currently. But I do see more hope here having had a little look. Even if it’s a few simple hacks that you may have to ‘carry’. I.e. shortest by-id ("<" not ">") and looking for “SCSI_IDENT_LUN_VENDOR” as the serial. Super messy really, but so is this card presenting all your disks as having the same scsi serial via “SCSI_IDENT_SERIAL”!

Hope that helps. And maybe others can chip in with an alternative approach to squeeze this name presentation into our regime.

Also just noticed this:
/dev/disk/by-id/scsi-3600508b1001c6b17ed447eece1c911e8
and:
E: ID_SERIAL=3600508b1001c6b17ed447eece1c911e8
Oh well. If only the hardware controller would present the drives as almost all other systems do to date.
Noting for follow-up that we have an existing example of a preferred serial match, if found:


Hello again @phillxnet,
another update…
I will try to describe step by step exactly what I have done.

  • I formatted the server to eliminate any possible problems related to previous tests. There was nothing vital on it, the only things I had done were a join with active directory and setting up an NTP server, but that’s a 5-minute task.

  • I also configured all the slots in the server in raid 0 (we have a total of 8 slots). In the previous installation I configured only 5 or 6 of them.

  • Before starting the configuration procedure from the web interface, I logged in directly as root, navigated to “/opt/rockstor/src/rockstor/system” and set the parameter “always_use_udev_serial = True” in the “osi.py” file.

  • Then I started normal configuration from the web page.

Unfortunately the result is the same: the disks are flagged as unusable. The output of lsblk is the following:

lsblk -P -p -o NAME,MODEL,SERIAL,SIZE,TRAN,VENDOR,HCTL,TYPE,FSTYPE,LABEL,UUID
NAME="/dev/sda" MODEL="LOGICAL_VOLUME" SERIAL="5001438010043670" SIZE="465,7G" TRAN="sas" VENDOR="HP " HCTL="1:1:0:0" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="/dev/sda4" MODEL="" SERIAL="" SIZE="463,7G" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="btrfs" LABEL="ROOT" UUID="4ac51b0f-afeb-4946-aad1-975a2a26c941"
NAME="/dev/sda2" MODEL="" SERIAL="" SIZE="64M" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="vfat" LABEL="EFI" UUID="A48E-EDBF"
NAME="/dev/sda3" MODEL="" SERIAL="" SIZE="2G" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="swap" LABEL="SWAP" UUID="42d2c6da-102d-49f2-948e-817a10e61370"
NAME="/dev/sda1" MODEL="" SERIAL="" SIZE="2M" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="" LABEL="" UUID=""
NAME="/dev/sdb" MODEL="LOGICAL_VOLUME" SERIAL="5001438010043670" SIZE="136,7G" TRAN="sas" VENDOR="HP " HCTL="1:1:0:1" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="/dev/sdc" MODEL="LOGICAL_VOLUME" SERIAL="5001438010043670" SIZE="136,7G" TRAN="sas" VENDOR="HP " HCTL="1:1:0:2" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="/dev/sdd" MODEL="LOGICAL_VOLUME" SERIAL="5001438010043670" SIZE="136,7G" TRAN="sas" VENDOR="HP " HCTL="1:1:0:3" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="/dev/sde" MODEL="LOGICAL_VOLUME" SERIAL="5001438010043670" SIZE="136,7G" TRAN="sas" VENDOR="HP " HCTL="1:1:0:4" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="/dev/sdf" MODEL="LOGICAL_VOLUME" SERIAL="5001438010043670" SIZE="136,7G" TRAN="sas" VENDOR="HP " HCTL="1:1:0:5" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="/dev/sdg" MODEL="LOGICAL_VOLUME" SERIAL="5001438010043670" SIZE="136,7G" TRAN="sas" VENDOR="HP " HCTL="1:1:0:6" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="/dev/sdg1" MODEL="" SERIAL="" SIZE="136,7G" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="" LABEL="" UUID=""
NAME="/dev/sdh" MODEL="LOGICAL_VOLUME" SERIAL="5001438010043670" SIZE="136,7G" TRAN="sas" VENDOR="HP " HCTL="1:1:0:7" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="/dev/sr0" MODEL="DV-28S-W" SERIAL="10100628181955" SIZE="1024M" TRAN="sata" VENDOR="TEAC " HCTL="3:0:0:0" TYPE="rom" FSTYPE="" LABEL="" UUID=""
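For anyone wanting to check their own controller for the same symptom, the serial collision in `lsblk -P` output can be spotted with a few lines of Python. This is only a sketch: the helper names are invented, the sample lines are trimmed to the relevant fields, and it assumes the straight-quoted KEY="value" format that lsblk prints.

```python
# Hypothetical sketch: find whole disks that share a serial in `lsblk -P` output.
import shlex
from collections import defaultdict


def parse_lsblk_line(line):
    """Split one line of KEY="value" pairs into a dict."""
    fields = {}
    for token in shlex.split(line):
        key, _, value = token.partition("=")
        fields[key] = value
    return fields


def duplicate_serials(lines):
    """Map each serial used by more than one whole disk to the device names."""
    by_serial = defaultdict(list)
    for line in lines:
        dev = parse_lsblk_line(line)
        # Partitions and optical drives are ignored; so are empty serials.
        if dev.get("TYPE") == "disk" and dev.get("SERIAL"):
            by_serial[dev["SERIAL"]].append(dev["NAME"])
    return {s: names for s, names in by_serial.items() if len(names) > 1}


SAMPLE = [
    'NAME="/dev/sda" MODEL="LOGICAL_VOLUME" SERIAL="5001438010043670" VENDOR="HP " TYPE="disk"',
    'NAME="/dev/sda1" MODEL="" SERIAL="" VENDOR="" TYPE="part"',
    'NAME="/dev/sdb" MODEL="LOGICAL_VOLUME" SERIAL="5001438010043670" VENDOR="HP " TYPE="disk"',
    'NAME="/dev/sr0" MODEL="DV-28S-W" SERIAL="10100628181955" VENDOR="TEAC " TYPE="rom"',
]

print(duplicate_serials(SAMPLE))
```

Any non-empty result means lsblk alone cannot tell those disks apart.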

udevadm again returns a different ID_SERIAL for each disk but the same ID_SCSI_SERIAL:

/usr/bin/udevadm info --name=/dev/sda
E: ID_SERIAL=3600508b1001c1ddde72ae81e59a02d90
E: ID_SERIAL_SHORT=600508b1001c1ddde72ae81e59a02d90
ID_SCSI_SERIAL=5001438010043670

/usr/bin/udevadm info --name=/dev/sdb
E: ID_SERIAL=3600508b1001cc1407585f38c72bfea2a
E: ID_SERIAL_SHORT=600508b1001cc1407585f38c72bfea2a
ID_SCSI_SERIAL=5001438010043670

It looks like ID_SCSI_SERIAL always wins over ID_SERIAL…

The only thing I didn’t do was to restart the server after changing the parameter to true. Could this have affected the result? Maybe I can repeat the procedure by doing the reboot as well.

Again sorry for my poor english and thanks for your patience and time.

Update 1:

Forget this part, I have reinstalled and rebooted the server to be sure :smile:
Unfortunately nothing has changed…

Update 2:
Looking at the source file, this is the interesting part (inside the get_disk_serial function):

    if line_fields[1] == "ID_SCSI_SERIAL":
        # we have an instance of SCSI_SERIAL being more reliably unique
        # when present than SERIAL_SHORT or SERIAL so overwrite whatever
        # we have and look no further by breaking out of the search loop
        serial_num = line_fields[2]
        break
    elif line_fields[1] == "ID_SERIAL_SHORT":
        # SERIAL_SHORT is better than SERIAL so just overwrite whatever we
        # have so far with SERIAL_SHORT
        serial_num = line_fields[2]
    else:
        if line_fields[1] == "ID_SERIAL":
            # SERIAL is sometimes our only option but only use it if we
            # have found nothing else.
            if serial_num == "":
                serial_num = line_fields[2]

If I understand correctly, it will always return “ID_SCSI_SERIAL” in our situation, so there is no difference between using udevadm and the other methods…
I was wondering whether a possible solution could be to modify this part and force the use of ID_SERIAL_SHORT or ID_SERIAL…

Update 3:

Maybe not an elegant solution but I SOLVED the problem this way!!
    # if line_fields[1] == "ID_SCSI_SERIAL":
    #     # we have an instance of SCSI_SERIAL being more reliably unique
    #     # when present than SERIAL_SHORT or SERIAL so overwrite whatever
    #     # we have and look no further by breaking out of the search loop
    #     serial_num = line_fields[2]
    #     break
    if line_fields[1] == "ID_SERIAL_SHORT":
        # SERIAL_SHORT is better than SERIAL so just overwrite whatever we
        # have so far with SERIAL_SHORT
        serial_num = line_fields[2]
    else:
        if line_fields[1] == "ID_SERIAL":
            # SERIAL is sometimes our only option but only use it if we
            # have found nothing else.
            if serial_num == "":
                serial_num = line_fields[2]

I commented out the lines about ID_SCSI_SERIAL, which forces the use of ID_SERIAL_SHORT and ID_SERIAL…

Unfortunately I think this solution is ok only for this particular situation :smile:

I think one possible solution might be to do a pre-scan of the disks by marking the different ids and comparing them.
If they are different then it is ok, otherwise it has to use different methods until it finds the unique id…I hope it is understandable what I wrote, as I said my English is really bad!
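The pre-scan idea could be sketched roughly as follows: collect the candidate identifiers for every disk first, then use the first one whose values are present and different on all disks. This is only an illustration under assumptions; the function name `first_unique_key` and the order of `CANDIDATE_KEYS` are hypothetical, and the sample values are the two udevadm excerpts shown earlier in this post.

```python
# Hypothetical sketch of a uniqueness pre-scan across all disks.
# CANDIDATE_KEYS order is an assumption, not Rockstor's actual policy.
CANDIDATE_KEYS = ["ID_SCSI_SERIAL", "ID_SERIAL_SHORT", "ID_SERIAL"]


def first_unique_key(devices, candidate_keys=CANDIDATE_KEYS):
    """devices maps a device name to its dict of udev properties.

    Return the first key that every device has and whose values are all
    different, or None when no candidate is unique across the devices.
    """
    for key in candidate_keys:
        values = [props.get(key, "") for props in devices.values()]
        if all(values) and len(set(values)) == len(values):
            return key
    return None


DEVICES = {
    "/dev/sda": {
        "ID_SERIAL": "3600508b1001c1ddde72ae81e59a02d90",
        "ID_SERIAL_SHORT": "600508b1001c1ddde72ae81e59a02d90",
        "ID_SCSI_SERIAL": "5001438010043670",
    },
    "/dev/sdb": {
        "ID_SERIAL": "3600508b1001cc1407585f38c72bfea2a",
        "ID_SERIAL_SHORT": "600508b1001cc1407585f38c72bfea2a",
        "ID_SCSI_SERIAL": "5001438010043670",
    },
}

print(first_unique_key(DEVICES))
```

Here ID_SCSI_SERIAL is rejected because both disks report the same value, so the scan falls through to ID_SERIAL_SHORT.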

Anyway although I don’t think it will ever be useful to others this is what I did to solve the problem in summary:

  • Updated the various firmware and server bios.
  • Set parameter “always_use_udev_serial” = True
  • Commented out the lines of code related to “ID_SCSI_SERIAL” in the get_disk_serial function (see the upper part of this post for details)

I thank everyone who participated in the discussion by giving me valuable support.
Rockstor’s seems like a nice community to me!!!

@phillxnet: if you need some particular test feel free to ask me, I will help with pleasure.

Bob


@ITstaff Thanks for the super detailed progress report and well done on finding a work around.
Re:

Yes, that’s the nub of it. Looks like I circled around this and somehow missed it. Nice find.

Yes, that is required, as you later stated. Changes to the code don’t affect the running version until the associated rockstor service is restarted.

Not strictly correct. By default, this entire serial retrieval method (via udevadm) is only called when lsblk has an empty serial. We could probably push this as the default method soon, however; the older method had to be maintained to keep compatibility with our earlier lsblk-only approach. I added the udevadm fall through as we encountered more and more empty lsblk serials in situations where udev could supply serials just fine. Several different versions of serials at that. Oh well.
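As a rough illustration of the precedence described above (not the actual osi.py code; apart from `always_use_udev_serial`, every name here is made up), the fall through behaves something like this:

```python
# Hypothetical sketch of the lsblk-first, udev-fallback precedence.
def choose_serial(lsblk_serial, udev_lookup, always_use_udev_serial=False):
    """Return the lsblk serial unless it is empty or udev is forced.

    udev_lookup is a zero-argument callable standing in for the
    udevadm-based retrieval; it is only called when actually needed.
    """
    if lsblk_serial and not always_use_udev_serial:
        return lsblk_serial
    return udev_lookup()


def fake_udev():
    # Stand-in value taken from the ID_SERIAL_SHORT shown earlier in the thread.
    return "600508b1001c1ddde72ae81e59a02d90"


print(choose_serial("5001438010043670", fake_udev))
print(choose_serial("", fake_udev))
print(choose_serial("5001438010043670", fake_udev, always_use_udev_serial=True))
```

On this controller lsblk always reports a non-empty serial, so the udev path is never reached unless the flag is set.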

Works for me. I remember putting this in several years ago and at the time it seemed absolutely required with a test system I had. As you say it’s the exact opposite of what is needed in your setting and this is super interesting and a great find. Cheers.

If this works then great. I’m still not convinced this is the optimal way here, however. Could you give us a screen grab of your disks overview page so we can see what names are assigned? I was thinking they would all end up having the same names if only this change was made. Obviously not, as you appear to be up-and-running, which is great.

Yes, I’d kind of agree currently, as we don’t know the exact relevance/reach of the now very long standing precedence of ID_SCSI_SERIAL. As stated and noted in the comment, it was found to be a requirement one time so I enforced it as a priority. And here we are years later and it’s the exact opposite!

There is definitely some more work to be done in this area. As always. And our existing tests should help to avoid any regressions. But we have to tread carefully here as this stuff is way down low in the critical path area.

On the contrary. Anyone with a similar controller, presenting non-unique lsblk and ID_SCSI_SERIAL readings, now has this thread as a reference.

Thanks, and much appreciated. I’d really like to get this setup into our test data so yes some more command outputs may be required. But I can’t unfortunately jump on exactly what is needed just yet. However if you could post a disks overview screen grab that would be good. As I say I think there are still dragons potentially in here somewhere but it would initially be great to know the disk names selected in your setup post the changes you made.

And thanks for chasing this one down with us. Always good to find work-arounds if not absolute fixes and it helps to know the caveats required for some hardware arrangements. Pretty sure there is something still amiss here however but hope is blooming obviously with your findings thus far.

In time we will revamp this layer a little as we add multipath awareness, but again, all in good time.

Cheers. And do keep us informed and keep in mind it’s always worth looking at the code for such things as it’s now quite well documented. Also the prior referenced “Device management in Rockstor” Wiki entry here in the forum is an easy read and may well help with any further needed modifications if they arise.

Hi @phillxnet,

This is a screenshot of our disks now.
I don’t have the screenshot of the previous situation sorry.
Hope it helps…
[Image: disks]

Is there a better way to post an image?

Anyway keep in mind that hp servers are the most troublesome. I have always struggled with them :smile:

The funny part is that the disks you see are just for testing. Now that we have figured out the trick, though, it shouldn’t be a problem when we put in larger disks.


@ITstaff Cheers.

And yes that does help. Interesting to see the wwn by-id disk name. I stated before I’ve seen other setups where they floated to the top. This may very well be just fine. There may be an issue with disk activity in the dashboard monitor but that may not even work at the lowest level of the OS through the raid controller. We can always approach that one if need be anyway.

Thanks for the pic, reassuring and nice to see. Best create a multi-disk pool and what not just in case, and keep an eye on the logs in case there is some hidden confusion. But yes, we have had wwn names selected before without any associated issues. If there are any, just let us know and hopefully we can track down a fix or work-around. On the smart front, that is another issue with hardware raid controllers. We have some info on potential custom smart options but it’s far from an ideal arrangement and potentially quite fiddly. See our:
S.M.A.R.T Howto: https://rockstor.com/docs/howtos/smart.html
and its ‘Configure Monitoring’ & ‘S.M.A.R.T through Hardware RAID Controllers’ subsections.

Also you could consider our Scrutiny Rock-on. Upstream repo here: https://github.com/AnalogJ/scrutiny
but it will have all the same issues regarding smart through raid, though it may have configurations that can help. It also uses the ubiquitous smartmontools but from within a container via the Rock-on system:
https://rockstor.com/docs/interface/overview.html

This is fine, and a right click followed by open image in new tab or the like ends up giving an original resolution view.

Keep in mind that a Rockstor update may end up reverting these by-hand changes. In production however, you can always apply all non rockstor updates via the method documented here:
https://rockstor.com/docs/installation/install.html#updating-rockstor
irrespective of update channel selection. You don’t really want to be stuck with a locked db when it finds all the names and serials are the same again!! All doable but could be messy. Although such a situation would normally I think only affect Web-UI function, not the underlying services.

Hope that helps and thanks for the feedback and requested screen grab. As always keep us posted on any issues or stuff you find. And of course ideas on how we might improve things.


Hi again @phillxnet,
Yesterday we had some problems with another server so I had to stop testing for a moment but today we should have a little more time.

Following the directions, I was able to enable S.M.A.R.T on 7 disks (the last one did not take it despite being identical to the others… maybe it has some sort of fault…). Here is the current situation:

[Image: smart]

[Image: smart_details]

I tried to install Scrutiny through Rock-ons but the installation fails. I’m now trying to figure out why. Other Rock-ons (for example Netdata) worked perfectly…

Good suggestion, I hadn’t thought of that!
I will now set up the configuration you suggested.

I’ll try to keep you informed.
Thanks as always and have a nice day.

Bob


Just a quick note on your Scrutiny Rockon experience. It seems there is currently an issue with the underlying image from Linuxserver.io due to new prerequisites introduced to scrutiny. For tracking purposes I have added a new issue on our github repository:

So, for now this won’t be a viable monitoring tool until this is sorted. Sorry about that. Of course, you can look at the original repository and use the docker command line to install scrutiny in the interim, but for now the Rockon will be temporarily non-functional. Thanks for (inadvertently) finding this issue :slight_smile:


Hi @Hooverdan, thanks for the info and for reporting the problem.

About finding issues… I have a natural talent :grin::grin::grin:

Have a nice day
