Brief description of the problem
Can't add disks to storage
Detailed step by step instructions to reproduce the problem
Clicked the Storage button in the header to add disks to the system (I have 4 data disks installed), but got the following error:
File "/opt/rockstor/src/rockstor/rest_framework_custom/generic_view.py", line 41, in _handle_exception
yield
File "/opt/rockstor/src/rockstor/storageadmin/views/disk.py", line 418, in post
return self._update_disk_state()
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.11/contextlib.py", line 81, in inner
return func(*args, **kwds)
^^^^^^^^^^^^^^^^^^^
File "/opt/rockstor/src/rockstor/storageadmin/views/disk.py", line 115, in _update_disk_state
attached_disks = scan_disks(MIN_DISK_SIZE)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/rockstor/src/rockstor/system/osi.py", line 330, in scan_disks
blk_dev_properties: dict = {
^
File "/opt/rockstor/src/rockstor/system/osi.py", line 336, in <dictcomp>
for key, value in (
^^^^^^^^^^
ValueError: not enough values to unpack (expected 2, got 1)
This part of the code is @phillxnet's expertise, but I may be able to move things forward a bit. Could you run the following command and paste its output here, please?
@coffax Welcome to the Rockstor community forum. @McFaul Hello again, and thanks for chipping in on this one with the requested info.
In 5.0.9-0 we made significant changes at a very low level of our disk 'parsing'. This was a long-awaited simplification of the code, as it had become rather unmaintainable. However, your reports indicate that we have a new failure. Inevitable, but hopefully approachable. What you are seeing is not covered by our many unit tests in this area, so what we need is to reproduce this failure in a test environment so that we can prove a fix. The first step is to turn on debug logging; this should give us the following initial parse result of the lsblk line that @Flox requested the output from.
The debug log line we are first interested in is the following:
To turn on debug-mode:
As the root user run:
cd /opt/rockstor
poetry run debug-mode ON
@coffax it would also help to have your output from the command that @Flox indicated.
Yes, if there is a failure to parse, it may well result in the Web-UI not having knowledge of any of the drives. We deal with them all similarly now - that was the major simplification that was long overdue. It should in time allow us to be far more flexible on our system pool - with the eventual aim of supporting multi-device for the ROOT pool. But let's get this bug resolved first.
@McFaul Could you also confirm that you do get this exact same error that @coffax originally reported, i.e. the following:
Thanks all. This was a rather bold change in low-level code, but the prior arrangement was far more complex, and it had held back progress on our future flexibility. Plus we fixed some gnarly repeat bug types with these changes. We just have to root out a workaround for the no-serial parsing we see here. Later in this same code we would consult udev for the missing serials, just in case it can retrieve them, but alas we don't get that far.
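For context, that udev fallback is essentially a property lookup against the device. A minimal sketch of that kind of lookup (illustrative only, not Rockstor's actual helper) could look like:

import subprocess

def udev_serial(dev: str) -> str:
    """Illustrative only: ask udev for a block device's properties and return
    its serial if one is known. Rockstor's real udev handling differs."""
    out = subprocess.run(
        ["udevadm", "info", "--query=property", f"--name={dev}"],
        capture_output=True, text=True, check=True,
    ).stdout
    # udevadm prints KEY=VALUE lines; keep only well-formed ones.
    props = dict(
        line.split("=", 1) for line in out.splitlines() if "=" in line
    )
    # ID_SERIAL_SHORT is usually the bare serial; ID_SERIAL is a composite form.
    return props.get("ID_SERIAL_SHORT") or props.get("ID_SERIAL", "")

# e.g. print(udev_serial("/dev/sda"))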
I’ll begin looking into this bug now and await feedback here also.
@McFaul The failure as currently reported by @coffax may end up blocking the indicated debug logging line. But we first have to confirm you have the exact same error as originally reported. Plus once debug mode is on we can more easily confirm a hot-fix: assuming we come up with one that is easy to hand-patch in.
I'd love to help, but I think I would not be a lot of help since I was just giving Rockstor another try in a Hyper-V VM, so definitely not a supported/recommended environment. I had assumed it was just a quirk of running under Hyper-V; it was only when I spotted someone else on the forum with the same error that I thought I'd chip in.
It was not a working setup - it was a new install and I was looking to initialise a new pool with a couple of passed-through disks. (Although the output from lsblk seems to suggest that the system drive doesn't have a UUID, which I don't know is normal; for the passed-through disk it could detect the disk and serial number with no issues: NAME="/dev/sda" MODEL="ST18000NM000J-2T" SERIAL="ZR54RJQ2".)
If you think it would be helpful I am happy to try the commands above, but given the temporary nature of the test environment I understand if you may prefer not to be chasing ghosts here! (It does also mean that I have literally nothing to lose trying any potential solutions!)
@McFaul Thanks for the extra context here - most helpful.
@coffax is your reproducer system also in a VM? And if so what is the hypervisor and can you provide any more info on the setup?
I'm actually working on this problem now, and hope to be able to chip in myself with some more context about what is throwing our new lsblk parsing here. I'm using your reported lsblk output to try and develop a reproducer test, and yes, the serial does look as you say. I'll report back here once I have more idea about exactly what part of this lsblk output is throwing our new parser. We do already have a number of tests based on real hardware, all of which parse before we publish each rpm - so yes, there may be some peculiarity regarding the VM hardware mocking/passthrough interpretation that we just don't have a test for yet.
New issue that I’m working against for this forum report:
In the ongoing linked GitHub issue this does now look to be VM-related: your output has helped to reproduce the issue locally, and I've now got a unittest to demo this behaviour. I'm pretty sure the whitespace VENDOR entry returned by your VM is the issue here, i.e.:
VENDOR=" "
That fails our parsing and looks nothing like what we have seen for real hardware. Can you look to see if there is a setting in your hypervisor to nudge it out of this buggy behaviour? It may just be that we need to inform folks of this during VM setups. As you say, it's not our target audience, and I'm keen not to complicate all supported instances for the sake of unsupported ones. So if there is a hypervisor setting that forces a non-whitespace disk VENDOR report, then dandy. We can deal with "" vendors and the other vendor you supply here of:
VENDOR="Msft "
But the whitespace-only VENDOR is a no-go for us I'm afraid. Again, I'll look at fixes, but I'm not prepared to complicate our code for flaky/buggy corrupt reporting of the disk VENDOR element.
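To make the failure concrete, here is a minimal sketch (not the actual scan_disks code, which differs in detail) of how a whitespace-only quoted value can break a key=value unpack: split the lsblk -P line on whitespace first, and the lone closing quote left behind by VENDOR=" " has no "=" to unpack.

# Minimal sketch of the failure mode only; Rockstor's real parser differs in detail.
# lsblk -P emits KEY="value" pairs; a whitespace-only VENDOR breaks a naive split.
line = 'NAME="/dev/sda" MODEL="ST18000NM000J-2T" SERIAL="ZR54RJQ2" VENDOR=" "'

properties = {}
for token in line.split():            # 'VENDOR=" "' becomes 'VENDOR="' and '"'
    key, value = token.split("=", 1)  # the lone '"' token has no "=" ...
    properties[key] = value.strip('"')
# ... so the unpack above raises:
# ValueError: not enough values to unpack (expected 2, got 1)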
Hope that helps, and this has been a help for sure. Always good to know where we fail, and the exact context so we can help folks avoid these out-of-bounds arrangements.
Maybe there is the ability to inject something like the vendor code (or other Vital Product Data, aka VPD)? On VirtualBox that seems possible (though I have not encountered that as an issue in the past, as the vendor there was usually tagged as ATA). I couldn't quickly find whether that's (easily) possible on Hyper-V or other hypervisors if it's not populated by default.
I've had quite a look online for anything similar and I can't find a way to manually set a vendor for a passed-through disk. It may be possible via PowerShell, but it is beyond my somewhat limited Hyper-V abilities.
No worries, I've popped in a proposed fix that helps to work around these apparently Hyper-V-originating whitespace lsblk values. It's a shame we have to, and I've never seen this on real hardware, but alas it may help folks experimenting. This should be included in our next testing release - but we won't be rushing to that release as this pertains to unsupported installs anyway: at least as far as has yet been reported.
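For anyone curious about the general shape of such a workaround, a sketch (under my own assumptions, not necessarily the committed change) is to match whole KEY="value" pairs rather than splitting on whitespace, so whitespace-only values survive parsing:

import re

# Sketch only: tolerate whitespace-only quoted values such as VENDOR=" ".
# The merged Rockstor fix may take a different approach.
PAIR_RE = re.compile(r'(\S+)="(.*?)"')

def parse_lsblk_line(line: str) -> dict:
    """Return {KEY: value} for an `lsblk -P` style line, trimming values."""
    return {key: value.strip() for key, value in PAIR_RE.findall(line)}

line = 'NAME="/dev/sda" MODEL="ST18000NM000J-2T" SERIAL="ZR54RJQ2" VENDOR=" "'
print(parse_lsblk_line(line))
# {'NAME': '/dev/sda', 'MODEL': 'ST18000NM000J-2T', 'SERIAL': 'ZR54RJQ2', 'VENDOR': ''}

The trailing strip() also normalises values like the VENDOR="Msft " report above, so both the empty and padded vendor forms end up consistent.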
Thanks again for your assistance in this issue and thanks also to @coffax for the original report. However we do need feedback to proceed with fixes.
Take a look at the proposed changes here:
If you fancy tinkering, you could apply by hand the changes proposed for the src/rockstor/system/osi.py file. A reboot, or a full Rockstor service restart, should then see this issue at least resolved. The other changes there are a test to reproduce this failure with the older code, to prove the newer code provides a fix, and to help us not fail in this way again. Interestingly, we have uncovered in this process another potential parsing failure, but it has yet to be seen in reality. Bit by bit.
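If you do hand-patch and want a quick sanity check before rebooting, a small self-contained test along these lines (hypothetical names, reusing the illustrative parser sketched above rather than the real osi.py code) would confirm the Hyper-V style line now parses:

import re
import unittest

# Hypothetical regression-test sketch, not the actual rockstor-core test.
PAIR_RE = re.compile(r'(\S+)="(.*?)"')

def parse_lsblk_line(line: str) -> dict:
    # Same illustrative parser as sketched earlier in this thread.
    return {key: value.strip() for key, value in PAIR_RE.findall(line)}

class LsblkWhitespaceVendorTest(unittest.TestCase):
    def test_whitespace_only_vendor(self):
        # Hyper-V style line as reported in this thread.
        line = (
            'NAME="/dev/sda" MODEL="ST18000NM000J-2T" '
            'SERIAL="ZR54RJQ2" VENDOR=" "'
        )
        props = parse_lsblk_line(line)
        self.assertEqual(props["VENDOR"], "")
        self.assertEqual(props["SERIAL"], "ZR54RJQ2")

if __name__ == "__main__":
    unittest.main()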