Disk error on rescan

[Please complete the below template with details of the problem reported on your Web-UI. Be as detailed as possible. Community members, including developers, shall try and help. Thanks for your time in reporting this issue! We recommend purchasing commercial support for expedited support directly from the developers.]

Brief description of the problem

Problem disk was formatted in Windows and added to the system. It does not show up on the disk page, and when I do a rescan I get this error.

Detailed step by step instructions to reproduce the problem

[write here]

Web-UI screenshot

[Drag and drop the image here]

Error Traceback provided on the Web-UI

Traceback (most recent call last):
  File "/opt/rockstor/src/rockstor/rest_framework_custom/generic_view.py", line 41, in _handle_exception
    yield
  File "/opt/rockstor/src/rockstor/storageadmin/views/disk.py", line 418, in post
    return self._update_disk_state()
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/rockstor/src/rockstor/storageadmin/views/disk.py", line 213, in _update_disk_state
    pool_info = dev_pool_info[dev_name]
                ~~~~~~~~~~~~~^^^^^^^^^^
KeyError: '/dev/sdd'

@Stevek Hello again.

It could be that the Windows formatting has confused things somewhere along the line; ideally we use unformatted disks, and can usually wipe them from within the Web-UI. What you are after is no formatting or partitioning at all. Internally, when we can identify a drive's partitioning and formatting, our Web-UI presents the option to wipe partitions individually or the disk as a whole, with the latter being our strongly preferred option.

The command we use to entirely wipe a drive's partition info, run as the root user, is the following:

/usr/sbin/wipefs -a drive-name-here
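To get a feel for what wipefs does before pointing it at a real drive, you can practice on a throw-away image file. A sketch (the file path and the choice of mkfs.ext4 here are just illustrative, not part of the Rockstor procedure):

```shell
# Make an 8 MiB scratch image; no root or real drive needed.
truncate -s 8M /tmp/wipefs-practice.img

# Give it a filesystem signature, much as formatting a drive would.
mkfs.ext4 -q -F /tmp/wipefs-practice.img

# Without -a, wipefs only LISTS the signatures it finds.
wipefs /tmp/wipefs-practice.img

# With -a, it erases every signature. On a real drive this
# DESTROYS ALL DATA, so triple-check the device name first.
wipefs -a /tmp/wipefs-practice.img
```

After the `-a` run, listing again reports no signatures at all, which is exactly the "no formatting or partitioning" state described above.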

And to give more reassurance that you are about to wipe (DESTROY ALL DATA ON) the correct drive (the /dev/sdX type names can change from one reboot to another), we use by-id names internally.

To get the by-id names, which include more info about the drive (make, model, etc.), you can run:

ls -la /dev/disk/by-id/
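By-id names embed transport, model, and serial, which is why they stay stable across reboots. A simulated sketch of how they work (the directory and symlink below are created purely for demonstration, with the name modeled on one of the drives quoted later in this thread; on a real system just run the `ls -la /dev/disk/by-id/` command above):

```shell
# Simulate the /dev/disk/by-id layout in a scratch directory so this
# can be tried anywhere without touching real devices.
mkdir -p /tmp/by-id-demo
ln -sf ../../sdd /tmp/by-id-demo/ata-ST1000LM035-1RK172_ZDEC9PNM

# Each by-id entry is a symlink back to the transient kernel name:
readlink /tmp/by-id-demo/ata-ST1000LM035-1RK172_ZDEC9PNM
# prints: ../../sdd
```

So the stable `ata-<model>_<serial>` name resolves to whatever /dev/sdX name the kernel happened to assign this boot, and that is the name you can confidently pass to wipefs.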

So in short, the Web-UI may be confused by existing partitioning on the drive (ideally we need none, as per new drives), or the drive may have no serial, i.e. you are using a VM setup of sorts. It is also possible that the drive is within an enclosure that obfuscates the serial numbers of its drive members, making them all look the same. That type of enclosure is not compatible with Rockstor's use of drive serial numbers. Normally there is a Web-UI indication of this, however.

If the drive has no data, or the data has no value (you mention it has been formatted), then the above command can wipe all partitions, ready for a fresh start under btrfs and Rockstor management.

So more info here would be good, and we may have some bugs remaining, as we have made recent changes in that area. The output of, for example, the following command could help to establish this:
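The command itself did not survive the quoting here; judging by the field names in the output posted later in the thread, it was likely an lsblk invocation along these lines (a reconstruction, not necessarily the original flags):

```shell
# Reconstructed guess at the requested command: -P prints KEY="value"
# pairs, and this column list matches the output quoted further down.
lsblk -P -o NAME,MODEL,SERIAL,SIZE,TRAN,VENDOR,HCTL,TYPE,FSTYPE,LABEL,UUID
```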


Assuming you are running our latest testing release of 5.0.10-0.

Note also that we plan to release a new testing version soon (5.0.11-0) that does have a drive management fix applied, but that issue looks a little different to what you have shown here, and we think it pertains to VM use only.

Hope that helps.


Linking in here a previous very similar report of yours, from 10 days ago, regarding what looks like the same error:

Which you closed as:

So you may have some intermittent configuration issue here. Are you experimenting with Rockstor using a VM in these cases? VMs need special configuration to provide unique and static serials for the drives used by Rockstor, and some are just not compatible at all.

Hope that helps.


Thanks Phil. Sorry this was a repeat. Resolved by deleting the partition and putting it back in. I'm not super comfortable with some of these types of actions in the CLI.

I’m still worried there is something weird with the disk. I will be testing it with some data this week.


You had asked for some more info that I never sent. Here it is…


NAME="/dev/sda" MODEL="ST9500423AS" SERIAL="S2V09CM0" SIZE="465.8G" TRAN="ata" VENDOR="ATA " HCTL="2:0:0:0" TYPE="disk" FSTYPE="btrfs" LABEL="Data-3_500GB" UUID="cc528b4a-8057-452e-89ce-1a795acd14f2"
NAME="/dev/sdb" MODEL="ST31000528AS" SERIAL="5VP1AAS4" SIZE="931.5G" TRAN="ata" VENDOR="ATA " HCTL="2:0:1:0" TYPE="disk" FSTYPE="btrfs" LABEL="Data-2_1TB" UUID="10d34f47-4c7f-428c-98e0-b73fe941d491"
NAME="/dev/sdc" MODEL="TOSHIBA DT01ACA200" SERIAL="X6IMBL9AS" SIZE="1.8T" TRAN="sata" VENDOR="ATA " HCTL="1:0:0:0" TYPE="disk" FSTYPE="btrfs" LABEL="Data-1" UUID="fab7c0cd-1a77-405b-87ee-22b84b09180a"
NAME="/dev/sdd" MODEL="ST1000LM035-1RK172" SERIAL="ZDEC9PNM" SIZE="931.5G" TRAN="sata" VENDOR="ATA " HCTL="3:0:0:0" TYPE="disk" FSTYPE="btrfs" LABEL="Data-4_1TB_bad" UUID="17860343-ae11-4797-aa84-f29e83b8fb67"
NAME="/dev/sde" MODEL="WDC WD10EARX-00N0YB0" SERIAL="WD-WMC0S0882589" SIZE="931.5G" TRAN="sata" VENDOR="ATA " HCTL="4:0:0:0" TYPE="disk" FSTYPE="btrfs" LABEL="Data-1" UUID="fab7c0cd-1a77-405b-87ee-22b84b09180a"
NAME="/dev/sdf" MODEL="SSD 64GB" SERIAL="AA000000000000002365" SIZE="59.6G" TRAN="sata" VENDOR="ATA " HCTL="5:0:0:0" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="/dev/sdf1" MODEL="" SERIAL="" SIZE="2M" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="" LABEL="" UUID=""
NAME="/dev/sdf2" MODEL="" SERIAL="" SIZE="64M" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="vfat" LABEL="EFI" UUID="1381-2428"
NAME="/dev/sdf3" MODEL="" SERIAL="" SIZE="2G" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="swap" LABEL="SWAP" UUID="ee6de8d5-eb1c-4d19-943c-0d24ecee92a7"
NAME="/dev/sdf4" MODEL="" SERIAL="" SIZE="57.6G" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="btrfs" LABEL="ROOT" UUID="46befa9e-b399-480d-b10d-338a4c875615"
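[Editor's note: since Rockstor keys its drive tracking on serial numbers, one quick sanity check on output like the above is to look for duplicated or empty SERIAL fields among whole disks. A small sketch with invented sample lines, not Rockstor's actual code:]

```shell
# Invented sample in the same KEY="value" shape as `lsblk -P` output;
# two disks share a serial, as a bad enclosure might present them.
sample='NAME="/dev/sda" SERIAL="ABC123" TYPE="disk"
NAME="/dev/sdb" SERIAL="ABC123" TYPE="disk"
NAME="/dev/sdc" SERIAL="XYZ789" TYPE="disk"
NAME="/dev/sdc1" SERIAL="" TYPE="part"'

# Extract the serial of each whole disk, then print any duplicates.
printf '%s\n' "$sample" |
  sed -n 's/.*SERIAL="\([^"]*\)".*TYPE="disk".*/\1/p' |
  sort | uniq -d
# prints: ABC123
```

In the output quoted above every disk has a distinct, non-empty serial, so this particular failure mode is ruled out here.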

No, I am not using a VM, but I am experimenting a lot.

@Stevek Thanks for the extra info.

And glad the partition info wipe resolved the issue.


Understood, and they shouldn't be required; our partition wiping facility via the Web-UI should have served you. But the failure you experienced threw the Web-UI, so only the CLI was left. We are gradually, bit by bit, getting more robust against these types of failure, but having just re-done some low-level stuff we have taken a little step back. Our code was almost unapproachable in this area before, whereas now it has already proven far easier to maintain and extend. Kind of a two steps forward, one step back type of thing.

Yes, always a concern. You could take a look at our following doc:

Pre-Install Best Practice (PBP) — Rockstor documentation

It has some tips on testing things like memory and drives. The drive test suggested there writes to every part of the drive (multiple times if you choose that option), so it gives the tested drive quite a work-out. It can also be good for ensuring all data on a drive is properly wiped before decommissioning/re-commissioning. Definitely worth taking a look.
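As a hedged illustration only (the PBP doc has the authoritative steps and flags), a destructive whole-drive write test of this kind is commonly done with badblocks; here it is practiced on a scratch image file rather than a real device:

```shell
# Scratch image so nothing real is harmed; on an actual drive you
# would pass its /dev/disk/by-id/... name, DESTROYING ALL DATA on it.
truncate -s 4M /tmp/drive-test.img

# -w runs the destructive write-mode test: patterns are written to and
# read back from every block; it stays silent and exits 0 when clean.
badblocks -w /tmp/drive-test.img && echo "no bad blocks found"
```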

Your lsblk command output (list block devices) looks to have been taken after the dodgy partition removal. If you find another problem drive, post the same output again with that drive included, before the wipe, and we can hopefully develop a test from it to see where we might be tripping up. It is always tricky at this low level, as there are so many combinations and systems from which drive formats/partitions might have been inherited. The wipefs command usually clears everything but some mdraid setups, for which we actually need more docs; the PBP doc's disk wipe section, though, will remove anything from any prior disk use.

Hope that helps.