[solved] Disks unavailable with badge "Disk is a mdraid member"

hey all,

I have four drives installed in a fresh Rockstor setup (version 3.8-12), but only two of them are available for pool creation. SMART works.

I see the two unavailable drives in the “Disks” section of the UI with an info badge saying “Disk is a mdraid member”, which isn’t true. They had been mdraid members, but I deleted the partition tables with several methods (parted, fdisk, dd).

Can someone help me to fix this and get the drives running?

Thx in advance,
elmcrest

inspired by a different topic, this might help:

[root@rockstor ~]# /usr/bin/lsblk -P -o NAME,MODEL,SERIAL,SIZE,TRAN,VENDOR,HCTL,TYPE,FSTYPE,LABEL,UUID
NAME="sda" MODEL="Speed Line      " SERIAL="13110600013565" SIZE="7.3G" TRAN="usb" VENDOR="Intenso " HCTL="8:0:0:0" TYPE="disk" FSTYPE="iso9660" LABEL="Rockstor 3 x86_64" UUID="2016-02-25-19-16-52-00"
NAME="sda1" MODEL="" SERIAL="" SIZE="726M" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="iso9660" LABEL="Rockstor 3 x86_64" UUID="2016-02-25-19-16-52-00"
NAME="sdb" MODEL="SAMSUNG SSD 830 " SERIAL="S0VYNYAC209249" SIZE="119.2G" TRAN="sata" VENDOR="ATA     " HCTL="2:0:0:0" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="sdb1" MODEL="" SERIAL="" SIZE="500M" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="ext4" LABEL="" UUID="9820ef62-94ba-4a4d-99c1-0bb34bc1d515"
NAME="sdb2" MODEL="" SERIAL="" SIZE="7.9G" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="swap" LABEL="" UUID="de6afed5-099b-4a1c-921f-863a71eea5dc"
NAME="sdb3" MODEL="" SERIAL="" SIZE="110.9G" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="btrfs" LABEL="rockstor_rockstor" UUID="28bf811f-d448-43e4-9431-eb4171cdde5a"
NAME="sdc" MODEL="WDC WD20EARX-00P" SERIAL="WD-WCAZAH541143" SIZE="1.8T" TRAN="sata" VENDOR="ATA     " HCTL="4:0:0:0" TYPE="disk" FSTYPE="btrfs" LABEL="test" UUID="9bbc7e69-1ec1-42c8-bccb-07db0435eaa3"
NAME="sdd" MODEL="WDC WD20EFRX-68E" SERIAL="WD-WCC4M7VX31D7" SIZE="1.8T" TRAN="sata" VENDOR="ATA     " HCTL="5:0:0:0" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="sde" MODEL="ST2000DM001-1CH1" SERIAL="Z1E2M40Z" SIZE="1.8T" TRAN="sata" VENDOR="ATA     " HCTL="6:0:0:0" TYPE="disk" FSTYPE="btrfs" LABEL="test" UUID="9bbc7e69-1ec1-42c8-bccb-07db0435eaa3"
NAME="sdf" MODEL="WDC WD20EFRX-68E" SERIAL="WD-WCC4M1EUJ9YC" SIZE="1.8T" TRAN="sata" VENDOR="ATA     " HCTL="7:0:0:0" TYPE="disk" FSTYPE="" LABEL="" UUID=""

I deleted the partition tables as described here (note: skip in the article is wrong and should be seek; see the comments):

https://www.bfccomputing.com/zero-a-gpt-label-using-dd/
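For reference, the corrected start-and-end wipe should look roughly like this (sdX is a placeholder, the commands are destructive, and blockdev --getsz reports the device size in 512-byte sectors):

dd if=/dev/zero of=/dev/sdX bs=512 count=34
dd if=/dev/zero of=/dev/sdX bs=512 count=34 seek=$(( $(blockdev --getsz /dev/sdX) - 34 ))

The first line clears the protective MBR plus the primary GPT at the start of the disk, the second the backup GPT stored in the last sectors.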

this also doesn’t help:

mdadm --zero-superblock /dev/sdX
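
For reference, whether any md superblock is still detected can be checked with (sdX again being a placeholder):

mdadm --examine /dev/sdX

which prints the superblock details if one is found, or reports that no md superblock was detected.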

some further info:

[root@rockstor ~]# fdisk -l /dev/sdc

Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes, 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

[root@rockstor ~]# fdisk -l /dev/sdd

Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes, 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

[root@rockstor ~]# fdisk -l /dev/sde

Disk /dev/sde: 2000.4 GB, 2000398934016 bytes, 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

[root@rockstor ~]# fdisk -l /dev/sdf

Disk /dev/sdf: 2000.4 GB, 2000398934016 bytes, 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

Did you try a wipefs -a /dev/sdX? And you said you used dd, but did you let it wipe the whole disk?

yeah did that just now, without any changes …

at least wipefs deleted something, but I did mkfs.btrfs earlier, so I assume it deleted that …

[root@rockstor ~]# wipefs -a /dev/sdc
/dev/sdc: 8 bytes were erased at offset 0x00010040 (btrfs): 5f 42 48 52 66 53 5f 4d
[root@rockstor ~]# wipefs -a /dev/sde
/dev/sde: 8 bytes were erased at offset 0x00010040 (btrfs): 5f 42 48 52 66 53 5f 4d
[root@rockstor ~]# wipefs -a /dev/sdb
[root@rockstor ~]# wipefs -a /dev/sde
[root@rockstor ~]# 

No, it turns out the metadata from mdraid and Intel RST RAID is surprisingly durable. I ran into the same problem a while back.

You can try to list the offsets with a

wipefs /dev/sdX

and then erase using

wipefs -o [offset] /dev/sdX

to delete each one.
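
For example (the offset below is only illustrative, use whatever wipefs actually reports for your disk; mdraid v1.2 superblocks usually sit 4 KiB from the start, while older md metadata and Intel RST containers live near the end of the disk):

wipefs /dev/sdc
wipefs -o 0x1000 /dev/sdc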

What I ended up doing, though, was just brute-forcing it with

dd if=/dev/zero of=/dev/sdX bs=100M

doesn’t this take like hours?

dd if=/dev/zero of=/dev/sdX bs=100M

Ya, it is admittedly not a graceful solution, but since it wipes the entire disk it is pretty much guaranteed to clean out anything left over from a different FS. I tend to try it only as a last resort.
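
If a full pass is too slow, a compromise that should normally catch the same metadata is zeroing only the first and last chunk of the disk, since the md and Intel RST signatures live near the start or the end (sdX is a placeholder; blockdev --getsz returns 512-byte sectors, so dividing by 2048 gives MiB):

dd if=/dev/zero of=/dev/sdX bs=1M count=100
dd if=/dev/zero of=/dev/sdX bs=1M count=100 seek=$(( $(blockdev --getsz /dev/sdX) / 2048 - 100 ))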

ok, a reinstallation of Rockstor solved it. seems a little weird, but I guess there is a bug in the code that doesn’t re-check once a disk is marked as “mdraid member”, or the check itself is faulty …

anyways, now my activation key doesn’t work anymore :frowning:

That is kinda strange. Did you hit the refresh button on the disk panel?

No, it won’t: a new install means a new appliance ID. Just send them an email; I saw it mentioned somewhere on here that this would happen and that they would have to send you a new key.

EDIT: Ya it was here

ok thx.

and yes of course I hit rescan like a maniac :slight_smile:

Haha just figured I’d check

@elmcrest and @Spectre694 Well done people; I’m pretty sure you’ve found a bug there.

Yes, sorry about that :blush: pretty sure that’s my fault. I’d considered mdraid status only in the context of a system drive, where a re-install is assumed, and I’m fairly sure I neglected to update the db status on such members when that membership comes from a prior use. I have created an issue, mdraid member status not updated, so that we don’t lose this one, as it would be a nice one to get fixed.

I’m currently working on another element of Rockstor but will try to get around to this one soon.

Thanks to both of you for persevering and rooting this one out.

@elmcrest Well done on the lsblk output; I’m assuming that was taken after you manually wiped the mdraid superblock from the mdraid-labelled disks.

hey Philip,

no worries, the only bad thing was that it took some hours to track this down, so I’m glad we figured it out.
I think you guys should mention this in the installation documentation (or the pre-installation documentation); I’d expect this to be rather common for people testing Rockstor.

I’m starting to really like Rockstor. Though there are some major things still missing, I feel it’s developing pretty well, so keep up the good work! :slight_smile:

elmcrest

@elmcrest and @Spectre694 Just reporting back that as of Rockstor 3.8-13.11 (testing channel updates) the issue I opened as a result of this thread, mdraid member status not updated, has now been closed. So from this version on, assuming all goes well, the disk page should now correctly update any mdraid member status changes to drives visible to Rockstor.

Thanks again for reporting this. Although I know you were able to resolve it with a ‘brute force’ re-install, I realized when looking into the issue that a quicker workaround would also have been possible: wipe the drive’s mdraid member status as you did, disconnect the drive, hit ‘Rescan’ on the Disks page, remove the ‘missing’ drive entry via the bin icon, and then re-attach the drive. In any case, we now have a proper update in place for mdraid membership changes.

@elmcrest I have also opened a new issue in the rockstor-doc repo to address your docs suggestion. The issue is entitled “improve mdraid content re re-deployed prior mdraid drives” and links back to your last post in this thread for context.

Bit by bit.
