VT-d passthrough of LSI 2008 HBA not detecting disks in 3.8-10 Rockstor release

And here is a GOLD screenshot. I think the interfaces are still pissed at me; I can see no network connection on initial boot (I may go remove the secondary vNICs that WERE to be used for storage traffic, just for now). I had to ifconfig the storage vNIC (stgintvnic#) down, and then I saw the hung/stale yum job take off like a rocket; I caught one screenshot prior and one after. Looks like one appliance is happy and maybe the other now too. This is using ‘screen’ so I can give you a two-for-one view. Do I need to down-rev or up-rev one or the other? Whew, this is GNARLY! hah

EDIT: Both web interfaces are back now with the secondary data vNIC down, and both appliances are at 3.8-10.02 now :-D. Help me climb this final hump if you guys would be so kind. I want to proceed cautiously. I know you said to downgrade python, but I want to hear what you think of the current state with one appliance at one version and the other at another; it's very odd to me. (It looks to me like the primary (rockstor) did the downgrade and the secondary (rockstor-dr) did not, but that was the one I JUST upgraded after realizing the network connection was hosing up the upgrade.) SMH

And they just fixed themselves (rather, rockstor-dr/the secondary did). NICE.

Both appliances are at 3.8-10.02, with python seemingly auto-magically downgraded. Am I good to go to set up a replication job, ya gents think?

I see I must have AGAIN been blind, or maybe the web interface changed: on the secondary NFS data vNICs that were causing contention/routing confusion, I now DO see how I can leave off the DGW/DNS info. YEA, things seem solid now. ROCK ON FOR ROCKSTOR!

Shoot me, I just can't win: now all disks are offline (a mix of 4 HGST HUSSL4010BSS600 SSDs in RAID10 and 4 Intel S3700 200GB SSDs in RAID10). Both were working previous to the upgrade; there's no import/wipe option, just remove/delete. :frowning:

I feel like I am living a groundhog day :smiley:

Disks not showing up could be a matter of scan failure. What's the output of systemctl status -l rockstor-pre rockstor rockstor-bootstrap?

Did I ever mention how AWESOME you guys are w/ the quick responses, LOVE the direct interaction and being able to provide valuable feedback, hope I am not being a PITA/bother. :smiley:

I really do think there is something seriously wrong here. I just wiped a disk clean again and built out a new Rockstor 3.8-10 box that COULD see the disk prior to the update to 3.8-10.02; it just said I had to wipe it to make it usable, but the button was there AT LEAST. Now, after the update and a reboot, all I can do is delete it from the system, and after several reboots it is still not seeing the disk.

Could use some guidance; this is very dejecting.

@whitey I think that would be a bit extreme. :grinning:
All I can think of off the top of my head (for now) is that another update has upset the disk scan with your particular hardware, as we have reports of 3.8-10.02 installing, rebooting, and replicating successfully outside of our own testing. The previously referenced forum thread re Replication has such a report from @jason1. So there is hope. We just have to narrow down why, on your system, the disk scan is finding no drives after the 3.8-10.02 testing update.

Background: between 3.8-10 (iso) and 3.8-10.02 (testing) there were fixes to how missing drives are represented in the db. However, there were no significant changes to how the drives are scanned, at least on the Rockstor side. Given this info, and from your last Web-UI screenshot (3.8-10.02), it looks like this system is working as expected, bar not seeing any of the attached drives of course: it is simply listing all those devices that used to be attached. When there are long random numbers for device names, that is simply shorthand for a missing / offline device (in testing) in the new arrangement.

The disk scan is performed by a Rockstor internal function (fs/btrfs/scan_disks) which is mostly unchanged, except for how it allocates fake serial numbers to devices with repeated or blank serials. scan_disks in turn uses the following linux command (from util-linux) to find info on all attached disks:

/usr/bin/lsblk -P -o NAME,MODEL,SERIAL,SIZE,TRAN,VENDOR,HCTL,TYPE,FSTYPE,LABEL,UUID

The result in a vm here with 3.8-10 and no data drives is:-

NAME="sda" MODEL="QEMU HARDDISK   " SERIAL="QM00005" SIZE="8G" TRAN="sata" VENDOR="ATA     " HCTL="2:0:0:0" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="sda1" MODEL="" SERIAL="" SIZE="500M" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="ext4" LABEL="" UUID="f34fe05e-c9bf-49eb-bcd5-d027ebaba79a"
NAME="sda2" MODEL="" SERIAL="" SIZE="820M" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="swap" LABEL="" UUID="8f992f43-7100-4027-b09e-b6b29a3f0252"
NAME="sda3" MODEL="" SERIAL="" SIZE="6.7G" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="btrfs" LABEL="rockstor_rockstor" UUID="e526c68b-ca0f-43d0-9c16-69c75c1a7e32"
NAME="sr0" MODEL="QEMU DVD-ROM    " SERIAL="QM00001" SIZE="1024M" TRAN="ata" VENDOR="QEMU    " HCTL="0:0:0:0" TYPE="rom" FSTYPE="" LABEL="" UUID=""

This info is then parsed by scan_disks and passed on, for db update purposes, to update_disk_state(). In your case it looks like scan_disks is saying there are no attached disks. Util-linux was one of the recent base system updates, from version util-linux.x86_64 0:2.23.2-22.el7_1.1 to util-linux.x86_64 0:2.23.2-26.el7.
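To illustrate the kind of parsing involved, here is a minimal sketch of reading the lsblk -P key="value" output shown above; this is not Rockstor's actual scan_disks code, and parse_lsblk_pairs is a made-up name:

```python
import re

def parse_lsblk_pairs(output):
    """Parse `lsblk -P` output: one device per line of KEY="value" pairs."""
    devices = []
    for line in output.strip().splitlines():
        # Each pair looks like NAME="sda" or SERIAL="" (value may be empty).
        pairs = re.findall(r'(\w+)="([^"]*)"', line)
        if pairs:
            devices.append(dict(pairs))
    return devices

sample = (
    'NAME="sda" MODEL="QEMU HARDDISK   " SERIAL="QM00005" TYPE="disk"\n'
    'NAME="sda1" MODEL="" SERIAL="" TYPE="part"\n'
)
for dev in parse_lsblk_pairs(sample):
    print(dev["NAME"], dev["TYPE"])  # prints: sda disk / sda1 part
```

An empty result from this kind of parsing, as in your case, would simply mean lsblk itself reported no data drives.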

The same command here in this example vm after updating to 3.8-10.02 (testing channel) gives the exact same output.

Could you post the output of that same command on your problem system with 3.8-10 and then with 3.8-10.02 testing; there shouldn't be any differences, but if there are then that could be the cause of the problem. Thanks.
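If it helps, the two captures can be compared mechanically with diff; a sketch with stand-in file contents (on the real systems you would redirect the lsblk command above into the two files instead):

```shell
# Stand-in captures; replace with redirected lsblk output from each version.
cat > /tmp/lsblk-3.8-10.txt <<'EOF'
NAME="sda" TYPE="disk"
NAME="sdb" TYPE="disk"
EOF
cat > /tmp/lsblk-3.8-10.02.txt <<'EOF'
NAME="sda" TYPE="disk"
EOF
# Lines prefixed '<' are devices only the older lsblk reported.
diff /tmp/lsblk-3.8-10.txt /tmp/lsblk-3.8-10.02.txt || true
```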

If you paste the command results here, a triple backtick on the lines before and after will aid in its reading, as otherwise it's not very eye friendly.

Also if you run:

tail -n 20 -f /opt/rockstor/var/log/rockstor.log

and then press the Rescan button in the Disks screen, what is the output?

Thanks.

Output from the 3.8-10.02 box of ‘/usr/bin/lsblk -P -o NAME,MODEL,SERIAL,SIZE,TRAN,VENDOR,HCTL,TYPE,FSTYPE,LABEL,UUID’ with 1 data disk attached but NOT seen in the GUI:

NAME="fd0" MODEL="" SERIAL="" SIZE="4K" TRAN="" VENDOR="" HCTL="" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="sda" MODEL="Virtual disk    " SERIAL="" SIZE="20G" TRAN="spi" VENDOR="VMware  " HCTL="3:0:0:0" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="sda1" MODEL="" SERIAL="" SIZE="500M" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="ext4" LABEL="" UUID="12936957-4a19-457f-a8a0-e55368b82830"
NAME="sda2" MODEL="" SERIAL="" SIZE="2G" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="swap" LABEL="" UUID="edc4f918-2282-4317-b01b-77df15bae459"
NAME="sda3" MODEL="" SERIAL="" SIZE="17.5G" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="btrfs" LABEL="rockstor_rockstor" UUID="561bccc2-0281-474d-b481-9faaead4f241"
NAME="sr0" MODEL="VMware IDE CDR10" SERIAL="10000000000000000001" SIZE="707M" TRAN="ata" VENDOR="NECVMWar" HCTL="1:0:0:0" TYPE="rom" FSTYPE="iso9660" LABEL="Rockstor 3 x86_64" UUID="2015-12-11-16-37-13-00"

Output from the tail cmd:

[21/Dec/2015 11:14:45] INFO [storageadmin.views.disk:69] update_disk_state() Called
[21/Dec/2015 11:14:45] INFO [storageadmin.views.disk:81] Deleting duplicate or fake (by serial) Disk db entry. Serial = fake-serial-c9f74217-b2c7-4b26-89a4-8d795aeafb81

@whitey Yes, that is what I suspected: there is no data disk reported there by lsblk, only the sda system disk.
Sorry to be a pain, but what does the same lsblk command produce when you are on plain 3.8-10 with the same disk arrangement?

Also note that your system disk does not have a serial assigned to it, at least according to this lsblk. That is a must to sort out (in the hypervisor), and for legacy purposes don't make that serial "sda". This should at least sort out the system disk entry. I'm pretty sure the fake-serial log entry is a result of the system disk having an empty serial number, so a fake one has to be assigned temporarily by scan_disks.
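The fake-serial mechanism might look something like this sketch; this is an assumption based on the "fake-serial-..." log entry format above, not Rockstor's actual code, and ensure_serial is a made-up name:

```python
import uuid

def ensure_serial(serial, seen):
    """Return the device serial if usable, otherwise a temporary placeholder.

    Empty or already-seen serials get a 'fake-serial-' + uuid4 value,
    matching the log entry format seen above (guessed mechanism).
    """
    if serial and serial not in seen:
        seen.add(serial)
        return serial
    return "fake-serial-" + str(uuid.uuid4())

seen = set()
print(ensure_serial("QM00005", seen))  # real serial passes through unchanged
print(ensure_serial("", seen))         # empty serial gets a fake placeholder
```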

Thanks. I think we are narrowing down on this one.

By the way if you surround your command output in the forum with 3 back ticks on the line before it and after it then it displays in an easier on the eye manner.

I'm currently thinking that, with your setup, it looks like you may need to downgrade util-linux to the version that worked for you; best to double check though with the above output, as then we can see for sure if that is it. Also, I am as yet unaware of the knock-on effects on other packages.

Dang it, I got all impatient and upgraded on a fresh reload prior to the lsblk cmd. I can reload now, but here are updated screenshots showing 3.8-10 vs. 3.8-10.02 (it goes from a wipe disk option to a delete ONLY option after the upgrade). Will reload and give you the other output. I think I know how to format it now, sorry about that.

Notice how it is changing from generic /dev/sdb/c/d/e/f to UUID-looking entries for names; is that intended?

Here's what a base 3.8-10 system shows. Granted, only one disk shows as wipeable (the only one I prepared, /dev/sdb); at least the others are detected, and I'm sure if I go through my wipe process I could use them all. (Still can't put my finger on why the disks seem to be SOOO sensitive in this setup; I never had these hoops to jump through on the OmniOS, FreeNAS, napp-it, and ZoL storage appliance setups I've used in the past.)

[root@rocket ~]# /usr/bin/lsblk -P -o NAME,MODEL,SERIAL,SIZE,TRAN,VENDOR,HCTL,TYPE,FSTYPE,LABEL,UUID
NAME="sda" MODEL="Virtual disk    " SERIAL="" SIZE="20G" TRAN="spi" VENDOR="VMware  " HCTL="3:0:0:0" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="sda1" MODEL="" SERIAL="" SIZE="500M" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="ext4" LABEL="" UUID="fa48f475-0b12-4227-bec7-2b4bc0eec7c7"
NAME="sda2" MODEL="" SERIAL="" SIZE="2G" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="swap" LABEL="" UUID="9dea59e9-ea76-4353-afe8-9c271e41fe61"
NAME="sda3" MODEL="" SERIAL="" SIZE="17.5G" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="btrfs" LABEL="rockstor_rockstor" UUID="62e6cf41-f931-4e01-bdf6-4e786d1b4061"
NAME="sdb" MODEL="HUSSL4010BSS600 " SERIAL="5000cca0131a32dc" SIZE="93.2G" TRAN="sas" VENDOR="HITACHI " HCTL="2:0:0:0" TYPE="disk" FSTYPE="ext4" LABEL="" UUID="606f6e64-1367-4f8d-8349-033a58524994"
NAME="sdc" MODEL="HUSSL4010BSS600 " SERIAL="5000cca0132a3104" SIZE="93.2G" TRAN="sas" VENDOR="HITACHI " HCTL="2:0:1:0" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="sdd" MODEL="MK0200GCTYV     " SERIAL="BTTV4082018N200GGN" SIZE="186.3G" TRAN="sas" VENDOR="ATA     " HCTL="2:0:2:0" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="sde" MODEL="INTEL SSDSC2BA20" SERIAL="BTTV438004PY200GGN" SIZE="186.3G" TRAN="sas" VENDOR="ATA     " HCTL="2:0:3:0" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="sdf" MODEL="HUSSL4010BSS600 " SERIAL="5000cca0132a1064" SIZE="93.2G" TRAN="sas" VENDOR="HITACHI " HCTL="2:0:4:0" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="sdg" MODEL="HUSSL4010BSS600 " SERIAL="5000cca0132a1018" SIZE="93.2G" TRAN="sas" VENDOR="HITACHI " HCTL="2:0:5:0" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="sdh" MODEL="MK0200GCTYV     " SERIAL="BTTV2493011D200GGN" SIZE="186.3G" TRAN="sas" VENDOR="ATA     " HCTL="2:0:6:0" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="sdi" MODEL="INTEL SSDSC2BA20" SERIAL="BTTV438004U0200GGN" SIZE="186.3G" TRAN="sas" VENDOR="ATA     " HCTL="2:0:7:0" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="sr0" MODEL="VMware IDE CDR10" SERIAL="10000000000000000001" SIZE="707M" TRAN="ata" VENDOR="NECVMWar" HCTL="1:0:0:0" TYPE="rom" FSTYPE="iso9660" LABEL="Rockstor 3 x86_64" UUID="2015-12-11-16-37-13-00"
[root@rocket ~]#

@whitey That’s great thanks.[quote=“whitey, post:30, topic:797”]
changing from generic /dev/sdb/c/d/e/f to UUID looking entries for names, that intended?
[/quote]
Yes, that is intended. The UUID-looking entries are Rockstor's new (currently testing channel only) way to identify device names that are classed as missing: as given device names are essentially arbitrary from one boot to the next, it made sense to drop them for devices that are thought to be offline / missing. We may later ‘hide’ the uuid-derived missing device name behind a simple "missing / detached / offline" entry. But for now it is a different made-up uuid-derived string on each scan, which serves as a unique but arbitrary placeholder that can't be confused with real device names, which are reserved for scan_disks-reported attached devices.
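The placeholder naming described here could be sketched as follows; this is purely illustrative, and display_device_name is a made-up helper, not Rockstor's code:

```python
import uuid

def display_device_name(name, attached):
    """Keep the kernel name (e.g. 'sda') for attached devices; give detached
    devices a fresh uuid-derived string on each scan, so the placeholder can
    never collide with a real device name."""
    if attached:
        return name
    return str(uuid.uuid4()).replace("-", "")

print(display_device_name("sdb", attached=True))   # prints: sdb
print(display_device_name("sdb", attached=False))  # a 32-char hex placeholder
```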

The reason 3.8-10.02 offers only remove is because it can no longer see the drives, as per your earlier post of the lsblk command that had no data drive entries; whereas your most recent post of the same lsblk command indicates a bunch of them, and sure enough they show as connected in the Disks screen.

Nice formatting there, thanks.

So the options are:-
1:- reconfigure the hypervisor VMware gubbins so that the drives show up with the updated lsblk command as in the most recent post, as then Rockstor can see them.
or
2:- downgrade the installed version of lsblk to the prior version that could see the drives, as currently configured.

I'm afraid that is as far as I've gotten with this one. Option 2 seems like a real hack, so I would go with option 1, but I can't help you there as I know next to nothing of VMware.

The changes between the lsblk version that did work and the one that has just been updated by the CentOS 7.2 1511 update are available in this comparison of the two package versions. Note that at the end they include the spec file changes, where we can see the changelog. Ironically enough, two of the fixes are:-
“lsblk does not list dm-multipath and partitioned devices”
“blkid outputs nothing for some partitions (e.g. sun label)”

No, I too haven't come across such touchy drives; strange. Maybe they are just plain faulty, hence the multitude of systems that have had issues sensing them.

So here goes with the hacky workaround; if the new lsblk can't see your drives and you can't use another config to present the drives, then I'm not sure what else there is.
In case you want to try the lsblk downgrade (in 3.8-10.02), which I don't recommend, here are the downloads of util-linux and its dependencies, and the required command to install them:-

# util-linux downgrade to CentOS 7.1.1503 version:-
# https://www.rpmfind.net/linux/RPM/centos/updates/7.1.1503/x86_64/Packages/util-linux-2.23.2-22.el7_1.1.x86_64.html
curl -O ftp://195.220.108.108/linux/centos/7.1.1503/updates/x86_64/Packages/util-linux-2.23.2-22.el7_1.1.x86_64.rpm
# https://www.rpmfind.net/linux/RPM/centos/updates/7.1.1503/x86_64/Packages/libblkid-2.23.2-22.el7_1.1.x86_64.html
curl -O ftp://195.220.108.108/linux/centos/7.1.1503/updates/x86_64/Packages/libblkid-2.23.2-22.el7_1.1.x86_64.rpm
# https://www.rpmfind.net/linux/RPM/centos/updates/7.1.1503/x86_64/Packages/libmount-2.23.2-22.el7_1.1.x86_64.html
curl -O ftp://195.220.108.108/linux/centos/7.1.1503/updates/x86_64/Packages/libmount-2.23.2-22.el7_1.1.x86_64.rpm
# https://www.rpmfind.net/linux/RPM/centos/updates/7.1.1503/x86_64/Packages/libuuid-2.23.2-22.el7_1.1.x86_64.html
curl -O ftp://195.220.108.108/linux/centos/7.1.1503/updates/x86_64/Packages/libuuid-2.23.2-22.el7_1.1.x86_64.rpm

And once you have all of these say in /root then with the following command you can take your chances and change the legs you are standing on:-

yum downgrade util-linux-2.23.2-22.el7_1.1.x86_64.rpm libblkid-2.23.2-22.el7_1.1.x86_64.rpm libmount-2.23.2-22.el7_1.1.x86_64.rpm libuuid-2.23.2-22.el7_1.1.x86_64.rpm

Note that my rather thin understanding of yum/rpm is that these packages will be put back to their newer (for you, non-working) versions with the next update. So you will have to exclude them from being updated, i.e. via an exclude=packagename entry in the relevant repo config. See man 5 yum.conf

N.B. I haven’t tried this but from that man page it looks like you will need:-

exclude=util-linux libblkid libmount libuuid

in /etc/yum.conf

After the downgrade you will have to reboot to begin using the downgraded versions.

Again, please don't do this on a production server, but if the risk is acceptable then this may very well prove the issue.

If anyone has a nicer way to do this then that would be great and sorry for being such a hack.

Attempting option 2 now, as for option 1 there really is nothing odd about that setup (VMware ESXi Ent Plus) with VT-d passthrough of an LSI 2008 6Gbps HBA to attach the disks.

Thanks for options. Will report back shortly.

EDIT: No go. Tried with a downgrade on each package and a global downgrade with just the first one; neither was happy.

Also, my disks all moved to the ‘delete’ only option and I nuked them, and now rescan and fdisk -l see nothing. Version is still at 3.8-10. Will wipe a disk again and see if I can get it to ‘re-appear’.

BTW, 3 of the 4 HUSSL4010BSS600s I have had ‘initial’ issues with, but they work well now in other storage appliances. The Intel DC S3700s are rock solid, always have been, but not here. I dunno what other sanity checks/tests to execute :frowning:

@whitey I proofed the command supplied here with an upgraded 3.8-10.02, and it worked to downgrade all four packages to those found in 3.8-10, so I am sure of the instructions as given; they were cut and pasted from a terminal. Could it be that you are attempting to execute them on a 3.8-10 system prior to any updates? They are intended to downgrade lsblk in 3.8-10.02 to the version that worked for you in 3.8-10, as per your lsblk results from the two systems. After all, we are after getting the duplication fixes whilst retaining the earlier CentOS 7.1 version of lsblk that could still report details of your drives.

Pretty late here in the UK. Run:

yum info util-linux

to see what version you have.

My understanding is that 3.8-10, where the lsblk command could see your data drives, had version util-linux.x86_64 0:2.23.2-22.el7_1.1, and 3.8-10.02, or rather the CentOS 7.2 upgrades that come along with subscribing to testing and updating, will result in an upgrade to util-linux.x86_64 0:2.23.2-26.el7, which, as your response to the lsblk command indicated, could not see your data drives.

I suspect that a background upgrade has foiled us. Please check that you are on 3.8-10.02 and have all updates in place, and then try the provided commands, as I believe them to be good / working. Also note that updates will undo / replace them with the newer versions, so you will need the exclude entry to stop this happening.

Have to dash and will double check my end tomorrow hopefully.

Remember that the updates are from base / updates not from Rockstor but will be pulled in regardless of Rockstor’s version. That may be the issue here, hence the yum info util-linux.

Hope that helps, as I think your pasted responses to the lsblk commands fully explain what has happened. Maybe I have misread the versions of util-linux that work and don't work with your setup.

This very much indicates that the background update from CentOS 7.1 to 7.2, which brought with it the lsblk updates via util-linux, has taken place and had the same effect, i.e. the newer lsblk can't see your data drives. The same command is used in either version of Rockstor; only the testing version will represent the missing drives differently, as explained in my last post.

Stick with it, as I believe we are almost there; it's just me not clarifying that the updates were part of the standard upgrade process. This is not a Rockstor thing, as you can see from fdisk, lsblk, etc.

The nub of it is that your devices should show up in CentOS 7.2 1511 (within VMware), but they do not, so we are attempting a workaround.

You are right, I had not upgraded to 3.8-10.02 :slight_smile: My bad/misunderstanding.

All set now

Downgrade  4 Packages

Total size: 8.1 M
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : libuuid-2.23.2-22.el7_1.1.x86_64                                                                 1/8
  Installing : libblkid-2.23.2-22.el7_1.1.x86_64                                                                2/8
  Installing : libmount-2.23.2-22.el7_1.1.x86_64                                                                3/8
  Installing : util-linux-2.23.2-22.el7_1.1.x86_64                                                              4/8
  Cleanup    : util-linux-2.23.2-26.el7.x86_64                                                                  5/8
  Cleanup    : libmount-2.23.2-26.el7.x86_64                                                                    6/8
  Cleanup    : libblkid-2.23.2-26.el7.x86_64                                                                    7/8
  Cleanup    : libuuid-2.23.2-26.el7.x86_64                                                                     8/8
  Verifying  : util-linux-2.23.2-22.el7_1.1.x86_64                                                              1/8
  Verifying  : libuuid-2.23.2-22.el7_1.1.x86_64                                                                 2/8
  Verifying  : libblkid-2.23.2-22.el7_1.1.x86_64                                                                3/8
  Verifying  : libmount-2.23.2-22.el7_1.1.x86_64                                                                4/8
  Verifying  : libmount-2.23.2-26.el7.x86_64                                                                    5/8
  Verifying  : libblkid-2.23.2-26.el7.x86_64                                                                    6/8
  Verifying  : util-linux-2.23.2-26.el7.x86_64                                                                  7/8
  Verifying  : libuuid-2.23.2-26.el7.x86_64                                                                     8/8

Removed:
  libblkid.x86_64 0:2.23.2-26.el7         libmount.x86_64 0:2.23.2-26.el7       libuuid.x86_64 0:2.23.2-26.el7
  util-linux.x86_64 0:2.23.2-26.el7

Installed:
  libblkid.x86_64 0:2.23.2-22.el7_1.1     libmount.x86_64 0:2.23.2-22.el7_1.1   libuuid.x86_64 0:2.23.2-22.el7_1.1
  util-linux.x86_64 0:2.23.2-22.el7_1.1

Looks like it downgraded to me; I also added the exclude line in yum.conf prior to reboot.

[root@rocket ~]# yum info util-linux
Loaded plugins: changelog, fastestmirror
Loading mirror speeds from cached hostfile
 * base: repos.redrockhost.com
 * epel: mirror.sfo12.us.leaseweb.net
 * extras: mirrors.xmission.com
 * updates: mirrors.xmission.com
Installed Packages
Name        : util-linux
Arch        : x86_64
Version     : 2.23.2
Release     : 22.el7_1.1
Size        : 7.6 M
Repo        : installed
From repo   : /util-linux-2.23.2-22.el7_1.1.x86_64
Summary     : A collection of basic system utilities
URL         : http://en.wikipedia.org/wiki/Util-linux
License     : GPLv2 and GPLv2+ and LGPLv2+ and BSD with advertising and Public Domain
Description : The util-linux package contains a large variety of low-level system
            : utilities that are necessary for a Linux system to function. Among
            : others, Util-linux contains the fdisk configuration tool and the login
            : program.

[root@rocket ~]#

Still not seeing disks :frowning:

[root@rocket ~]# lsblk -P -o NAME,MODEL,SERIAL,SIZE,TRAN,VENDOR,HCTL,TYPE,FSTYPE,LABEL,UUID
NAME="sda" MODEL="Virtual disk    " SERIAL="" SIZE="20G" TRAN="spi" VENDOR="VMware  " HCTL="3:0:0:0" TYPE="disk" FSTYPE="" LABEL="" UUID=""
NAME="sda1" MODEL="" SERIAL="" SIZE="500M" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="ext4" LABEL="" UUID="fa48f475-0b12-4227-bec7-2b4bc0eec7c7"
NAME="sda2" MODEL="" SERIAL="" SIZE="2G" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="swap" LABEL="" UUID="9dea59e9-ea76-4353-afe8-9c271e41fe61"
NAME="sda3" MODEL="" SERIAL="" SIZE="17.5G" TRAN="" VENDOR="" HCTL="" TYPE="part" FSTYPE="btrfs" LABEL="rockstor_rockstor" UUID="62e6cf41-f931-4e01-bdf6-4e786d1b4061"
NAME="sr0" MODEL="VMware IDE CDR10" SERIAL="10000000000000000001" SIZE="707M" TRAN="ata" VENDOR="NECVMWar" HCTL="1:0:0:0" TYPE="rom" FSTYPE="iso9660" LABEL="Rockstor 3 x86_64" UUID="2015-12-11-16-37-13-00"
[root@rocket ~]#

Just for reference, a FreeNAS VT-d storage appliance VM sees the 4 S3700 200GB'ers and can immediately create a pool out of them, and sees the HUSSLs as well. :frowning:

@whitey OK, that's a shame then, as I've just re-checked and a fresh install of Rockstor 3.8-10 has, as I thought, version util-linux-2.23.2-22.el7_1.1, and that is exactly what we have successfully downgraded you to, apparently. So that must have been a red herring, and another upgrade is causing lsblk to no longer see the drives.

However, it is still the case that lsblk, fdisk, etc. do not see your drives after updates, so all I can now suggest is that you pursue the issue upstream with the CentOS folks, and be sure to note that you are running under VMware and the specific version (as you did here), as until the underlying OS can see the drives via the given lsblk command, Rockstor is not going to see them either. Again, I suspect that given a fresh install of Rockstor 3.8-10, which is roughly equivalent to CentOS 7.1 1503, there is still hope of a clever exclude entry serving your needs as-is, and the CentOS / VMware folks are a better bet to assist with this; but it's still a hack, and a proper fix would of course be more in keeping.

Please do report back with any findings, as this would be a great one to know how to crack; but given that this is looking like an upstream issue and not specific to Rockstor, you would be best served by others' more specialized knowledge.

There is of course a chance that the interplay of Rockstor’s kernel and the recent CentOS updates are to blame in this context but again Rockstor simply uses the well known elrepo kernel unmodified.

To confirm the above: if you install an equivalent generic CentOS 7.1 1503 and execute the same lsblk command, then apply the updates and re-execute the same lsblk command (after a reboot), you should see the same behaviour as you see with Rockstor 3.8-10 and 3.8-10.02 respectively.

Also note that there are numerous reports of drive visibility issues for guest OSs within VMware, so it looks like your circumstance is now an example of this.

If you manage to make any headway / breakthroughs please do report back as we have both put a bit into this one and it’s a shame to have gotten this far and be foiled by upstream OS updates.

Hope to hear some good news on this one in the future. I will report back here if I come across / think of anything myself.

Thanks.

Thanks for the detailed responses and dedicated support. You're right, we have really dug down into the weeds on this one. Unfortunately, I just loaded up a CentOS 1511 (7.2) VM (minimal install), slid a HUSSL4010BSS600 100GB SLC SAS SSD in, and BAM, immediately detected. Proof's in the puddin'.

Want me to interact and lay down btrfs filesystem/volume creation from the CLI? I'm not a btrfs guy, but I can go RTFM and do it from the CLI unless you have a quick tip/trick or something you'd like to see. I can also load up a CentOS 1503 (7.1) release, check it, update, and see what the result is there, like you asked; that may prove more fruitful as it is kinda in lockstep with what we are doing, but a fresh 7.2 CentOS takes off like a rocket (VMware VM, VT-d LSI HBA to VM, hotplug disks off a storage array chassis).

Thoughts?

EDIT: And btrfs laid down/mounted.

And laying down I/O

And here's (what I assume is) a raid0 across the 4 Intel DC S3700 200GB'ers and corresponding dd tests/results.

I gotta say, CentOS 7.2/1511 seems rock solid with my disks/btrfs/env (VMware ESXi, VT-d storage appliance guest VM with an LSI 2008 6Gbps HBA passed through to the guest VM OS). As stated before, I use this config VERY reliably with probably 4-5 open storage appliances (OmniOS, napp-it, FreeNAS, ZoL, and hopefully soon Rockstor). Should be nothing out of the ordinary; ALL of it (compute, storage, net) is enterprise-class gear in my home lab. Life's too short to mess around with consumer grade, pulling your hair out hehe.

3-node Supermicro X9SCM MB, 32GB ECC ram, Xeon E3-1230v2 proc, LSI 9211-8i 2008 based HBA’s, Intel X520-DA2 10G nic’s, Juniper EX3300 10G switch. A slew of ent-class ssd’s. I ain’t playin’ here hah

@whitey Fancy that, and this is in the exact arrangement, VM-wise, that Rockstor was in! Have you yet installed any updates? Yes, the CentOS 7.1 1503-then-updates run is also worth a try, info-wise that is. And make sure to note what the util-linux versions on the working and non-working (if any) setups are. I still think that we need more assistance here, as I'm a little out of ideas, but more info is always good, especially since it gives others more to work on as well, ideas-wise.

Not sure how much more help I’m going to be able to offer up though.

Also you could checkout the possible interaction with the same version of the elrepo kernel installed that is in Rockstor. That also might help to narrow down or shed light on the problem.

It’s a gnarly one this.