@whitey We just need a few more eyes on this one and we are there.
YEP, EXACT config, except I was giving RockStor stg appliance VM 2 vCPU and 4 GB memory and this lil CentOS utility/validation VM is only 1 vCPU/2GB memory. The vt-D config is the important part here since that is what is essentially being passed through to the guest VM/OS by the hypervisor for data disk presentation. vSphere 6.0U1 and vt-D is a lifesaver/rock solid here…sure of it as I have 10’s of TB’s of data behind several ZFS vt-D boxes here driving many pools of stg for years and NEVER a hiccup/data loss w/ this config and the performance is stellar!
WORKING util-linux and python ver’s in CentOS 7.2 (1511) detecting disks. Have not applied ‘yum update -y’ yet, abt to pull the trigger though.
[root@localhost ~]# yum info util-linux | grep Release
Release : 26.el7
Release : 26.el7
[root@localhost ~]# yum info python | grep Release
Release : 34.el7
[root@localhost ~]#
yum update -y
Dependencies Resolved
=================================================================================
Package Arch Version Repository Size
=================================================================================
Installing:
kernel x86_64 3.10.0-327.3.1.el7 updates 33 M
Updating:
bind-libs-lite x86_64 32:9.9.4-29.el7_2.1 updates 724 k
bind-license noarch 32:9.9.4-29.el7_2.1 updates 81 k
dracut x86_64 033-360.el7_2 updates 311 k
dracut-config-rescue x86_64 033-360.el7_2 updates 49 k
dracut-network x86_64 033-360.el7_2 updates 90 k
glibc x86_64 2.17-106.el7_2.1 updates 3.6 M
glibc-common x86_64 2.17-106.el7_2.1 updates 11 M
gmp x86_64 1:6.0.0-12.el7_1 updates 280 k
grub2 x86_64 1:2.02-0.33.el7.centos.1 updates 1.5 M
grub2-tools x86_64 1:2.02-0.33.el7.centos.1 updates 3.3 M
kernel-tools x86_64 3.10.0-327.3.1.el7 updates 2.4 M
kernel-tools-libs x86_64 3.10.0-327.3.1.el7 updates 2.3 M
libxml2 x86_64 2.9.1-6.el7_2.2 updates 666 k
logrotate x86_64 3.8.6-7.el7_2 updates 66 k
openssl x86_64 1:1.0.1e-51.el7_2.1 updates 711 k
openssl-libs x86_64 1:1.0.1e-51.el7_2.1 updates 950 k
python-perf x86_64 3.10.0-327.3.1.el7 updates 2.4 M
rdma noarch 7.2_4.1_rc6-2.el7 updates 28 k
tuned noarch 2.5.1-4.el7_2.1 updates 193 k
Transaction Summary
=================================================================================
Install 1 Package
Upgrade 19 Packages
Installed:
kernel.x86_64 0:3.10.0-327.3.1.el7
Updated:
bind-libs-lite.x86_64 32:9.9.4-29.el7_2.1
bind-license.noarch 32:9.9.4-29.el7_2.1
dracut.x86_64 0:033-360.el7_2
dracut-config-rescue.x86_64 0:033-360.el7_2
dracut-network.x86_64 0:033-360.el7_2
glibc.x86_64 0:2.17-106.el7_2.1
glibc-common.x86_64 0:2.17-106.el7_2.1
gmp.x86_64 1:6.0.0-12.el7_1
grub2.x86_64 1:2.02-0.33.el7.centos.1
grub2-tools.x86_64 1:2.02-0.33.el7.centos.1
kernel-tools.x86_64 0:3.10.0-327.3.1.el7
kernel-tools-libs.x86_64 0:3.10.0-327.3.1.el7
libxml2.x86_64 0:2.9.1-6.el7_2.2
logrotate.x86_64 0:3.8.6-7.el7_2
openssl.x86_64 1:1.0.1e-51.el7_2.1
openssl-libs.x86_64 1:1.0.1e-51.el7_2.1
python-perf.x86_64 0:3.10.0-327.3.1.el7
rdma.noarch 0:7.2_4.1_rc6-2.el7
tuned.noarch 0:2.5.1-4.el7_2.1
Complete!
[root@localhost ~]#
So after FULL system update all is still good. util-linux/python did NOT get updated/need updates, lsblk, btrfs mounts still happy as piggies in mud.
[root@localhost ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 30G 0 disk
├─sda1 8:1 0 500M 0 part /boot
└─sda2 8:2 0 29.5G 0 part
├─centos-root 253:0 0 27.5G 0 lvm /
└─centos-swap 253:1 0 2G 0 lvm [SWAP]
sdb 8:16 0 93.2G 0 disk /btrfs
sdc 8:32 0 186.3G 0 disk /btrfs-r0
sdd 8:48 0 186.3G 0 disk
sde 8:64 0 186.3G 0 disk
sdf 8:80 0 186.3G 0 disk
sr0 11:0 1 1024M 0 rom
[root@localhost ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos-root 28G 1021M 27G 4% /
devtmpfs 911M 0 911M 0% /dev
tmpfs 921M 0 921M 0% /dev/shm
tmpfs 921M 8.5M 912M 1% /run
tmpfs 921M 0 921M 0% /sys/fs/cgroup
/dev/sda1 497M 183M 314M 37% /boot
tmpfs 185M 0 185M 0% /run/user/0
/dev/sdb 94G 9.9G 82G 11% /btrfs
/dev/sdc 746G 9.9G 734G 2% /btrfs-r0
[root@localhost ~]#
Color me THOROUGHLY perplexed! Got a rockstor tarball/build I can lay down on top of this working setup or can I safely assume it is a BIT more complicated/complex than that?
@whitey Well look at you go, this is great. And are you game to try the elrepo kernel?
On a Rockstor 3.8-10.03 we have:-
uname -a
Linux rockstord 4.2.5-1.el7.elrepo.x86_64 #1 SMP Tue Oct 27 12:32:38 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux
I’m afraid the tarball / build question is more one for @suman, but you will need the elrepo kernel anyway really as 3.10 is ages old for btrfs.
Anytime now someone will get fed up with all this and jump in with a nice obvious point that we both missed, here’s hoping.
What, you guys already have a .03 update release out for 3.8-10 or is that internal?
I am certainly game/down to try whatever out, long as we think/feel it has ‘legs’. This is all test/boom boom env stuff for me, can spin/up and tear down in a matter of mins. I just don’t wanna bother anyone if the ‘bang is not worth the buck’ as in if we think it is specific to me/my env (I don’t think it is/can be but hey I’ll keep an open mind) then let’s scrap it but if we feel there may be a ‘red headed gremlin’ running rampant that ‘may’ bite anyone else let’s press on.
What’s the easiest way to implement elrepo kernel? Do I need to uninstall/blacklist or just move kernel boot order in grub.conf/menu.lst?
May need a lil hand holding but not much.
One of these days we’ll put this one to bed!
I think it’s certainly worth the effort and prudent to install the elrepo kernel at least, given it’s so key to btrfs and our little investigation.
My understanding re elrepo kernel install is add the repo and install as usual:-
Instructions are available at http://elrepo.org/tiki/tiki-index.php
I know that’s a rtm type answer but other than that I don’t really know.
But you will want their ml (main line) variant, currently we have:-
yum list installed | grep kernel
kernel-ml.x86_64 4.2.5-1.el7.elrepo @anaconda/3
OK just had a look and I see a problem here, they appear to only now have 4.3.2-1 and 4.3.3-1 so we can’t directly compare, which is a shame.
@suman how do we install the exact same kernel as Rockstor carries?
As for 3.8-10.03, it’s a minor fix up from what you have already tried in the testing channel.
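On the boot-order half of the question, here is a minimal sketch, assuming a stock CentOS 7 GRUB2 layout with `GRUB_DEFAULT=saved` (the CentOS 7 default as far as I know). `list_menu_entries` is a hypothetical helper, not part of any package; it only prints the index of each top-level entry so that the index can be handed to `grub2-set-default`. Nothing needs uninstalling or blacklisting for this; both kernels stay installed side by side.

```shell
# Hypothetical helper: print "index: title" for each top-level menuentry
# in the GRUB2 config given as $1. The default path is an assumption
# (BIOS install); UEFI installs keep grub.cfg elsewhere.
list_menu_entries() {
    cfg="${1:-/boot/grub2/grub.cfg}"
    awk -F"'" '$1 == "menuentry " { printf "%d: %s\n", i++, $2 }' "$cfg"
}

# Typical sequence on CentOS 7 (as root; shown as comments, not executed here):
#   yum --enablerepo=elrepo-kernel install kernel-ml
#   list_menu_entries          # find the index of the elrepo entry
#   grub2-set-default 0        # 0 = first entry in the list above
#   reboot
```

Picking the entry by index keeps the stock 3.10 kernel available as a fallback from the GRUB menu.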
Updated to latest avail kernel-ml from elrepo
[root@localhost ~]# yum --enablerepo=elrepo-kernel install kernel-ml
Loaded plugins: fastestmirror
base | 3.6 kB 00:00:00
elrepo | 2.9 kB 00:00:00
elrepo-kernel | 2.9 kB 00:00:00
extras | 3.4 kB 00:00:00
updates | 3.4 kB 00:00:00
elrepo-kernel/primary_db | 828 kB 00:00:01
Loading mirror speeds from cached hostfile
* base: mirror.fusioncloud.co
* elrepo: dfw.mirror.rackspace.com
* elrepo-kernel: dfw.mirror.rackspace.com
* extras: centos.mirror.lstn.net
* updates: centos.firehosted.com
Resolving Dependencies
--> Running transaction check
---> Package kernel-ml.x86_64 0:4.3.3-1.el7.elrepo will be installed
--> Finished Dependency Resolution
Dependencies Resolved
=================================================================================
Package Arch Version Repository Size
=================================================================================
Installing:
kernel-ml x86_64 4.3.3-1.el7.elrepo elrepo-kernel 37 M
Transaction Summary
=================================================================================
Install 1 Package
Total download size: 37 M
Installed size: 169 M
Is this ok [y/d/N]: y
Downloading packages:
warning: /var/cache/yum/x86_64/7/elrepo-kernel/packages/kernel-ml-4.3.3-1.el7.elrepo.x86_64.rpm: Header V4 DSA/SHA1 Signature, key ID baadae52: NOKEY
Public key for kernel-ml-4.3.3-1.el7.elrepo.x86_64.rpm is not installed
kernel-ml-4.3.3-1.el7.elrepo.x86_64.rpm | 37 MB 00:00:03
Retrieving key from file:///etc/pki/rpm-gpg/RPM-GPG-KEY-elrepo.org
Importing GPG key 0xBAADAE52:
Userid : "elrepo.org (RPM Signing Key for elrepo.org) <secure@elrepo.org>"
Fingerprint: 96c0 104f 6315 4731 1e0b b1ae 309b c305 baad ae52
Package : elrepo-release-7.0-2.el7.elrepo.noarch (installed)
From : /etc/pki/rpm-gpg/RPM-GPG-KEY-elrepo.org
Is this ok [y/N]: y
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Warning: RPMDB altered outside of yum.
Installing : kernel-ml-4.3.3-1.el7.elrepo.x86_64 1/1
Verifying : kernel-ml-4.3.3-1.el7.elrepo.x86_64 1/1
Installed:
kernel-ml.x86_64 0:4.3.3-1.el7.elrepo
Complete!
Rebooted into that kernel, all seems well still to add to the mystery.
[root@localhost ~]# uname -a
Linux localhost.localdomain 4.3.3-1.el7.elrepo.x86_64 #1 SMP Tue Dec 15 11:18:19 EST 2015 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 30G 0 disk
├─sda1 8:1 0 500M 0 part /boot
└─sda2 8:2 0 29.5G 0 part
├─centos-root 253:0 0 27.5G 0 lvm /
└─centos-swap 253:1 0 2G 0 lvm [SWAP]
sdb 8:16 0 93.2G 0 disk
sdc 8:32 0 186.3G 0 disk
sdd 8:48 0 186.3G 0 disk
sde 8:64 0 186.3G 0 disk
sdf 8:80 0 186.3G 0 disk
sr0 11:0 1 1024M 0 rom
[root@localhost ~]# btrfs filesystem show
Label: none uuid: a7ded511-cde9-4246-b0a8-156b6811926f
Total devices 1 FS bytes used 9.78GiB
devid 1 size 93.16GiB used 13.03GiB path /dev/sdb
Label: none uuid: b242eaa0-b5ab-4165-8b07-cabfb3169520
Total devices 4 FS bytes used 9.78GiB
devid 1 size 186.31GiB used 4.01GiB path /dev/sdc
devid 2 size 186.31GiB used 4.00GiB path /dev/sde
devid 3 size 186.31GiB used 3.01GiB path /dev/sdd
devid 4 size 186.31GiB used 3.01GiB path /dev/sdf
btrfs-progs v3.19.1
[root@localhost ~]#
Anyone else here find this strange?
Rockstor 3.8-10.02 (w/ elrepo kernel 4.2.5 build included in RS release)
[root@rocket ~]# dmesg | grep mpt2sas
[ 1.133956] mpt2sas version 20.100.00.00 loaded
[ 1.140070] mpt2sas0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (4047048 kB)
[ 1.211747] mpt2sas0: MSI-X vectors supported: 1, no of cores: 2, max_msix_vectors: 8
[ 1.212152] mpt2sas0-msix0: PCI-MSI-X enabled: IRQ 59
[ 1.212155] mpt2sas0: iomem(0x00000000fd5f0000), mapped(0xffffc90000d40000), size(65536)
[ 1.212157] mpt2sas0: ioport(0x0000000000004000), size(256)
[ 1.352180] mpt2sas0: Allocated physical memory: size(7445 kB)
[ 1.352183] mpt2sas0: Current Controller Queue Depth(3307), Max Controller Queue Depth(3432)
[ 1.352184] mpt2sas0: Scatter Gather Elements per IO(128)
[ 31.382445] mpt2sas0: _base_event_notification: timeout
[ 31.382491] mpt2sas0: sending message unit reset !!
[ 31.387923] mpt2sas0: message unit reset: SUCCESS
[ 31.607713] mpt2sas0: failure at drivers/scsi/mpt2sas/mpt2sas_scsih.c:8237/_scsih_probe()!
[root@rocket ~]#
CentOS 7.2 (1511 w/ elrepo 4.3.3 kernel)
[root@localhost ~]# dmesg | grep mpt2sas
[ 0.991351] mpt2sas version 20.100.00.00 loaded
[ 0.998795] mpt2sas0: 32 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (2048740 kB)
[ 1.075498] mpt2sas0: MSI-X vectors supported: 1, no of cores: 1, max_msix_vectors: 8
[ 1.076066] mpt2sas0-msix0: PCI-MSI-X enabled: IRQ 58
[ 1.076068] mpt2sas0: iomem(0x00000000fd5f0000), mapped(0xffffc90000400000), size(65536)
[ 1.076069] mpt2sas0: ioport(0x0000000000004000), size(256)
[ 1.196721] mpt2sas0: Allocated physical memory: size(4964 kB)
[ 1.196724] mpt2sas0: Current Controller Queue Depth(3307), Max Controller Queue Depth(3432)
[ 1.196725] mpt2sas0: Scatter Gather Elements per IO(128)
[ 1.257870] mpt2sas0: LSISAS2008: FWVersion(19.00.00.00), ChipRevision(0x03), BiosVersion(07.37.00.00)
[ 1.257874] mpt2sas0: Dell 6Gbps SAS HBA: Vendor(0x1000), Device(0x0072), SSVID(0x1028), SSDID(0x1F1C)
[ 1.257875] mpt2sas0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)
[ 1.258517] mpt2sas0: sending port enable !!
[ 1.262039] mpt2sas0: host_add: handle(0x0001), sas_addr(0x5b8ca3a0f298aa00), phys(8)
[ 1.292223] mpt2sas0: port enable: SUCCESS
[root@localhost ~]#
This line concerns me/leads me to believe there is something NOT happy w/ the mpt2sas driver/firmware (message unit reset: SUCCESS, then a failure in _scsih_probe(), instead of port enable: SUCCESS):
[ 31.607713] mpt2sas0: failure at drivers/scsi/mpt2sas/mpt2sas_scsih.c:8237/_scsih_probe()!
Also see the difference in 64 BIT PCI BUS DMA vs. 32 BIT PCI BUS DMA between not working and working.
Notice on the rockstor build it doesn’t even get to this: (that is my LSI 2008 HBA in IT mode firmware v19 which I explicitly flash all my LSI HBA’s to, work DARN well in SW raid setups like that)
[ 1.257870] mpt2sas0: LSISAS2008: FWVersion(19.00.00.00), ChipRevision(0x03), BiosVersion(07.37.00.00)
[ 1.257874] mpt2sas0: Dell 6Gbps SAS HBA: Vendor(0x1000), Device(0x0072), SSVID(0x1028), SSDID(0x1F1C)
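Since the good and bad boots differ in how the probe ends, a quick check can save eyeballing dmesg after every reboot. This is a rough sketch only: `mpt2sas_probe_status` is a made-up helper name, and it keys purely on the two messages quoted in this thread, not on any documented driver contract.

```shell
# Hypothetical helper: read `dmesg` text on stdin and report how the
# mpt2sas probe ended, based only on the messages seen in this thread.
mpt2sas_probe_status() {
    log=$(grep 'mpt2sas' || true)
    if printf '%s\n' "$log" | grep -q 'port enable: SUCCESS'; then
        echo "ok"
    elif printf '%s\n' "$log" | grep -q '_scsih_probe()'; then
        echo "probe-failed"
    else
        echo "unknown"
    fi
}

# Usage: dmesg | mpt2sas_probe_status
```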
@whitey Yet more good findings I would say. There is the matter that the kernel on Rockstor didn’t change between 3.8-10 (working) and its .02 (not working for you) update, but a whole lot of other stuff did via the CentOS updates, and maybe the system was unable to present those disks with the driver in that state after the updates when before them it muddled through and did present them.
I definitely think you are onto something and it’s great to see those drives showing up. So this looks like some kind of regression that was aggravated by the 4.2.5-1 ml elrepo kernel in Rockstor that isn’t suffered by the 4.3.3 elrepo kernel, or the default CentOS kernel for that matter, which is great news. Of course we would need to install the 4.2.5-1 in the vanilla CentOS 7.2 to know but I’m fairly sure that there are plans to upgrade the default kernel in Rockstor anyway soon so this may all fall into place. Given that the btrfs-progs and kernel are best kept in step maybe the next thing to try is installing this newer elrepo kernel in an updated Rockstor 3.8-10.03 and see if it sorts things. But as I say the btrfs-progs are best kept in line with the kernel version however these have recently been updated in Rockstor so probably new enough for this test (non production for now).
yum list installed | grep btrfs
btrfs-progs.x86_64 4.3.1-0.rockstor @Rockstor-Testing
I suspect we are waiting on elrepo releasing a 4.4 kernel, not sure. I think @suman is currently pretty deep into the rock-on docker stuff at the moment so the kernel upgrade may be taking a back seat.
To my knowledge (which is very limited I might add) Rockstor uses a completely unaltered elrepo kernel, there just isn’t the people power for us to roll a custom one. Although @Dragon2611 in the forums has been compiling his own, and making it available for people to try.
Note that to change the default kernel expected by Rockstor, so that you don’t get the warning message etc, see another post and its answer from @Dragon2611 again:-
http://forum.rockstor.com/t/override-kernel-autoboot/798
The difference re 32 & 64 bit is strange / interplay with motherboard pci drivers maybe. You could look at available module options eg:-
modinfo mpt2sas
filename: /lib/modules/4.2.5-1.el7.elrepo.x86_64/kernel/drivers/scsi/mpt2sas/mpt2sas.ko
version: 20.100.00.00
license: GPL
description: LSI MPT Fusion SAS 2.0 Device Driver
author: Avago Technologies <MPT-FusionLinux.pdl@avagotech.com>
srcversion: F74E004728BCB0A8B19A944
alias: pci:v00001000d0000007Esv*sd*bc*sc*i*
alias: pci:v00001000d0000006Esv*sd*bc*sc*i*
alias: pci:v00001000d00000087sv*sd*bc*sc*i*
alias: pci:v00001000d00000086sv*sd*bc*sc*i*
alias: pci:v00001000d00000085sv*sd*bc*sc*i*
alias: pci:v00001000d00000084sv*sd*bc*sc*i*
alias: pci:v00001000d00000083sv*sd*bc*sc*i*
alias: pci:v00001000d00000082sv*sd*bc*sc*i*
alias: pci:v00001000d00000081sv*sd*bc*sc*i*
alias: pci:v00001000d00000080sv*sd*bc*sc*i*
alias: pci:v00001000d00000065sv*sd*bc*sc*i*
alias: pci:v00001000d00000064sv*sd*bc*sc*i*
alias: pci:v00001000d00000077sv*sd*bc*sc*i*
alias: pci:v00001000d00000076sv*sd*bc*sc*i*
alias: pci:v00001000d00000074sv*sd*bc*sc*i*
alias: pci:v00001000d00000072sv*sd*bc*sc*i*
alias: pci:v00001000d00000070sv*sd*bc*sc*i*
depends: scsi_transport_sas,raid_class
intree: Y
vermagic: 4.2.5-1.el7.elrepo.x86_64 SMP mod_unload modversions
parm: logging_level: bits for enabling additional logging info (default=0)
parm: max_sectors:max sectors, range 64 to 32767 default=32767 (ushort)
parm: missing_delay: device missing delay , io missing delay (array of int)
parm: max_lun: max lun, default=16895 (int)
parm: diag_buffer_enable: post diag buffers (TRACE=1/SNAPSHOT=2/EXTENDED=4/default=0) (int)
parm: prot_mask: host protection capabilities mask, def=7 (int)
parm: max_queue_depth: max controller queue depth (int)
parm: max_sgl_entries: max sg entries (int)
parm: msix_disable: disable msix routed interrupts (default=0) (int)
parm: max_msix_vectors: max msix vectors (int)
parm: mpt2sas_fwfault_debug: enable detection of firmware fault and halt firmware - (default=0)
parm: disable_discovery: disable discovery (int)
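To pull just the option names out of that listing, and to show the syntax for setting one, something like the following could work. `list_module_parms` is a hypothetical helper, the modprobe.d file name is an assumption, and `msix_disable=1` is picked purely as a syntax example from the list above; nothing in this thread establishes that any of these options affects the probe timeout.

```shell
# Hypothetical helper: reduce `modinfo` output on stdin to bare parameter names.
list_module_parms() {
    sed -n 's/^parm:[[:space:]]*\([A-Za-z0-9_]*\):.*/\1/p'
}

# Usage: modinfo mpt2sas | list_module_parms
#
# Setting an option persistently goes through a modprobe.d file
# (file name is an assumption), e.g. as root:
#   echo "options mpt2sas msix_disable=1" > /etc/modprobe.d/mpt2sas.conf
# If the driver is loaded from the initramfs, the initramfs would likely
# need regenerating (dracut -f) before the option takes effect.
```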
Building kernel on rockstor is actually pretty easy, off the top of my head I believe it’s as follows:
Install the development tools packages:
yum group install "Development Tools"
Install gzip if it’s not installed (yum install gzip)
1. cd to /usr/src
2. grab the kernel sources from kernel.org
3. extract the file
4. symlink the extracted dir to /usr/src/linux (I don’t think this is actually required but it’s slightly neater)
5. enter the directory
6. make clean
7. cp /boot/config-4.2.2-1.el7.elrepo.x86_64 .config (Note: replace the 4.2.2 with the current kernel version, tab completion should list the available options)
8. make menuconfig
8a) press load and load the .config file (should be the default)
8b) make any changes needed to compile any extra drivers you need, if you are careful you can also disable stuff you don’t need, I take the approach if I don’t know what something does I leave it on whatever it was set to before.
8c) press save then exit
9. make rpm (If you are on SSH and prone to dropping then “screen make rpm” might be a better option)
9a) go get a cup of coffee and find something to do for a while, depending on the speed of your rockstor box this could take anything from 15 minutes to several hours.
10. if all worked you should be able to find a load of RPMs in /root for the kernel, headers, etc.
11. install with your favourite package manager.
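The steps above can be strung together roughly as below. Treat it as a sketch under assumptions rather than a tested recipe: the function name is made up, the tarball URL assumes a 4.x kernel and the cdn.kernel.org layout, and the running kernel’s config is used as the starting point. Nothing runs until the function is called.

```shell
# Sketch of the build steps as one function (runs in a subshell so the
# caller's working directory is untouched). Assumptions: 4.x kernel,
# tarball layout on cdn.kernel.org, running kernel's config as the base.
build_kernel_rpm() (
    version="$1"                        # e.g. 4.3.3
    if [ -z "$version" ]; then
        echo "usage: build_kernel_rpm <kernel-version>" >&2
        exit 1
    fi
    cd /usr/src || exit 1
    curl -fLO "https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-${version}.tar.xz" || exit 1
    tar -xf "linux-${version}.tar.xz" || exit 1
    ln -sfn "linux-${version}" linux    # optional, just neater
    cd linux || exit 1
    make clean
    cp "/boot/config-$(uname -r)" .config || exit 1
    make menuconfig                     # load .config, adjust, save, exit
    make rpm                            # slow; I believe newer trees call this target rpm-pkg
)

# Usage (as root, ideally inside screen): build_kernel_rpm 4.3.3
```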
The only gotcha I did run into is that yum may not auto-remove custom kernels when you upgrade to a newer one, so after a while you will have to delete some older ones or /boot will get full and you won’t be able to install any more.
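For the /boot housekeeping, a small sketch of how the removal candidates could be picked. `stale_kernels` is a hypothetical helper that only sorts version strings; it needs GNU sort and head, and it knows nothing about which kernel is currently running, so check against `uname -r` before removing anything.

```shell
# Hypothetical helper: read kernel versions on stdin (one per line) and
# print all but the newest N (default 2), i.e. the removal candidates.
# Relies on GNU sort -V and GNU head's negative line count.
stale_kernels() {
    keep="${1:-2}"
    sort -V | head -n "-${keep}"
}

# Usage (as root), assuming the custom RPMs use the package name "kernel":
#   rpm -q kernel --qf '%{VERSION}-%{RELEASE}\n' | stale_kernels 2
```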
@Dragon2611 Thanks, this is a great post. I didn’t realise the rpm could be made so easily. Is make oldconfig still an option these days? If so it may well streamline the process a little, especially if one is just going with the defaults.
I’ve not actually tried,
Also one thing I did change between the ELREPO and my kernel was to compile BTRFS directly into the kernel rather than rely on loading it as a module, I figured given how extensively rockstor uses it that it didn’t seem like a bad idea.
Cute, another reboot and now I see this garbage on the CentOS 7.2 (1511) box (the one I upgraded to kernel 4.3.3):
[root@localhost ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 30G 0 disk
├─sda1 8:1 0 500M 0 part /boot
└─sda2 8:2 0 29.5G 0 part
├─centos-root 253:0 0 27.5G 0 lvm /
└─centos-swap 253:1 0 2G 0 lvm [SWAP]
sr0 11:0 1 4G 0 rom
[root@localhost ~]# dmesg | grep mpt2sas
[ 0.775744] mpt2sas version 20.100.00.00 loaded
[ 0.781703] mpt2sas0: 32 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (2048740 kB)
[ 0.858546] mpt2sas0: MSI-X vectors supported: 1, no of cores: 1, max_msix_vectors: 8
[ 0.859041] mpt2sas0-msix0: PCI-MSI-X enabled: IRQ 58
[ 0.859043] mpt2sas0: iomem(0x00000000fd5f0000), mapped(0xffffc90000460000), size(65536)
[ 0.859044] mpt2sas0: ioport(0x0000000000004000), size(256)
[ 0.976310] mpt2sas0: Allocated physical memory: size(4964 kB)
[ 0.976313] mpt2sas0: Current Controller Queue Depth(3307), Max Controller Queue Depth(3432)
[ 0.976313] mpt2sas0: Scatter Gather Elements per IO(128)
[ 31.005253] mpt2sas0: _base_event_notification: timeout
[ 31.005269] mpt2sas0: sending message unit reset !!
[ 31.007239] mpt2sas0: message unit reset: SUCCESS
[ 31.055515] mpt2sas0: failure at drivers/scsi/mpt2sas/mpt2sas_scsih.c:8498/_scsih_probe()!
[root@localhost ~]#
If I reboot back into kernel 3.10 on the CentOS 7.2 (1511) tester box then it works.
[root@localhost ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 30G 0 disk
├─sda1 8:1 0 500M 0 part /boot
└─sda2 8:2 0 29.5G 0 part
├─centos-root 253:0 0 27.5G 0 lvm /
└─centos-swap 253:1 0 2G 0 lvm [SWAP]
sdb 8:16 0 93.2G 0 disk
sdc 8:32 0 186.3G 0 disk
sdd 8:48 0 186.3G 0 disk
sde 8:64 0 186.3G 0 disk
sdf 8:80 0 186.3G 0 disk
sr0 11:0 1 4G 0 rom
[root@localhost ~]# uname -a
Linux localhost.localdomain 3.10.0-327.3.1.el7.x86_64 #1 SMP Wed Dec 9 14:09:15 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost ~]#
Crazy thing is I KNOW I rebooted into the 4.3.3 kernel and had it working at some point late last night…SMH
Guess I DO have FW v19 on the HBA, looks like CentOS 7.2 is loading driver v20. I know FreeNAS moved to driver 20 and firmware 20 otherwise the GUI complains. I could start there I suppose. I have a couple spare 9201-8i in IT v19 mode, will flash up to v20 and attach the stg appliance VM in vt-D mode and see what happens.
Even in kernel 3.10 I have same mpt2sas driver so I may be chasing a ghost.
mpt2sas version 20.100.00.00 loaded
mpt2sas0: LSISAS2008: FWVersion(19.00.00.00), ChipRevision(0x03), BiosVersion(07.37.00.00)
[ 1.094891] mpt2sas0: Dell 6Gbps SAS HBA: Vendor(0x1000), Device(0x0072), SSVID(0x1028), SSDID(0x1F1C)
I think this card is a Dell Perc H310 which is essentially a re-branded LSI 9211-8i (all sas 2008 HBA’s), but I reflash them all to LSI official IT mode firmware version 19. Maybe it’s time to move to version 20 of firmware as time/drivers/distro’s march on, v19’s been rock solid for me for at least the last 4-5 years though it seems.
Anyone else running a similar setup here? Essentially an AIO (all-in-one) build w/ a hypervisor and vt-d pass thru stg ctrls to the VM driving the array/nas?
EDIT: Manually set up a raid10 btrfs pool across those 4 hussl sas ssd’s and configured NFS (still booted back into 3.10 kernel as the 4.3.3 kernel sh|t the bed apparently, or had that one flukie ‘Hey I see devices’ complex, now nothing), mounted to vSphere/ESXi hosts, crushing 350MB/s in/write and 300MB/s out/read to another ZFS NFS backed pool and my vSAN AFA floating a few VM’s around using sVMotion.
mkfs.btrfs -m raid10 -d raid10 /dev/sdb /dev/sdc /dev/sdd /dev/sde
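For anyone repeating this by hand, a small guard around that command might be handy, since btrfs raid10 needs at least four devices. `btrfs_raid10_cmd` is a hypothetical wrapper that only prints the command for review; it does not format anything itself.

```shell
# Hypothetical wrapper: print (not run) the mkfs.btrfs command for a raid10
# pool, refusing fewer than four devices since btrfs raid10 needs four.
btrfs_raid10_cmd() {
    if [ "$#" -lt 4 ]; then
        echo "raid10 needs at least 4 devices, got $#" >&2
        return 1
    fi
    echo "mkfs.btrfs -m raid10 -d raid10 $*"
}

# Usage: btrfs_raid10_cmd /dev/sdb /dev/sdc /dev/sdd /dev/sde
# After running the printed command as root, mounting any one member device
# mounts the whole pool, and `btrfs filesystem show` lists the members.
```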
You guys are gonna force me to become a btrfs cli junkie yet! But I LOVE the dashboard!
KNEW I wasn’t losing my damn mind, now w/ 4.3.3 booted it AGAIN sees disks…inconsistent to say the least. Just flashed up a spare LSI 2008 6Gbps HBA/ctrl to firmware v20. Gonna see if that helps.
Have NOT given up yet!
Flash of a LSI 6Gbps HBA to v20 didn’t seem to help much, still very inconsistent drive identification/sensing. My CentOS box DOES currently see the disks w/ 4.3.3 kernel, I tried to unblacklist util-linux goodies and apply a yum update -y and it re-upgraded the 4-5 util-linux and deps, even updated to RS 3.8-10.03. NO GO
Tried to lay down a 4.3.3 kernel on RS and boot into that, akin to what I did on my CentOS 7.2 build w/ 3.10 kernel to take it to 4.3.3. NO GO (move from RS 4.2.5 kernel to ELREPO 4.3.3).
I stink.
@whitey I have exactly the same problem. I’m using a Dell H310 with P20 IT firmware with vt-d passthrough in ESXi to linux. Any time I try something with kernel 4.2 or newer, I have the same timeout after 30 seconds.
[ 3.944999] mpt2sas0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (4047108 kB)
[ 4.230006] mpt2sas0: MSI-X vectors supported: 1, no of cores: 2, max_msix_vectors: 8
[ 4.232353] mpt2sas0-msix0: PCI-MSI-X enabled: IRQ 60
[ 4.232355] mpt2sas0: iomem(0x00000000fd3f0000), mapped(0xffffc90000780000), size(65536)
[ 4.232356] mpt2sas0: ioport(0x0000000000006000), size(256)
[ 4.670263] mpt2sas0: Allocated physical memory: size(7445 kB)
[ 4.670266] mpt2sas0: Current Controller Queue Depth(3307), Max Controller Queue Depth(3432)
[ 4.670267] mpt2sas0: Scatter Gather Elements per IO(128)
[ 34.931893] mpt2sas0: _base_event_notification: timeout
[ 34.931917] mpt2sas0: sending message unit reset !!
[ 34.939852] mpt2sas0: message unit reset: SUCCESS
[ 35.090297] mpt2sas0: failure at /build/linux-AxjFAn/linux-4.2.0/drivers/scsi/mpt2sas/mpt2sas_scsih.c:8237/_scsih_probe()!
The only reliable way I’ve found to make it work in 4.2 is to boot into something older like 4.1 first and then reboot back into 4.2. Every time I do that it works.
4.1.6:
[ 1.084177] mpt2sas0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (4047648 kB)
[ 1.171214] mpt2sas0: MSI-X vectors supported: 1, no of cores: 2, max_msix_vectors: 8
[ 1.172392] mpt2sas0-msix0: PCI-MSI-X enabled: IRQ 60
[ 1.172394] mpt2sas0: iomem(0x00000000fd3f0000), mapped(0xffffc90000800000), size(65536)
[ 1.172395] mpt2sas0: ioport(0x0000000000006000), size(256)
[ 1.289273] mpt2sas0: Allocated physical memory: size(7445 kB)
[ 1.289275] mpt2sas0: Current Controller Queue Depth(3307), Max Controller Queue Depth(3432)
[ 1.289276] mpt2sas0: Scatter Gather Elements per IO(128)
[ 1.347695] mpt2sas0: LSISAS2008: FWVersion(20.00.04.00), ChipRevision(0x03), BiosVersion(00.00.00.00)
[ 1.347698] mpt2sas0: Dell 6Gbps SAS HBA: Vendor(0x1000), Device(0x0072), SSVID(0x1028), SSDID(0x1F1C)
[ 1.347698] mpt2sas0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)
[ 1.347976] mpt2sas0: sending port enable !!
[ 2.952518] mpt2sas0: host_add: handle(0x0001), sas_addr(0x5b8ca3a0f9f4e000), phys(8)
[ 9.087861] mpt2sas0: port enable: SUCCESS
4.2.8 after rebooting from 4.1.6:
[ 0.819676] mpt2sas version 20.100.00.00 loaded
[ 0.826784] mpt2sas0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (4047696 kB)
[ 0.903642] mpt2sas0: MSI-X vectors supported: 1, no of cores: 2, max_msix_vectors: 8
[ 0.903946] mpt2sas0-msix0: PCI-MSI-X enabled: IRQ 60
[ 0.903948] mpt2sas0: iomem(0x00000000fd3f0000), mapped(0xffffc90000780000), size(65536)
[ 0.903949] mpt2sas0: ioport(0x0000000000006000), size(256)
[ 1.022265] mpt2sas0: Allocated physical memory: size(7445 kB)
[ 1.022268] mpt2sas0: Current Controller Queue Depth(3307), Max Controller Queue Depth(3432)
[ 1.022268] mpt2sas0: Scatter Gather Elements per IO(128)
[ 1.081273] mpt2sas0: LSISAS2008: FWVersion(20.00.04.00), ChipRevision(0x03), BiosVersion(00.00.00.00)
[ 1.081276] mpt2sas0: Dell 6Gbps SAS HBA: Vendor(0x1000), Device(0x0072), SSVID(0x1028), SSDID(0x1F1C)
[ 1.081277] mpt2sas0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)
[ 1.081649] mpt2sas0: sending port enable !!
[ 1.084301] mpt2sas0: host_add: handle(0x0001), sas_addr(0x5b8ca3a0f9f4e000), phys(8)
[ 1.124829] mpt2sas0: port enable: SUCCESS
YEA, 3.8.11 build for me just worked w/ my AIO config and my picky huss4010bss600 sas ssd’s! Even laid down a pool and then deleted pool, they showed right back up as importable or wipe. wooohoooo
Pounding some sVMotions into a r10 from vSphere infra. 25K iops consistent!
Probably not the right place for it but it sure would be nice on the Disk activity chart to have another ‘view’ of I/O latency/response time right there in the same dashboard as well as your already provided IOPS across the pool/disks view. I know it can get crowded on a dashboard and you don’t want it to look ‘too busy/scattered’ but maybe something to consider.
I also DID recently do a lab upgrade/refresh from an E3 to E5 platform across my 3-node vSphere cluster but I doubt that had a thing to do w/ it. More resources sure are nice though (double cores/quadruple memory)