InfiniBand / vSphere 6 support

Hello,
I’ve been searching for a solid home lab data store OS and came across Rockstor. I like the idea of Linux vs OpenBSD since I have no experience with BSD. My main requirements are InfiniBand and vSphere 6 compatibility.

Since Rockstor is based on CentOS, am I correct in assuming I can simply install the Mellanox CentOS driver?

Does anyone have any experience with Rockstor and vSphere 6?

Thank you in advance for your time.

You are right: nothing prevents you from customizing Rockstor as you would CentOS. Please do share your experience so others in the community can benefit. Thanks!

Thank you for the follow-up. I will most likely be playing with a few different systems this weekend. I will be sure to post back on steps taken, outcome, and experience with the process.

So I did a bit of playing around this morning. I have two different Mellanox cards I can work with and got the following outcomes so far. Bit of a Linux newbie, so bear with me :grinning:

MHGA28-XTC:
Fresh install of the Rockstor image with kernel 4.1.0-1.el7.elrepo.x86_64: Mellanox card seen.
Updated to stable with kernel 4.2.2-1.el7.elrepo.x86_64: Mellanox card disappears.
Tried to install the OFED driver (after installing all required files and dependencies): kernel not supported by the install script.

I then switched to the newer card to see if it would be detected and to try the Mellanox driver.

MHQH29B-XTR:
Stable with kernel 4.2.2-1.el7.elrepo.x86_64: Mellanox card not seen (I may try a fresh install later to see if the card is present like the other).

Attempting to install via the Mellanox script reported a missing kernel-devel, so I installed it via yum.

yum install kernel-devel resulted in the error: kernel-devel-3.10.0-229 already installed. It was installed in the prior test, but is clearly for the wrong kernel, so I tried to force a download of the right files:

yum install kernel-devel-$(uname -r), result: no package found

So I’m guessing I just need to add the proper repo; I will investigate this later.
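If I understand the elrepo layout correctly, the matching devel package for a mainline kernel should come from their elrepo-kernel repository, so something like this might find it (repo and package names are my assumption from elrepo’s docs, untested here):

yum --enablerepo=elrepo-kernel install kernel-ml-devel-$(uname -r)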

You can always build your own kernel if needed.

https://www.howtoforge.com/kernel_compilation_centos - this is a fairly old tutorial, but I find it one of the easiest to follow. You can actually ignore the bit about patches and modprobe.conf, as it’s usually not needed.

Essentially, install the build tools group and ncurses-devel.
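On CentOS 7 that boils down to something like this (group name as yum ships it):

yum groupinstall "Development Tools"
yum install ncurses-devel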

You can also ignore step 2.7; the kernel gets automatically added to GRUB when you install the resulting RPM.

I appreciate the suggestion, but that would require that I have the source for the kernel, and I’m having a little difficulty finding the proper version. It seems elrepo has moved on and upgraded to 4.2.4-1. I will keep looking over the weekend to see what I can come up with, though. I’m probably just missing something simple.

@dfortier I’ve started testing the 4.2.5 kernel from elrepo. If testing goes well, it could be the kernel for the next update (3.8-10). The kernel, devel package, and headers are now in our testing repo.

For the kind of testing you are doing, you want to be on the Testing repo anyway. Once you are on it, you can just yum update kernel-ml to get 4.2.5. It should become the default; however, on every boot the default kernel will be reset to 4.2.2. Until we make 4.2.5 official/semi-official, you’ll need to work around this by issuing grubby --set-default=/boot/vmlinuz-4.2.5-1.el7.elrepo.x86_64 before every reboot to ensure that 4.2.5 boots up next time.

To install the devel package, run yum install kernel-ml-devel
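In other words, the whole sequence from a Testing-repo box looks roughly like this (kernel path taken from the grubby command above):

yum update kernel-ml
yum install kernel-ml-devel
grubby --set-default=/boot/vmlinuz-4.2.5-1.el7.elrepo.x86_64
reboot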

Let me know how it goes.

If you are building your own kernel, it will generate a kernel-headers RPM to go with it :wink:

Thanks for the info. I’m making some headway and I will report back with instructions once I get it sorted. I’m hoping to have it all figured out this weekend, but my lack of knowledge is slowing progress down a good bit. Nothing wrong with learning something new, though.

I’m going to compile 4.3-rc7 anyway. This time I’ll base it around the Rockstor kernel .config (so it won’t strip out multimedia, etc.).

If it works, I’ll post the RPM.

I’ve built RPMs from the kernel.org sources for 4.3-rc7.

This uses the .config from the Rockstor official kernel as a base, with the following changes:

  1. Btrfs is compiled into the kernel instead of as a module
  2. Compression for the kernel image is LZMA instead of gzip
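For reference, those two changes correspond to roughly these .config options (my reading of the mainline Kconfig; not copied verbatim from the RPM):

# Btrfs built into the kernel rather than =m (module):
CONFIG_BTRFS_FS=y
# LZMA kernel image compression instead of CONFIG_KERNEL_GZIP=y:
CONFIG_KERNEL_LZMA=y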

http://scrat.dragon2611.net/rockstor/RPMS/x86_64/

http://scrat.dragon2611.net/rockstor/ for the source code (I believe I’m therefore complying with the GPL, but if I’m not, please let me know).

These can be installed by downloading them to the Rockstor box and then using either rpm or yum to install them,

e.g. yum install kernel-4.3.0_rc7_Rockstor_Unofficial-3.x86_64.rpm

You will need to select this kernel from the boot menu; it won’t be booted automatically by default.
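If you’d rather not pick it at the boot menu every time, the grubby approach from earlier in the thread should work here too; the exact vmlinuz filename depends on how the RPM names it, so list the installed kernels first:

grubby --info=ALL | grep ^kernel
grubby --set-default=/boot/vmlinuz-<version from the list above>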

The kernel-headers/kernel-devel RPMs are only needed if you intend to compile kernel modules.

I’ll give it a try shortly and post the results. I’m doing a fresh install now; I’ve been doing a lot of reading/playing, so I want a clean starting point.

Thank you for your time.

While I did manage to get the shares working over InfiniBand, I’m having some issues with lockups during the boot process (not every time), and the speeds aren’t as good as I would expect (topping out at an average of 400 MB/s; I saw 600 once).

I will continue working with it and post back once I get some reliable loading and performance results.

I can’t comment on the speed since my Rockstor box is remote, so my WAN is the limiting factor; the only time I push it anywhere near its maximum speed is when transferring data between VMs within the same host.

Good morning,
I’m not too concerned with the speeds right now; they are still a lot better than 1 Gbit. It’s the lockup that bothers me. The NAS machine I have been using has “new to me” hardware (the InfiniBand card, while an older model, is new), and while I did run Prime95 to test it out, it was only for 8 hours. So I’m going to set that up and let it run for a few days, just to make sure there are no hardware issues.

In the meantime, I have another test machine I can try to set up. I’ve had that one a lot longer and never had any issues with it. It may be the weekend before I get more time to play around, though.

Thanks again for the help.

Did the new kernel and devel RPMs from the Testing repo help you build the new driver? If you can, please post steps, links to docs, etc…

Good evening,
The driver was already present in the new kernel, so it was just a matter of configuring it. It took some time for me to figure this out, since I’m not really familiar with Linux and all the documentation I found focused on building/installing the drivers. My guess is that the driver may be in the stable branch as well and that it simply disappeared from the GUI during patching (issue/fix 799 may have something to do with it; there were changes to the network edit pages).

I have a bunch of notes from all the testing, but it still isn’t working great. The speeds are below what I would expect and not consistent. I saw over 600 MB/s once, but the average peak seems to be closer to 200-250, and that is only in bursts (I’m assuming this may be the disks; I’m going to try an SSD next time). Then there is the lockup on boot, which really bothers me, but I was playing around a lot, so who knows what I messed up.

Once I get some time to sort out a proper clean install and some notes that will make sense to someone other than me, I will certainly post it here. I’ll try to get it done sooner than this weekend if at all possible.

It’s early, so I apologize in advance for any typos or incoherent thoughts; I will try to come back later today to revisit this and clean it up if needed. :grinning:

This install is based on the Rockstor ISO v3.8-9, using a Mellanox MHGA28-XTC.

My assumption that a driver was present in this build was correct. It doesn’t seem to be the Mellanox or OpenFabrics one, though, since the tools they listed for testing are not found. This led to some confusion on my part, since the tests they suggested relied on those tools and I thought the driver was not installed… Below are the basics of what I have found so far; as I educate myself further on CentOS (and Linux in general), I will update this with changes. This is on an older system with a ConnectX card, but I don’t see why it wouldn’t apply to newer generations.

Verify that IPoIB is loaded:
lsmod | grep ib_ipoib

Check that the card was detected:
ifconfig ib0

Assign the port an IP address to test the connection:
ifconfig ib0 192.168.50.20

Verify it was assigned properly:
ifconfig ib0

Verify the connection from another PC on the network via ping.
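Using the address assigned above, that’s simply:

ping 192.168.50.20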

Configure ib0 to load on boot (I’m still not sure if this is 100% right, and I’m sure it needs tweaking to get the performance up). Change the IPs/domain to reflect your network:

vi /etc/sysconfig/network-scripts/ifcfg-ib0

NAME="ib0"
TYPE="InfiniBand"
BOOTPROTO="static"
ONBOOT="yes"
IPADDR0="192.168.50.20"
NETMASK="255.255.255.0"
GATEWAY0="192.168.50.2"
DNS1="192.168.50.2"
DOMAIN=infiniband.local

Reboot your NAS and retest with ping to make sure it is still accessible. If it is, great: turn on SMB and give it a try.
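As a side note, if you want to pick up ifcfg changes without a full reboot, this should also work on CentOS 7 (assuming the stock network service; I haven’t verified it on Rockstor specifically):

ifdown ib0 && ifup ib0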

Most of this information was pulled from a number of sources, and I didn’t save any of the links except the one below. Once I found it, I was able to figure out what was really going on with the drivers.

https://docs.oracle.com/cd/E19436-01/820-3522-10/ch4-linux.html

Once I have finished running burn-in testing on my NAS hardware, I will reload it with a fresh copy and retest. It has a newer card and an SSD, so I will also have a better idea of speeds and can tweak the settings.

I’m still new to Linux and InfiniBand, but if anyone has questions, I will do my best to answer them or point you in the right direction.

Hope this helps

A newbie here to the Rockstor forums. Since I have also played with Infiniband adapters via the cli under Rockstor, I thought that I would share my experiences so far.

My hardware is an older PCIe 2.0 ConnectX card with dual ports, each of which can run in either 40Gb IB mode or 10GbE mode. My lspci output says:
01:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)

I followed the InfiniBand guide that is online here:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Networking_Guide/ch-Configure_InfiniBand_and_RDMA_Networks.html

Since IP over InfiniBand is faster than Ethernet for my configuration, I enabled IPoIB by adding the following line to "/etc/rdma/mlx4.conf", which forces each port on my adapter into InfiniBand mode (rather than Ethernet). I did NOT need to do anything unique to the vanilla kernel to get the mlx4 drivers working on this card.
0000:01:00.0 ib ib
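The PCI address on the left of that line is the one lspci reports for the adapter, e.g.:
$ lspci | grep -i mellanox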

I currently have two systems directly cabled, so I do not use a managed IB switch. To enable IB fabric services, I also installed and started the "opensm" tool.
$ sudo yum install opensm libmlx4 librdmacm librdmacm-utils ibacm libibverbs-utils infiniband-diags perftest qperf
$ sudo systemctl enable opensm
$ sudo systemctl start opensm

I then manually configured the ifcfg files for my two ports.
$ cat /etc/sysconfig/network-scripts/ifcfg-ib0
DEVICE=ib0
TYPE=InfiniBand
ONBOOT=yes
BOOTPROTO=none
IPADDR=10.0.1.101
PREFIX=24
NETWORK=10.0.1.0
BROADCAST=10.0.1.255
IPV4_FAILURE_FATAL=yes
IPV6INIT=no
MTU=65520
CONNECTED_MODE=yes
NAME=ib0

Once that was done on both hosts (one was Rockstor and the other was a standard Linux system), I was able to rsync files into Rockstor at 100% drive write throughput. The large MTUs (the 64K MTU in this example) really help to keep CPU usage low.
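If anyone wants to sanity-check a link like this, the perftest/qperf packages installed above make a quick benchmark easy; a minimal example using my ib0 address (the bare qperf invocation runs as the server):

$ qperf                             # on the Rockstor box (server side)
$ qperf 10.0.1.101 tcp_bw tcp_lat   # from the other host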

Please reply here if you have any questions about my setup or would like me to try specific tests. This is a home system and I have some flexibility to try different things.