InfiniBand 40Gb Network - Mellanox MLNX_OFED_LINUX - SOLVED

Hello, my name is Roberto, from Wenatchee, WA, USA. I’m setting up a NAS for production use in a video editing workflow, and I need to install the Mellanox NIC drivers for InfiniBand.
My NIC is a PCIe x8 card.

Every time I try to install the 40Gb NIC drivers from the Mellanox website I get this error:
Current operation system is not supported!
Does anyone know which CentOS version this Rockstor flavor is actually based on?

[root@datrom iso]# uname -mrs
Linux 4.10.6-1.el7.elrepo.x86_64 x86_64
[root@datrom iso]# hostnamectl
Static hostname: datrom
Pretty hostname: Datrom
Icon name: computer-laptop
Chassis: laptop
Machine ID: 94e2809df7ad41fe8ca42871d1113048
Boot ID: c53158e3b46a4f198d7326461ea39de2
Operating System: Rockstor 3 (Core)
CPE OS Name: cpe:/o:rockstor:rockstor:3
Kernel: Linux 4.10.6-1.el7.elrepo.x86_64
Architecture: x86-64

How can I change the operating system string to say CentOS and its version or flavor instead of Rockstor 3?
Is there anyone with experience of InfiniBand and Mellanox drivers on CentOS or Linux who could point me in the right direction or help me with this project?

Hello @roberto0610, and welcome to the community!

I unfortunately do not know specifically about installing the Mellanox drivers, but I may help with a few other things:

It appears you are still running kernel 4.10, which seems to indicate you have a few updates to apply. Regardless of the channel activated (Testing or Stable), you should receive kernel 4.12 after yum update and be able to reboot into it.
I do not know whether it is related to your error, but it certainly will help with other things.

I do believe it is CentOS_7.6…

The rest of your questions will most likely be answered by other forum members a lot more knowledgeable than I am in this domain, but I thought I would share these little bits of information in case it helps move forward.


Thank you. It is definitely upgrading: 722 packages.
I just installed Rockstor yesterday, from the image I downloaded from the official Rockstor site. I thought it was up to date.
I’m downloading the drivers for CentOS 7.6 and will try again after the upgrade.

I’m still getting
Current operation system is not supported!

As I said, I ran yum update,
but it doesn’t show the 4.12 version as expected. This is the new printout:
#uname -mrs
Linux 4.10.6-1.el7.elrepo.x86_64 x86_64

By the way, this is a fresh install. I downloaded the ISO from the official Rockstor site.

@roberto0610 I can chip in on:

Could it be that you have yet to select an update channel as per @Flox’s suggestion:

But the nub of your issue here is that the driver you are trying to install, i.e. one designed for CentOS 7.6, will most likely assume the default kernel, which is where drivers live. But as the default kernel in CentOS, and more specifically its btrfs subsystem, is way too old for practical btrfs use, we have had to ship our ISO installs with a far newer 4.10 kernel and associated btrfs-progs. We have then in turn, via the testing and stable repos, updated this again to 4.12.

So the hope is that if you select an update channel you will be offered the 4.12 kernel, and that this in turn may cover your driver needs. If it doesn’t, you are in a tricky situation: we can’t feasibly recommend the default CentOS kernel and btrfs-progs, yet your required driver may only be available for set kernels. Do they offer a driver that can be compiled against a given kernel version? If so, you could try that option, as it should then be able to compile against the given Rockstor kernel.

Yes, our ISO is now quite old, hence the many updates. We are well on the way to migrating our offerings over to being based on openSUSE Leap 15.1, and potentially Tumbleweed soon thereafter. But this is as yet not ready (though very much on the way). There is a very low likelihood we will release a new ISO until that transition is complete and we have feature/function parity with our current CentOS offering, or near enough anyway.

Hope that helps.

I have found the source code for the Mellanox OFED drivers.
Updating here: I have reinstalled the RS system to start with a clean install.
After the reinstall I ran # yum -y update, but this did not upgrade the system to 4.12.
Do I have to do something else other than reboot?
I have re-run the # yum update command, and now it says there is nothing to update. #os-installation

I just found out that after the # yum update command you have to select, at boot, whether to start the 4.10 or the 4.12 kernel.
The bad thing, though, is that you have to select this option every time, since it won’t keep your selection.

How can I keep my kernel selection from the GRUB boot screen?

Now I’m trying to compile my Mellanox drivers from the source on the official site.
Link to Mellanox OFED official driver

@roberto0610 Hello again.

Apologies I should have been clearer. What you need to do upon first getting Rockstor installed is to follow the directions of the initial pop up dialog:

Initial-stable-updates-dialog

You will then be taken to what is also the page you arrive at if you select from the top menu bar:
System - Software Update

Where the relevant selection is presented thus:

Essentially either selection will lead to an additional repository being added and both have a 4.12 kernel in.

We also have an official documentation entry for this area of the Web-UI:

Update Channels

Or you can get to the documentation as a whole via:
Support - Documentation
again from the top menu.

But note that our testing channel is now behind the stable one. For your purposes they both contain the exact same kernel version, so either channel will do for testing out the compilation of your problematic network driver. You can always change channels later if things work out. But do please post your experiences here, as there are known bugs here and there that many on the forum will be able to help you with if you have issues.

But if all is well then the 4.12 kernel that comes with either of these update channels may end up recognising and enabling your card anyway.

Hope that helps.

OK, looks like I’m too late with my above guide. I’ll post it anyway as it may help others.

I was struggling with my GRUB2 boot menu not selecting kernel 4.12, as it was booting 4.10 by default.
I have used this process to get my RS box to boot into the right kernel, 4.12.
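For anyone landing here with the same GRUB question, here is a minimal sketch of one common way to make the kernel choice stick on a CentOS 7 style install. It assumes GRUB_DEFAULT=saved in /etc/default/grub and a config at /boot/grub2/grub.cfg (both assumptions, not confirmed for this box), and list_grub_entries is a hypothetical helper name. It is not necessarily the process referred to above.

```shell
# List top-level GRUB2 menu entries with their zero-based index.
# The default config path is an assumption (BIOS-booted CentOS 7 layout).
list_grub_entries() {
    cfg="${1:-/boot/grub2/grub.cfg}"
    # Top-level entries start with "menuentry " at column 0; the title is the
    # text between the first pair of single quotes.
    awk -F"'" '/^menuentry /{ print n++ ": " $2 }' "$cfg"
}

# Typical use, as root (not run here):
#   list_grub_entries          # find the index of the 4.12 entry
#   grub2-set-default 0        # use the index found above
#   grub2-editenv list         # shows the saved_entry that will be booted
```

The index passed to grub2-set-default has to match the entry order printed by the helper, so it is worth re-checking after every kernel update.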

I need some help compiling the Mellanox ConnectX-3 Pro VPI driver on my RS box. Is anybody willing to help me? I’ve never done it, and the web shows so many ways that I can’t figure it out.
This is the site where I’m getting the source files from:
https://community.mellanox.com/s/article/howto-compile-mlnx-ofed-drivers--160---mlnx-ofa-kernel-example-x

To be honest, I do not understand most of it. #os-installation #troubleshooting

Hi there, @roberto0610,

I had what started as a quick look at the instructions at the link you provided and then fell down a rabbit hole, as I somehow did not want to stop until I got those drivers compiled. The compilation itself took a while, so it’s only this morning that I could see it had completed successfully. Note, however, that I do not know whether or not it will fix your issue, but it at least seems to have done what you wanted from the link you just provided.
I ended up following the instructions in Mellanox’s documentation (linked below), which recommends using their installation script:
https://docs.mellanox.com/display/MLNXOFEDv461000/Installing+Mellanox+OFED#InstallingMellanoxOFED-InstallationScript

I’ll try to list the different steps I took (mostly from memory, so I hope I’m not missing a critical step). It may not be the best option either, but it seems to work:

  1. Download the drivers ISO (run yum install wget if you don’t already have it):
cd /tmp
wget www.mellanox.com/downloads/ofed/MLNX_OFED-4.6-1.0.1.1/MLNX_OFED_LINUX-4.6-1.0.1.1-rhel7.6-x86_64.iso
  2. Mount the ISO:
mkdir /mnt/iso
mount -t iso9660 -o loop MLNX_OFED_LINUX-4.6-1.0.1.1-rhel7.6-x86_64.iso /mnt/iso
  3. Go to the ISO root dir and start the install script:
cd /mnt/iso/
./mlnxofedinstall --add-kernel-support --distro rhel7.6

I added --add-kernel-support because you had an issue with the kernel not being supported out of the box. I also specified the distribution explicitly (rhel7.6, i.e. CentOS 7.6), as otherwise the script tries to detect it automatically and ends up seeing Rockstor 3, which it doesn’t recognize.

This last step will take a while, and will most likely fail the first few times until all missing packages are installed. Fortunately, this install script will tell you exactly which ones are missing and will even print the exact yum install command to the console. You then simply need to paste that command and retry the ./mlnxofedinstall script.
There was one exception, however, in my case: it needed the kernel-devel package (asking for 4.12.xxx headers). If you see this as well, simply install these files using yum install kernel-ml-devel.
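To avoid discovering the missing headers halfway through a long run, a quick pre-check along these lines may help. When the headers are absent the installer complains about /lib/modules/<kernel>/build/scripts, so the sketch simply tests for that directory. The function name and the MODULES_ROOT override are hypothetical, there only to keep the check easy to try out.

```shell
# Return success if the kernel build tree the installer looks for exists.
# MODULES_ROOT is a hypothetical override; the real location is /lib/modules.
has_kernel_build_tree() {
    kver="${1:-$(uname -r)}"
    root="${MODULES_ROOT:-/lib/modules}"
    [ -d "$root/$kver/build/scripts" ]
}

# Example, as root (not run here):
#   has_kernel_build_tree || yum install kernel-ml-devel
```

With no argument it checks the running kernel, which is the one --add-kernel-support builds against.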

After a long while, the script should succeed. Here’s the full output in my case:

[root@rocktest iso]# ./mlnxofedinstall --add-kernel-support --distro rhel7.6
Note: This program will create MLNX_OFED_LINUX TGZ for rhel7.6 under /tmp/MLNX_OFED_LINUX-4.6-1.0.1.1-4.12.4-1.el7.elrepo.x86_64 directory.
See log file /tmp/MLNX_OFED_LINUX-4.6-1.0.1.1-4.12.4-1.el7.elrepo.x86_64/mlnx_iso.29486_logs/mlnx_ofed_iso.29486.log


Checking if all needed packages are installed...
Building MLNX_OFED_LINUX RPMS . Please wait...
Creating metadata-rpms for 4.12.4-1.el7.elrepo.x86_64 ...
WARNING: If you are going to configure this package as a repository, then please note
WARNING: that it contains unsigned rpms, therefore, you need to disable the gpgcheck
WARNING: by setting 'gpgcheck=0' in the repository conf file.
Created /tmp/MLNX_OFED_LINUX-4.6-1.0.1.1-4.12.4-1.el7.elrepo.x86_64/MLNX_OFED_LINUX-4.6-1.0.1.1-rhel7.6-ext.tgz
Installing /tmp/MLNX_OFED_LINUX-4.6-1.0.1.1-4.12.4-1.el7.elrepo.x86_64/MLNX_OFED_LINUX-4.6-1.0.1.1-rhel7.6-ext
/tmp/MLNX_OFED_LINUX-4.6-1.0.1.1-4.12.4-1.el7.elrepo.x86_64/MLNX_OFED_LINUX-4.6-1.0.1.1-rhel7.6-ext/mlnxofedinstall --force --distro rhel7.6
Logs dir: /tmp/MLNX_OFED_LINUX.22123.logs
General log file: /tmp/MLNX_OFED_LINUX.22123.logs/general.log
This program will install the MLNX_OFED_LINUX package on your machine.
Note that all other Mellanox, OEM, OFED, RDMA or Distribution IB packages will be removed.
Those packages are removed due to conflicts with MLNX_OFED_LINUX, do not reinstall them.


Starting MLNX_OFED_LINUX-4.6-1.0.1.1 installation ...

Installing mlnx-ofa_kernel 4.6 RPM
Preparing...                          ########################################
Updating / installing...
mlnx-ofa_kernel-4.6-OFED.4.6.1.0.1.1.g########################################
Configured /etc/security/limits.conf
Installing mlnx-ofa_kernel-modules 4.6 RPM
Preparing...                          ########################################
Updating / installing...
mlnx-ofa_kernel-modules-4.6-OFED.4.6.1########################################
Installing mlnx-ofa_kernel-devel 4.6 RPM
Preparing...                          ########################################
Updating / installing...
mlnx-ofa_kernel-devel-4.6-OFED.4.6.1.0########################################
Installing kernel-mft 4.12.0 RPM
Preparing...                          ########################################
Updating / installing...
kernel-mft-4.12.0-105.kver.4.12.4_1.el########################################
Installing knem 1.1.3.90mlnx1 RPM
Preparing...                          ########################################
Updating / installing...
knem-1.1.3.90mlnx1-OFED.4.4.2.5.2.1.g9########################################
Installing knem-modules 1.1.3.90mlnx1 RPM
Preparing...                          ########################################
Updating / installing...
knem-modules-1.1.3.90mlnx1-OFED.4.4.2.########################################
Installing iser 4.6 RPM
Preparing...                          ########################################
Updating / installing...
iser-4.6-OFED.4.6.1.0.1.1.ga2cfe08.kve########################################
Installing srp 4.6 RPM
Preparing...                          ########################################
Updating / installing...
srp-4.6-OFED.4.6.1.0.1.1.ga2cfe08.kver########################################
Installing isert 4.6 RPM
Preparing...                          ########################################
Updating / installing...
isert-4.6-OFED.4.6.1.0.1.1.ga2cfe08.kv########################################
Installing mlnx-rdma-rxe 4.6 RPM
Preparing...                          ########################################
Updating / installing...
mlnx-rdma-rxe-4.6-OFED.4.6.1.0.1.1.ga2########################################
Installing rshim 1.6 RPM
Preparing...                          ########################################
Updating / installing...
rshim-1.6-0.g6aa30c7.kver.4.12.4_1.el7########################################
Installing mpi-selector RPM
Preparing...                          ########################################
Updating / installing...
mpi-selector-1.0.3-1.46101            ########################################
Installing user level RPMs:
Preparing...                          ########################################
ofed-scripts-4.6-OFED.4.6.1.0.1       ########################################
Preparing...                          ########################################
libibverbs-41mlnx1-OFED.4.6.0.4.1.4610########################################
Preparing...                          ########################################
libibverbs-devel-41mlnx1-OFED.4.6.0.4.########################################
Preparing...                          ########################################
libibverbs-devel-static-41mlnx1-OFED.4########################################
Preparing...                          ########################################
libibverbs-utils-41mlnx1-OFED.4.6.0.4.########################################
Preparing...                          ########################################
libmlx4-41mlnx1-OFED.4.5.0.0.3.46101  ########################################
Preparing...                          ########################################
libmlx4-devel-41mlnx1-OFED.4.5.0.0.3.4########################################
Preparing...                          ########################################
libmlx5-41mlnx1-OFED.4.6.0.0.4.46101  ########################################
Preparing...                          ########################################
libmlx5-devel-41mlnx1-OFED.4.6.0.0.4.4########################################
Preparing...                          ########################################
librxe-41mlnx1-OFED.4.4.2.4.6.46101   ########################################
Preparing...                          ########################################
librxe-devel-static-41mlnx1-OFED.4.4.2########################################
Preparing...                          ########################################
libibcm-41mlnx1-OFED.4.1.0.1.0.46101  ########################################
Preparing...                          ########################################
libibcm-devel-41mlnx1-OFED.4.1.0.1.0.4########################################
Preparing...                          ########################################
libibumad-43.1.1.MLNX20190422.87b4d9b-########################################
Preparing...                          ########################################
libibumad-devel-43.1.1.MLNX20190422.87########################################
Preparing...                          ########################################
libibumad-static-43.1.1.MLNX20190422.8########################################
Preparing...                          ########################################
libibmad-5.4.0.MLNX20190423.1d917ae-0.########################################
Preparing...                          ########################################
libibmad-devel-5.4.0.MLNX20190423.1d91########################################
Preparing...                          ########################################
libibmad-static-5.4.0.MLNX20190423.1d9########################################
Preparing...                          ########################################
ibsim-0.7mlnx1-0.11.g85c342b.46101    ########################################
Preparing...                          ########################################
ibacm-41mlnx1-OFED.4.3.3.0.0.46101    ########################################
Preparing...                          ########################################
librdmacm-41mlnx1-OFED.4.6.0.0.1.46101########################################
Preparing...                          ########################################
librdmacm-utils-41mlnx1-OFED.4.6.0.0.1########################################
Preparing...                          ########################################
librdmacm-devel-41mlnx1-OFED.4.6.0.0.1########################################
Preparing...                          ########################################
opensm-libs-5.4.0.MLNX20190422.ed81811########################################
Preparing...                          ########################################
opensm-5.4.0.MLNX20190422.ed81811-0.1.########################################
Preparing...                          ########################################
opensm-devel-5.4.0.MLNX20190422.ed8181########################################
Preparing...                          ########################################
opensm-static-5.4.0.MLNX20190422.ed818########################################
Preparing...                          ########################################
dapl-2.1.10mlnx-OFED.3.4.2.1.0.46101  ########################################
Preparing...                          ########################################
dapl-devel-2.1.10mlnx-OFED.3.4.2.1.0.4########################################
Preparing...                          ########################################
dapl-devel-static-2.1.10mlnx-OFED.3.4.########################################
Preparing...                          ########################################
dapl-utils-2.1.10mlnx-OFED.3.4.2.1.0.4########################################
Preparing...                          ########################################
perftest-4.4-0.5.g1ceab48.46101       ########################################
Preparing...                          ########################################
mstflint-4.11.0-1.14.g840c9c2.46101   ########################################
Preparing...                          ########################################
mft-4.12.0-105                        ########################################
Preparing...                          ########################################
srptools-41mlnx1-5.46101              ########################################
Preparing...                          ########################################
ibutils2-2.1.1-0.104.MLNX20190408.gb55########################################
Preparing...                          ########################################
ibutils-1.5.7.1-0.12.gdcaeae2.46101   ########################################
Preparing...                          ########################################
cc_mgr-1.0-0.41.g750eb1e.46101        ########################################
Preparing...                          ########################################
dump_pr-1.0-0.37.g750eb1e.46101       ########################################
Preparing...                          ########################################
ar_mgr-1.0-0.42.g750eb1e.46101        ########################################
Preparing...                          ########################################
ibdump-5.0.0-3.46101                  ########################################
Preparing...                          ########################################
infiniband-diags-5.4.0.MLNX20190422.d1########################################
Preparing...                          ########################################
infiniband-diags-compat-5.4.0.MLNX2019########################################
Preparing...                          ########################################
qperf-0.4.9-9.46101                   ########################################
Preparing...                          ########################################
mxm-3.7.3111-1.46101                  ########################################
Preparing...                          ########################################
ucx-1.6.0-1.46101                     ########################################
Preparing...                          ########################################
ucx-devel-1.6.0-1.46101               ########################################
Preparing...                          ########################################
sharp-1.8.1.MLNX20190422.6c05a05-1.461########################################
Preparing...                          ########################################
ucx-cma-1.6.0-1.46101                 ########################################
Preparing...                          ########################################
ucx-ib-1.6.0-1.46101                  ########################################
Preparing...                          ########################################
ucx-ib-cm-1.6.0-1.46101               ########################################
Preparing...                          ########################################
ucx-rdmacm-1.6.0-1.46101              ########################################
Preparing...                          ########################################
ucx-knem-1.6.0-1.46101                ########################################
Preparing...                          ########################################
hcoll-4.3.2708-1.46101                ########################################
Preparing...                          ########################################
openmpi-4.0.2a1-1.46101               ########################################
Preparing...                          ########################################
mlnx-ethtool-4.19-1.46101             ########################################
Preparing...                          ########################################
mlnx-iproute2-4.20.0-1.46101          ########################################
Preparing...                          ########################################
mlnxofed-docs-4.6-1.0.1.1             ########################################
Preparing...                          ########################################
mpitests_openmpi-3.2.20-e1a0676.46101 ########################################

Installation finished successfully.


Preparing...                          ################################# [100%]
Updating / installing...
   1:mlnx-fw-updater-4.6-1.0.1.1      ################################# [100%]

Added 'RUN_FW_UPDATER_ONBOOT=no to /etc/infiniband/openib.conf

Attempting to perform Firmware update...
No devices found!

To load the new driver, run:
/etc/init.d/openibd restart

Hope this helps,


Flox, please correct me if I’m wrong: what you are saying is that I do not need to compile the drivers?

I have downloaded the driver .ISO and mounted it.
# ./mlnxofedinstall --add-kernel-support --distro rhe17.6
I’m running the command above as you instructed, and this is the error message I’m getting:
Error: The current MLNX_OFED_LINUX is intended for rhel7.6

For your information, I’m also providing the following details from this clean RS install.
I did update it to 4.12 and changed the GRUB config so that it boots 4.12. I have also updated Rockstor by activating the free channel. Even though I purchased the $20 membership, I still have not received the activation code for my RS box.

Output from the current system:
[root@datrom iso]# uname -a
Linux datrom 4.12.4-1.el7.elrepo.x86_64 #1 SMP Thu Jul 27 20:03:28 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux

[root@datrom iso]# hostnamectl
Static hostname: datrom
Pretty hostname: Datrom
Icon name: computer-laptop
Chassis: laptop
Machine ID: 6f3e281317a54c8bbc73ba2808bd0664
Boot ID: b6dd34068be24f838c55fabc5906a9ac
Operating System: Rockstor 3 (Core)
CPE OS Name: cpe:/o:rockstor:rockstor:3
Kernel: Linux 4.12.4-1.el7.elrepo.x86_64
Architecture: x86-64

I wonder if I’m really executing the right command.

Very brief idea that may be the cause… In the command, the last flag should be the letters RHEL (in lower case, and followed by 7.6) and I believe what you pasted is the number seventeen.
Could you give that a try?
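Since an "l" and a "1" are easy to mix up, a tiny sanity check on the value before starting the long build can save a retry. This is only a sketch: the accepted shape (rhel followed by major.minor) is an assumption that covers RHEL-style names such as rhel7.6 and nothing else, and is_rhel_distro is a hypothetical helper, not part of the Mellanox tooling.

```shell
# Return success if the argument looks like rhel<major>.<minor>, e.g. rhel7.6.
# The pattern is an assumption covering RHEL-style names only.
is_rhel_distro() {
    printf '%s\n' "$1" | grep -Eq '^rhel[0-9]+\.[0-9]+$'
}

# Example (not run here):
#   is_rhel_distro "rhel7.6" && ./mlnxofedinstall --add-kernel-support --distro rhel7.6
```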

I was able to execute the command.
It reported some missing dependencies, but I have installed them already.
Now I’m getting this output:

[root@datrom iso]# ./mlnxofedinstall --add-kernel-support --distro rhel7.6
Note: This program will create MLNX_OFED_LINUX TGZ for rhel7.6 under /tmp/MLNX_OFED_LINUX-4.6-1.0.1.1-4.12.4-1.el7.elrepo.x86_64 directory.
See log file /tmp/MLNX_OFED_LINUX-4.6-1.0.1.1-4.12.4-1.el7.elrepo.x86_64/mlnx_iso.6358_logs/mlnx_ofed_iso.6358.log

Checking if all needed packages are installed…
/lib/modules/4.12.4-1.el7.elrepo.x86_64/build/scripts is required to build mlnx-ofa_kernel RPM.
Please install the corresponding kernel-source or kernel-devel RPM.

Error: One or more required packages for installing OFED-internal are missing.
Please install the missing packages using your Linux distribution Package Management tool.
Run:
yum install kernel-devel-4.12.4-1.el7.elrepo.x86_64
Failed to build MLNX_OFED_LINUX for 4.12.4-1.el7.elrepo.x86_64
[root@datrom iso]# nano /tmp/MLNX_OFED_LINUX-4.6-1.0.1.1-4.12.4-1.el7.elrepo.x86_64/mlnx_iso.
mlnx_iso.5634/ mlnx_iso.5634_logs/ mlnx_iso.6358/ mlnx_iso.6358_logs/
[root@datrom iso]# nano /tmp/MLNX_OFED_LINUX-4.6-1.0.1.1-4.12.4-1.el7.elrepo.x86_64/mlnx_iso.6358_logs/mlnx_ofed_iso.6358.log
[root@datrom iso]# cat /tmp/MLNX_OFED_LINUX-4.6-1.0.1.1-4.12.4-1.el7.elrepo.x86_64/mlnx_iso.6358_logs/mlnx_ofed_iso.6358.log
[root@datrom iso]# cat /tmp/MLNX_OFED_LINUX-4.6-1.0.1.1-4.12.4-1.el7.elrepo.x86_64/mlnx_iso.6358_logs/mlnx_ofed_iso.6358.log

What is it that I’m doing wrong?

I’m also getting this error message while trying to execute the # yum command.

Another app is currently holding the yum lock; waiting for it to exit…
The other application is: yum-cron
Memory : 34 M RSS (1.4 GB VSZ)
Started: Sun Aug 11 14:01:01 2019 - 09:20 ago
State : Sleeping, pid: 5446

I have also executed
# yum install kernel-devel-4.12.4-1.el7.elrepo.x86_64
as instructed by the script.
I get this as output:
# No package kernel-devel-4.12.4-1.el7.elrepo.x86_64 available.
# Error: Nothing to do

I’m on my phone so please excuse any typo.

This error looks like the exception I was talking about in my post above. I think you need to run:

yum install kernel-ml-devel

The yum lock message is simply due to the scheduled yum update check (yum-cron) running. Try again shortly after and it should run fine.
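If you would rather not retry by hand, a small wait loop is one option. This is a sketch under stated assumptions: /var/run/yum.pid is yum's lock file on CentOS 7, while the function name, the YUM_LOCK override and the default timeout are made up for illustration.

```shell
# Wait until the yum lock file disappears, or give up after a timeout.
# YUM_LOCK is a hypothetical override; the first argument is the timeout in seconds.
wait_for_yum_lock() {
    lock="${YUM_LOCK:-/var/run/yum.pid}"
    remaining="${1:-300}"
    while [ -e "$lock" ]; do
        if [ "$remaining" -le 0 ]; then
            return 1            # lock still held when the timeout ran out
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Example, as root (not run here):
#   wait_for_yum_lock 600 && yum install kernel-ml-devel
```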


Installation complete for the Mellanox OFED driver for ConnectX-3.
You will need to install many components. The good thing, though, is that the installation script will guide you through this process.
Follow the instructions from the link at the official site to download the .iso file and mount it. For compatibility during the installation on Rockstor, you’ll need to use the installation flags below instead of the ones indicated on the official Mellanox site.

Official manual link:
Mellanox OFED Installation Script

Pre-installation notes: you may have to install these required components before executing the Mellanox OFED installation script.
# yum -y install kernel-ml-devel
# yum -y install python-devel lsof redhat-rpm-config rpm-build gcc
# yum -y install gtk2 tcl gcc-gfortran tcsh tk
# yum -y install createrepo

Then download the original driver ISO from the Mellanox website:
# cd /tmp
# wget www.mellanox.com/downloads/ofed/MLNX_OFED-4.6-1.0.1.1/MLNX_OFED_LINUX-4.6-1.0.1.1-rhel7.6-x86_64.iso

Then mount the ISO and run the installation script:
# mkdir /mnt/iso
# mount -t iso9660 -o loop MLNX_OFED_LINUX-4.6-1.0.1.1-rhel7.6-x86_64.iso /mnt/iso
# cd /mnt/iso
# ./mlnxofedinstall --add-kernel-support --distro rhel7.6

Well, this is how to install the Mellanox OFED drivers on a Rockstor 3 box based on CentOS 7.6.
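After restarting the driver with /etc/init.d/openibd restart (the command the installer prints at the end), a quick way to confirm that the Mellanox modules are loaded is to look at the kernel's module list. A minimal sketch, where loaded_mlx_modules and the MODULES_FILE override are hypothetical names used for illustration:

```shell
# Print the names of loaded Mellanox kernel modules (mlx4_*, mlx5_*).
# /proc/modules is the kernel's own list, the same data lsmod shows;
# MODULES_FILE is a hypothetical override for trying this on sample data.
loaded_mlx_modules() {
    awk '$1 ~ /^mlx[0-9]+_/ { print $1 }' "${MODULES_FILE:-/proc/modules}"
}

# Example (not run here):
#   loaded_mlx_modules         # a ConnectX-3 normally shows mlx4_core and friends
```

If the list is empty the driver did not load. If it is populated, ibstat from the infiniband-diags package (installed by the script) should then report the port state.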

Special thanks to @Flox @phillxnet

P.S. I hope this helps someone else. Please excuse my grammar and typos. ESL.


Hi there. New here. Sorry for being late to the party. I’m not sure if you would be in a position to try the following with an RS install:

# yum -y groupinstall 'Infiniband Support'
# yum -y install opensm

If Rockstor is built on top of CentOS 7.6 (or maybe even CentOS 7.7), you might be able to use the “inbox” CentOS Mellanox drivers rather than the driver from Mellanox.

The reason for this difference is that, in version 2 of the MLNX_OFED drivers, Mellanox disabled NFS over RDMA (which I think is stupid, and I have confirmed it with them twice now), whereas with the CentOS “inbox” drivers you can still run NFS over RDMA.

My current cluster is running Mellanox ConnectX-4 dual-port VPI 4x EDR 100 Gbps Infiniband cards across four slave/compute nodes and two headnodes, all connected to a Mellanox MSB-7890 externally managed 36-port 4x EDR 100 Gbps Infiniband switch.

You might want to try that and see if that works for you.

But if you’re using your ConnectX-3 cards in ethernet mode rather than Infiniband mode, then this might not necessarily apply for you as much.

But if you’re running in Infiniband mode, then this might be useful to you. The potential downside that I can see is that the “normal” reporting tools for network bandwidth usage may not show any traffic going through it: if you are running NFS over RDMA, the “standard” reporting tools won’t be aware of the RDMA traffic.

But otherwise, yes, this procedure that has been described/outlined here will or can be very useful to me if I decide to ditch my Qnap NAS units in favour of RockStor because it has the pretty GUI and you (apparently) can run at least 40 Gbps IB on it (which Qnap can’t/doesn’t support).

And I didn’t want to run just “vanilla” CentOS. I’ve gotten used to the pretty GUIs thanks to Qnap.

Thanks.


Thank you for your recommendations, @alpha754293. I’ll try it. By the way, I’m running on InfiniBand. My Win10 machines show the NIC as 40 Gbps, but I’ve never gotten transfers over 10 Gbps, never over 1200 MB/s, even when testing with a RAM disk on DDR4 and reading from a Sun/Oracle F80 PCIe flash accelerator on my Rockstor box with 32 GB of DDR4 RAM and cache enabled. Maybe it is because of the settings you mention. I have tons of other IB questions, but I guess I can’t go much further because I don’t have a managed switch like you have. Being able to carry Internet traffic over IB would be really good, though, as that way I could unplug the Cat6 gigabit Ethernet that I currently use for Internet access on my video editing rigs. IB networking has definitely improved my workflow, but it is not much different from just using the 10G SFP+ gear that MikroTik offers for about $160 now. I’m definitely going to give it a try.

This little toy is coming to my server to serve as editing and temp video cache for Adobe Premiere and After Effects. I hope it lets me get past my 10 Gb ceiling.
SanDisk SystemX 6400GB SX350 << Here are the specs for you to check

What would be the command to remove MLNX_OFED? It’s what I currently use to start OpenSM for my IB network on my server (as I don’t have a managed switch to run the subnet manager).
Second question: is your suggested alternative going to allow me to start OpenSM for my IB network?

But I’ve never gotten transfer over 10Gbps.

That will depend on how your storage system is set up. I have four HGST 6 TB SATA 6 Gbps 7200 rpm HDDs in RAID0 with a Broadcom/Avago/LSI MegaRAID SAS 12 Gbps 9341-8i, so the theoretical maximum bandwidth should be only about 24 Gbps (4 * 6 Gbps). I can hit that if I transfer data locally to my other RAID0 array on the same system, but over RDMA, I think that I’ve maxed out at around 16 Gbps (out of a theoretically possible 24 Gbps) running NFSoRDMA.

In other words, I don’t have enough spindles to be able to max that out any further.
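
For what it’s worth, the arithmetic behind that ceiling can be sketched in a few lines of shell. The helper names are made up for illustration, and the numbers are just the ones quoted above (four drives, 6 Gbps SATA links, roughly 16 Gbps seen over NFSoRDMA). It is an interface ceiling, not a measurement.

```shell
#!/bin/sh
# Aggregate interface ceiling for a RAID0 stripe: drives * per-link rate.
# raid0_ceiling_gbps and percent_of_ceiling are hypothetical helper names.
raid0_ceiling_gbps() {
    drives=$1
    link_gbps=$2
    echo $(( drives * link_gbps ))
}

# Share of the ceiling actually reached (integer division, rounds down).
percent_of_ceiling() {
    observed_gbps=$1
    ceiling_gbps=$2
    echo $(( observed_gbps * 100 / ceiling_gbps ))
}

ceiling=$(raid0_ceiling_gbps 4 6)
echo "ceiling: ${ceiling} Gbps"
echo "NFSoRDMA at 16 Gbps: $(percent_of_ceiling 16 "$ceiling")% of ceiling"
```

With the figures above this prints a 24 Gbps ceiling and 66% for the NFSoRDMA case.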

Just to clarify, I don’t have a managed switch. My switch is externally managed, so one of my headnodes runs OpenSM as the subnet manager and that is what makes the switch work. Without it, my IB switch won’t work/run.

Yeah, I’m moving pretty much all of my systems to 100 Gbps IB. The reason is that I already have the switch (which is probably one of the most expensive pieces of the entire puzzle), and now that I have 36 ports at 100 Gbps capacity, the rest is just buying the cards used off eBay (which is how I got into it in the first place, after watching one of the videos from Linus Sebastian of Linus Tech Tips). Compared to the retail prices, I’m buying the cards at a discount of more than 50%, and I don’t mind the used hardware. The cables were the next most expensive component. For short distances, I used copper direct attach cables (QSFP28 to QSFP28 to the switch). I got the switch because I have four nodes, which means that I have a minimum of 6 pair-wise connections to make. The cards each have two ports, but that still isn’t enough total ports to direct connect four nodes to each other, so I had to get the switch to be able to do that. Then I added my first headnode to manage the cluster, and now I am bringing a second headnode online because it has the LTO8 backup drive. So one of the headnodes processes and prepackages the data to be written to tape and the other headnode writes it to tape. (Writing about 10 TB of data over GbE was estimated to take 25 hours, so I need to bring up the second headnode with 100 Gbps IB so that it can write at the LTO drive speed of 300 MB/s and cut that down to a theoretical 8 hours instead.)
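
The tape numbers can be sanity-checked with a rough calculation like the one below. It is only a sketch: transfer_hours is a made-up helper, it uses decimal units, and the 111 MB/s figure for real-world GbE throughput is my assumption, not something measured here.

```shell
#!/bin/sh
# Hours needed to move size_tb terabytes at rate_mb_s megabytes per second
# (decimal units: 1 TB = 1,000,000 MB). transfer_hours is a hypothetical helper.
transfer_hours() {
    awk -v tb="$1" -v rate="$2" \
        'BEGIN { printf "%.1f\n", tb * 1000000 / rate / 3600 }'
}

transfer_hours 10 111   # GbE, assuming about 111 MB/s of usable throughput
transfer_hours 10 300   # the LTO drive speed quoted above
```

For exactly 10 TB this prints 25.0 and 9.3 hours, so the same ballpark as the estimates above; "about 10 TB" leaves some room either way.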

For the price, again, with the exception of the switch and the cables, the $ per gigabit per port is actually cheaper for me with 100 Gbps InfiniBand than it would have been with 10GbE (either Cat6 RJ45 or SFP+ fiber). I still don’t see too many switches that have something like 10-12 10GbE ports with one or more 100GbE uplink ports. So with me using 100 Gbps IB, my network won’t be the bottleneck anymore. It’s other stuff. (Storage arrays, etc.)

The Mellanox installation manual for the MLNX_OFED drivers also has the instructions to uninstall it should you choose.
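
As far as I know, MLNX_OFED installs its own removal script, usually at /usr/sbin/ofed_uninstall.sh. Treat that path as an assumption and check the manual for your version. A small sketch (the helper name is made up) that only hands back the script if it is actually there:

```shell
#!/bin/sh
# Print the path of the MLNX_OFED uninstall script if it exists and is executable.
# The default path is an assumption; find_ofed_uninstaller is a hypothetical helper.
find_ofed_uninstaller() {
    script=${1:-/usr/sbin/ofed_uninstall.sh}
    if [ -x "$script" ]; then
        echo "$script"
    else
        echo "uninstall script not found: $script" >&2
        return 1
    fi
}

# Usage (as root); the script only runs when it was found:
#   script=$(find_ofed_uninstaller) && "$script"
```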

Again, it’s really only critical, I think, if you plan on using NFSoRDMA. I think that if you’re using a Windows system, that’ll depend on whether your IB host is also Windows or whether it’s Linux. If you have a Linux system that can run the opensm, but then your IB host runs Windows, you might actually try to use SMBDirect instead, at which point, it would be functionally and practically irrelevant whether you use the inbox driver or the MLNX_OFED driver because you’re only using it to run opensm.

I don’t know if you can “criss cross” protocols like that, meaning having the Linux side run NFSoRDMA and the Windows side run SMB Direct and having both of those talk to each other. I don’t think that it works like that. I also don’t know if Windows can mount an NFSoRDMA export (like Linux can).

If your clients are Windows and you want really fast speeds, then that suggests that you would need or should need the host to be Windows as well so that you can run SMB Direct, which also means that you need a third system that’s running Linux to run opensm.

Otherwise, I’m not sure if you will see much of a benefit because the real advantage of Infiniband is its RDMA capabilities.

Conversely, if you have one of the Mellanox ConnectX-3 cards (or ConnectX-4 40 Gbps IB cards) where the ports can be configured, you might be able to set the port(s) to run in Ethernet mode (mlxconfig -d <device> set LINK_TYPE_P1=2, or something like that), and that would make it so that you don’t need opensm.
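
To make the port-type switch a bit more concrete, here is a hedged sketch that only builds the mlxconfig command and prints it, so you can review it before running anything as root. The helper name and the 04:00.0 device are placeholders; use whatever device name your own system reports. The link type codes (1 for InfiniBand, 2 for Ethernet) are the ones mlxconfig uses for LINK_TYPE_P1 and LINK_TYPE_P2.

```shell
#!/bin/sh
# Build (but do not run) the mlxconfig command that switches a port's link type.
# Link type codes as used by mlxconfig: 1 = InfiniBand, 2 = Ethernet.
# link_type_cmd is a hypothetical helper; the device below is a placeholder.
link_type_cmd() {
    device=$1
    port=$2
    case $3 in
        ib)  code=1 ;;
        eth) code=2 ;;
        *)   echo "link type must be ib or eth" >&2; return 1 ;;
    esac
    echo "mlxconfig -d $device set LINK_TYPE_P${port}=${code}"
}

# Print the command for port 1 of a placeholder PCI device.
link_type_cmd 04:00.0 1 eth
```

Running mlxconfig -d <device> query first shows the current setting, and a changed link type normally only takes effect after a reboot.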

And yes, the other method that I have described/proposed works in regards to running opensm. You just have to make sure that you install the opensm package in the OS, in addition to

# yum -y groupinstall 'Infiniband Support'

so that you will be able to get opensm up and running.
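
Put together, the inbox-driver route might look roughly like the sketch below on a CentOS 7 based system. The run wrapper and the setup_opensm name are made up for illustration, and I am assuming the opensm package ships a systemd unit called opensm and that ibstat comes from infiniband-diags, which is how it is on my CentOS systems.

```shell
#!/bin/sh
# Run a command, or just print it when DRY_RUN=1 (run is a hypothetical wrapper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Install the inbox InfiniBand stack plus opensm, then start the subnet manager.
setup_opensm() {
    run yum -y groupinstall 'Infiniband Support' &&
    run yum -y install opensm infiniband-diags &&
    run systemctl enable opensm &&
    run systemctl start opensm
}

# Preview first:   DRY_RUN=1 setup_opensm
# Then, as root:   setup_opensm && ibstat
```

Once opensm is running, ibstat should report the port state as Active rather than Initializing.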

Like I said, this is how I have configured my cluster right now and it works because I wanted the NFSoRDMA capabilities because my entire cluster runs on Linux. (And because I have a Linux system that runs opensm, in theory, I can make everything else run IB on Windows since the opensm runs my externally managed switch, but I don’t have it set up right now.)

The one catch/caution is that if you’re going to get an IB switch, the ports on the IB switch are NOT configurable in terms of port type. (Which I think is stupid because they have that capability on their cards, but they choose not to put it on their switches. The 100 GbE switches cost a lot more per port than their 100 Gbps IB switches.)

So just keep that in mind. But it should work for you.

Let me know if you have questions.

Thanks.
