Anyone given the nvidia-docker runtime a try?

Just wanted to say that I did finally accomplish this and it is not for the faint of heart!

The difficulty was probably down to my insistence earlier this year on updating the kernel and btrfs-progs beyond what is available in the Rockstor3 repos. (As an aside, that part went fine thanks to the great instructions in this thread: Upgrade Kernel to 4.16)

But because of that, a whole host of other things were needed, including developer tools, a newer version of GCC, and the aforementioned version-locked nvidia-docker2 and nvidia-container-runtime packages from the nvidia repos.

Honestly, getting the nvidia display and cuda driver installed was the hardest part, probably because of the updated kernel.

So, all of that to say that it is possible, but I wouldn’t suggest it.

#Add the Nvidia docker repo, make sure it still has packages for your docker version, install those
curl -s -L https://nvidia.github.io/nvidia-docker/centos7/nvidia-docker.repo | sudo tee /etc/yum.repos.d/nvidia-docker.repo
yum search --showduplicates nvidia-docker2 nvidia-container-runtime
yum --setopt=obsoletes=0 install nvidia-docker2-2.0.0-1.docker17.09.0.ce nvidia-container-runtime-1.0.0-1.docker17.09.0
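
For anyone on a different Docker release: the nvidia-docker2 version string above embeds the Docker version (docker17.09.0.ce). Here is a rough sketch of deriving that suffix so it can be picked out of the yum listing. The helper name docker_pkg_suffix is made up for illustration, and the naming pattern is only inferred from the package name above, so treat it as an assumption rather than something the nvidia repo guarantees.

```shell
# Hypothetical helper: turn a Docker version string such as "17.09.0-ce"
# into the suffix seen in the nvidia-docker2 package version above
# ("docker17.09.0.ce"). The pattern is inferred from that one package name.
docker_pkg_suffix() {
    printf 'docker%s\n' "$1" | sed 's/-/./g'
}

# Usage on the NAS itself (not executed by this snippet):
#   ver=$(docker version --format '{{.Server.Version}}')
#   yum search --showduplicates nvidia-docker2 | grep -F "$(docker_pkg_suffix "$ver")"
```

If the grep comes back empty, the repo most likely no longer carries a build for that Docker version.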

#Blacklist nouveau driver
bash -c "echo blacklist nouveau > /etc/modprobe.d/blacklist-nvidia-nouveau.conf"
bash -c "echo options nouveau modeset=0 >> /etc/modprobe.d/blacklist-nvidia-nouveau.conf"

#Backup initramfs and create a new one that includes the nouveau blacklist
mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
dracut -v /boot/initramfs-$(uname -r).img $(uname -r)
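
The blacklist only takes effect on the next boot, so it is worth confirming after the final reboot that nouveau really stayed out. A minimal sketch of that check follows; nouveau_loaded is a made-up helper name, and it simply looks for the module name in lsmod-style output.

```shell
# Hypothetical helper: reads lsmod-style text on stdin and returns 0
# if a module named "nouveau" is listed, 1 otherwise.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

# Usage after rebooting (not executed by this snippet):
#   if lsmod | nouveau_loaded; then
#       echo "nouveau is still loaded, the blacklist did not take effect"
#   fi
```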

#Install a newer version of GCC and associated tools you'll need for the nvidia driver install.  You may not need this if you haven't been fudging around with kernel versions and whatnot.
yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
yum install centos-release-scl
yum install devtoolset-8
scl enable devtoolset-8 bash

#Get the latest Nvidia driver, install per instructions, and reboot.  I chose to do this without DKMS, but you could always try that for yourself
wget https://download.nvidia.com/XFree86/Linux-x86_64/455.45.01/NVIDIA-Linux-x86_64-455.45.01.run
systemctl isolate multi-user.target
bash NVIDIA-Linux-x86_64-455.45.01.run
reboot
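
Once the box is back up, a quick sanity check is to confirm the driver responds and that Docker actually knows about the nvidia runtime. The sketch below is untested on Rockstor: has_nvidia_runtime is a made-up helper that scans `docker info` output for the runtime name, and the CUDA image in the usage comment is a placeholder you would need to fill in with a tag that matches your driver.

```shell
# Hypothetical helper: reads `docker info` output on stdin and returns 0
# if "nvidia" appears on the "Runtimes:" line, 1 otherwise.
has_nvidia_runtime() {
    awk '/Runtimes:/ { for (i = 1; i <= NF; i++) if ($i == "nvidia") found = 1 }
         END { exit !found }'
}

# Usage after the reboot (not executed by this snippet):
#   nvidia-smi                                   # driver loaded and GPU visible?
#   docker info | has_nvidia_runtime && echo "nvidia runtime registered"
#   # <cuda-image> is a placeholder, e.g. some nvidia/cuda tag of your choosing
#   docker run --rm --runtime=nvidia <cuda-image> nvidia-smi
```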

If I missed anything or anyone has any tips on how this process could be improved, I’m all ears. That said, I’m only a home-user, so if this explodes my NAS I don’t really care.
