Anyone given the nvidia-docker runtime a try?

A little background:

My system is starting to look a little long in the tooth when it comes to Plex transcodes, and I’d like to leverage the NVIDIA GPU I have in the system to do some of the heavy lifting. That of course requires installing the NVIDIA proprietary drivers (something I’m not sure Rockstor supports by default) and then installing the nvidia-docker runtime, which might break something in the Rockstor docker implementation. (I know it would also take some edits to the rock-on file used for Plex, but honestly I’m not worried about that, as converting a docker file to the rock-on JSON is pretty trivial.)

I know I could just take a full backup of my boot disk and try, but I wanted to see if anyone had attempted this or something similar already. Are there any features planned to take advantage of setups like this?

Hi @criddle!

That looks like an interesting endeavor… Before going further, however, and in case you aren’t already using it, you may be able to speed up transcoding with QuickSync if you have compatible hardware. It’s nothing like leveraging your GPU, of course, but it should be a lot easier to set up and may help your transcoding in the meantime.

DISCLAIMER: I haven’t tested for any conflict or compatibility issue with what I’ve described below, so try at your own risk.

To test this, I tried to follow the nvidia-docker documentation for installation on RHEL7 systems, but quickly ran into something a little worrying. Upon activating their rhel7.6 repo and trying to install nvidia-docker2, I ran into a compatibility issue:

Error: Package: nvidia-docker2-2.0.3-3.docker18.09.7.ce.noarch (nvidia-docker)
           Requires: docker-ce = 3:18.09.7
           Installed: docker-ce-17.09.0.ce-1.el7.centos.x86_64 (@Rockstor-Stable)
               docker-ce = 17.09.0.ce-1.el7.centos

As you can see, the version of docker-ce installed is incompatible with that build of nvidia-docker2. Now, you technically can update docker manually, but I don’t know whether this will create conflicts with Rockstor.
A better way to address this would be to use a version of nvidia-docker2 compatible with Rockstor’s docker version, as per nvidia’s doc:

yum --setopt=obsoletes=0 install nvidia-docker2-2.0.0-1.docker17.09.0.ce nvidia-container-runtime-1.0.0-1.docker17.09.0

I unfortunately cannot test further as I don’t have NVIDIA hardware available but that should be a good start if you were to go on this journey.
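In case it helps, here is a rough sketch of checking which nvidia-docker2 builds match the docker-ce version that is actually installed before pinning anything. The `docker_tag` helper is a made-up name for illustration, and the sketch assumes the nvidia packages keep following the `docker<version>` naming seen in the command above; treat it as untested on Rockstor.

```shell
# Hypothetical helper: turn an rpm version such as "17.09.0.ce" into the
# "docker17.09.0.ce" tag that appears in the nvidia package names.
docker_tag() {
    printf 'docker%s\n' "$1"
}

# Only query rpm/yum when docker-ce is actually installed via rpm.
if command -v rpm >/dev/null 2>&1 && rpm -q docker-ce >/dev/null 2>&1; then
    installed="$(rpm -q --queryformat '%{VERSION}' docker-ce)"
    # List every available nvidia-docker2 build and keep the matching ones.
    yum --showduplicates list nvidia-docker2 \
        | grep -F "$(docker_tag "$installed")" \
        || echo "no nvidia-docker2 build found for docker-ce $installed"
fi
```

If nothing matches, that is probably the point to stop and think before touching the docker version Rockstor ships.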

Just wanted to say that I did finally accomplish this and it is not for the faint of heart!

This was probably due to my insistence earlier this year to start updating the kernel and btrfs-progs beyond what is available in the Rockstor3 repos. (As an aside, that went fine due to the great instructions in this thread: Upgrade Kernel to 4.16 )

But because of that, a whole host of other things were needed, including developer tools, a newer version of GCC, and the aforementioned version-locked nvidia-docker2 and nvidia-container-runtime packages from the nvidia repos.

Honestly getting the nvidia display and cuda driver installed was the hardest part - probably because of the updated kernel.

So, all of that to say that it is possible, but I wouldn’t suggest it.

#Add the Nvidia docker repo, make sure it still has packages for your docker version, install those
curl -s -L https://nvidia.github.io/nvidia-docker/centos7/nvidia-docker.repo | sudo tee /etc/yum.repos.d/nvidia-docker.repo
yum search --showduplicates nvidia-docker2 nvidia-container-runtime
yum --setopt=obsoletes=0 install nvidia-docker2-2.0.0-1.docker17.09.0.ce nvidia-container-runtime-1.0.0-1.docker17.09.0

#Blacklist nouveau driver
bash -c "echo blacklist nouveau > /etc/modprobe.d/blacklist-nvidia-nouveau.conf"
bash -c "echo options nouveau modeset=0 >> /etc/modprobe.d/blacklist-nvidia-nouveau.conf"

#Backup initramfs and create a new one that includes the nouveau blacklist
mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
dracut -v /boot/initramfs-$(uname -r).img $(uname -r)

#Install a newer version of GCC and associated tools you'll need for the nvidia driver install.  You may not need this if you haven't been fudging around with kernel versions and whatnot.
yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
yum install centos-release-scl
yum install devtoolset-8
scl enable devtoolset-8 bash

#Get the latest Nvidia driver, install per instructions, and reboot.  I chose to do this without DKMS, but you could always try that for yourself
wget https://download.nvidia.com/XFree86/Linux-x86_64/455.45.01/NVIDIA-Linux-x86_64-455.45.01.run
systemctl isolate multi-user.target
bash NVIDIA-Linux-x86_64-455.45.01.run
reboot
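After the reboot, something like the following could be used to sanity-check the result. This is only a sketch: the function names are invented for illustration, the `nvidia/cuda:9.0-base` image tag is an assumption (any CUDA image you can still pull should do), and `--runtime=nvidia` is the runtime name that nvidia-docker2 normally registers with docker.

```shell
# Returns 0 if the given modprobe config file contains the nouveau blacklist line.
nouveau_blacklisted() {
    grep -qx 'blacklist nouveau' "$1"
}

# Hypothetical post-reboot check; call it by hand once the system is back up.
verify_gpu_stack() {
    conf=/etc/modprobe.d/blacklist-nvidia-nouveau.conf
    if [ -f "$conf" ] && nouveau_blacklisted "$conf"; then
        echo "nouveau is blacklisted in $conf"
    fi

    # nouveau should no longer be loaded once the new initramfs is in use.
    if lsmod | grep -q '^nouveau'; then
        echo "warning: nouveau is still loaded"
    fi

    # Driver check on the host.
    nvidia-smi || return 1

    # Runtime check from inside a container (image tag is an assumption).
    docker run --rm --runtime=nvidia nvidia/cuda:9.0-base nvidia-smi
}
```

Running `verify_gpu_stack` should print the same GPU table twice, once from the host and once from the container, if both the driver and the runtime are working.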

If I missed anything or anyone has any tips on how this process could be improved, I’m all ears. That said, I’m only a home-user, so if this explodes my NAS I don’t really care.
