nVidia vGPU Guide – Linux (CentOS 7.7)

The steps in this nVidia vGPU guide were written for CentOS 7.7, but the guide can be used as a reference for other versions and distributions of Linux.

Pre-Requisites

Ensure that the ESXi host drivers have been downloaded and installed on the ESXi host. Verify the drivers are working properly on the ESXi host using the nvidia-smi command.

Ensure you have access to the driver .zip file. Instructions to download the file are found here.

Preparing the VM Settings

  1. VM Boot Firmware can be BIOS or UEFI
    1. If Secure Boot is used with UEFI then the nVidia driver package will need to be signed during installation
  2. Ensure that VMware Tools are installed
  3. Under the Video Card settings in the VM ensure that 3D support is unchecked
  4. Reserve all of the RAM

Note: I’ve had better luck with the E1000 network adapter but the VMXNET3 adapter should work as long as you have VMware Tools installed.

Attaching the vGPU to the VM

  1. Ensure the VM is currently on an ESXi host that has the GPU and the driver installed
  2. Right-click the VM and select Edit Settings
  3. Click Add New Device
  4. Select Shared PCI Device
  5. Scroll down to the new PCI Device
  6. Ensure that NVIDIA GRID vGPU is selected in the dropdown next to PCI Device
  7. Expand the PCI device
  8. Choose the appropriate profile for this vGPU

Note: Only profiles which end in B or Q are supported within a Linux VM. Profiles control the amount of GPU resources that are assigned to the vGPU.

Install the Driver

  • Boot up the VM and log in
  • Run the following commands to install gcc and the kernel-devel package that matches the running kernel
sudo yum install gcc
sudo yum install kernel-devel-$(uname -r)
  • Change the current run level to 3:
sudo init 3
  • Copy the .run file to the Linux VM. This file will be part of the driver .zip file that was obtained using the Driver Selection process.
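
Before running the installer, it can be worth confirming that the kernel-devel tree really matches the running kernel, since the installer builds a kernel module against it. The following is a minimal sketch, assuming the usual CentOS layout where kernel-devel installs under /usr/src/kernels/<version>; the function name kernel_devel_present is made up for this example.

```shell
# Returns 0 when a kernel-devel tree exists for the given kernel version.
# The optional second argument (the tree root) only exists so the check
# can be pointed at a different directory; it defaults to the CentOS path.
kernel_devel_present() {
    version="${1:-$(uname -r)}"
    root="${2:-/usr/src/kernels}"
    [ -d "$root/$version" ]
}

# Usage on the VM:
#   kernel_devel_present || echo "kernel-devel for $(uname -r) is missing"
```

If the check fails, the kernel may have been updated since the last reboot, in which case rebooting into the newest kernel and reinstalling kernel-devel is one way to get the two back in sync.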

Update the file name in the following command to match the .run file you copied, then run it to start the nVidia vGPU driver install:

sudo sh ./NVIDIA-Linux-x86_64-418.130-grid.run
  • Yes to 32-bit libraries
  • Don’t install libglvnd files
    • Ignore error regarding path to libglvnd config files
  • Do not automatically update X configuration

Edit /etc/default/grub and append the following to the “GRUB_CMDLINE_LINUX” line:

rd.driver.blacklist=nouveau nouveau.modeset=0
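
If you would rather script this edit than make it by hand, here is a rough sketch. It assumes GNU sed and that the GRUB_CMDLINE_LINUX value is wrapped in double quotes on a single line, as it is on a stock CentOS 7 install; the function name add_nouveau_args is made up for this example. Back up the file first.

```shell
# Append the nouveau kernel arguments inside the closing quote of
# GRUB_CMDLINE_LINUX="...". Does nothing if they are already present.
add_nouveau_args() {
    grub_file="${1:-/etc/default/grub}"
    args='rd.driver.blacklist=nouveau nouveau.modeset=0'
    if grep -q 'rd.driver.blacklist=nouveau' "$grub_file"; then
        return 0
    fi
    sed -i "s/^\(GRUB_CMDLINE_LINUX=\"[^\"]*\)\"/\1 ${args}\"/" "$grub_file"
}

# Usage on the VM:
#   sudo cp /etc/default/grub /etc/default/grub.bak
#   sudo sh -c "$(declare -f add_nouveau_args); add_nouveau_args"   # bash only
```

The function can be run more than once without duplicating the arguments.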

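Run the following command to generate a new grub configuration with the changes. The path shown is for a VM that boots with BIOS firmware; on a CentOS 7 VM that boots with UEFI, write the file to /boot/efi/EFI/centos/grub.cfg instead.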

sudo grub2-mkconfig -o /boot/grub2/grub.cfg 

Edit or create /etc/modprobe.d/blacklist.conf and add the following line

blacklist nouveau

Run the following commands to back up the current initramfs and build a new one

sudo mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r)-nouveau.img
sudo dracut /boot/initramfs-$(uname -r).img $(uname -r)
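
The blacklist only takes effect on the next boot. Once you have rebooted, a quick way to confirm it worked is to check that nouveau is no longer loaded. This is a small sketch that reads /proc/modules; the function name nouveau_loaded is made up for this example.

```shell
# Returns 0 if the nouveau module appears in the module list, 1 otherwise.
# Each line of /proc/modules starts with the module name followed by a space.
nouveau_loaded() {
    modules_file="${1:-/proc/modules}"
    grep -q '^nouveau ' "$modules_file"
}

# Usage after a reboot:
#   if nouveau_loaded; then echo "nouveau is still loaded"; else echo "nouveau is not loaded"; fi
```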

Configure X Server

This step tends to cause the most headaches, mainly because the nvidia-xconfig command has so many options.

Find the PCI Bus for the GPU with the following command:

lspci

Identify the card and its PCI bus address, which is the first column of the lspci output. For example, 02:02.0 is the address I will use in my example.
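
One thing to watch for: lspci prints the bus, device and function numbers in hexadecimal, while the BusID setting in xorg.conf is normally read as decimal. For 02:02.0 the two are the same, but an address like 0b:00.0 would need to become PCI:11:0:0. The helper below is a sketch of that conversion for bash; the name lspci_to_busid is made up for this example.

```shell
# Convert an lspci address (bus:device.function, hex, with an optional
# leading 4-digit domain) into the decimal "PCI:bus:device:function" form.
lspci_to_busid() {
    addr="$1"
    addr="${addr#????:}"      # drop a leading domain such as "0000:" if present
    bus="${addr%%:*}"         # text before the first colon
    rest="${addr#*:}"         # "device.function"
    dev="${rest%%.*}"
    fn="${rest#*.}"
    printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$fn"
}

# Usage:
#   lspci_to_busid 02:02.0
```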

Run the following command to generate a new xorg.conf file for X Server that tells the VM to use the nVidia driver for the GPU.

sudo nvidia-xconfig --busid="PCI:2:2:0" --allow-empty-initial-configuration

Note: If you receive a validation error it can be safely ignored.

Verify X Server Health

The first thing to do before you reboot is to see if X Server will start properly. Issue the following command and ensure no errors are received:

sudo startx

If that works properly then you’re safe to reboot. If X Server fails to start then there is an issue with the /etc/X11/xorg.conf file. You can try running the nvidia-xconfig command again to ensure you didn’t make any errors.
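
When startx fails, the X server log usually says why. Xorg marks error lines with (EE), so filtering for those is a quick first pass. The sketch below assumes the default log location /var/log/Xorg.0.log, which may differ on your system, and the function name xorg_errors is made up for this example.

```shell
# Print the error lines from an Xorg log. Lines may start with a
# "[ timestamp ] " prefix, so that part is optional in the pattern.
# Anchoring at the start of the line skips the legend near the top of
# the log, which also mentions "(EE)".
xorg_errors() {
    log_file="${1:-/var/log/Xorg.0.log}"
    grep -E '^(\[[^]]*\] )?\(EE\)' "$log_file"
}

# Usage:
#   xorg_errors || echo "no (EE) lines found"
```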

License the vGPU

This step requires that the Licensing Server is installed, available and has the proper licenses configured. If this has not been completed please follow these steps.

Run the following command to create the licensing config file:

sudo cp /etc/nvidia/gridd.conf.template /etc/nvidia/gridd.conf

Run the following command to edit the newly created licensing config file:

sudo vi /etc/nvidia/gridd.conf

Update the ServerAddress, ServerPort and FeatureType fields.
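
If you are setting up several VMs, the same three fields can be set from a script instead of in vi. This is only a sketch, assuming GNU sed and the simple key=value layout of gridd.conf; the function name set_gridd_option and the values in the usage lines are placeholders. In particular, license.example.com is not a real server, 7070 is assumed to be the port your license server listens on, and FeatureType=1 is assumed to be the value for vGPU licensing. Check the comments in gridd.conf.template for the values that apply to your driver version.

```shell
# Set key=value in a gridd.conf-style file. An existing line for the key
# (commented out or not) is replaced; otherwise the setting is appended.
set_gridd_option() {
    key="$1"
    value="$2"
    conf="${3:-/etc/nvidia/gridd.conf}"
    if grep -q "^#*[[:space:]]*${key}=" "$conf"; then
        sed -i "s|^#*[[:space:]]*${key}=.*|${key}=${value}|" "$conf"
    else
        printf '%s=%s\n' "$key" "$value" >> "$conf"
    fi
}

# Usage (run as root; all three values are placeholders):
#   set_gridd_option ServerAddress license.example.com
#   set_gridd_option ServerPort 7070
#   set_gridd_option FeatureType 1
```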

Run the following command to restart the nVidia Gridd service so that it picks up the new configuration

sudo /etc/init.d/nvidia-gridd restart

Run the following command to verify that a license was obtained:

sudo grep -i license /var/log/messages
