The steps in this nVidia vGPU guide were written for CentOS 8, but they can be used as a reference for other versions and distributions of Linux.
Pre-Requisites
Ensure that the ESXi host drivers have been downloaded and installed on the ESXi host. Verify the drivers are working properly on the ESXi host using the nvidia-smi command.
Ensure you have access to the driver .zip file. Instructions to download the file are found here.
Preparing the VM Settings
- VM Boot Firmware can be BIOS or UEFI
- If Secure Boot is used with UEFI then the nVidia driver package will need to be signed during installation
- Ensure that VMware Tools are installed
- Under the Video Card settings in the VM ensure that 3D support is unchecked
- Reserve all of the RAM
Note: I’ve had better luck with the E1000 network adapter but the VMXNET3 adapter should work as long as you have VMware Tools installed.
Attaching the vGPU to the VM
- Ensure the VM is currently on an ESXi host with the GPU and Driver
- Right-click the VM and select Edit Settings
- Click Add New Device
- Select Shared PCI Device
- Scroll down to the new PCI Device
- Ensure that NVIDIA GRID vGPU is selected in the dropdown next to PCI Device
- Expand the PCI device
- Choose the appropriate profile for this vGPU
Note: Only profiles which end in B or Q are supported within a Linux VM. Profiles control the amount of GPU resources that are assigned to the vGPU.
Install the Driver
- Boot up the VM and login
- Run the following commands to install gcc, kernel-devel, make and elfutils-libelf-devel, which the driver installer needs to build its kernel module
sudo yum install gcc
sudo yum install kernel-devel-$(uname -r)
sudo yum install make
sudo yum install elfutils-libelf-devel
- Change the current run level to 3:
sudo init 3
- Copy the .run file to the Linux VM. This file will be part of the driver .zip file that was obtained using the Driver Selection process.
- Update the file name in the following command to match your driver version, then run it to start the nVidia vGPU Driver install:
sudo sh ./NVIDIA-Linux-x86_64-418.130-grid.run
- Yes to 32-bit libraries
- Don’t install libglvnd files
- Ignore error regarding path to libglvnd config files
- Do not automatically update X configuration
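Because the installer file name changes with every driver release, it can be easier to locate the file than to retype the version. The following is a minimal sketch, assuming the file keeps the NVIDIA-Linux-x86_64-<version>-grid.run naming seen in the command above. The function name find_grid_installer is hypothetical, and sort -V assumes GNU coreutils.

```shell
# Print the path of the highest-versioned GRID installer in a directory.
# Assumption: files are named NVIDIA-Linux-x86_64-<version>-grid.run.
find_grid_installer() {
    local dir="${1:-.}" f
    for f in "$dir"/NVIDIA-Linux-x86_64-*-grid.run; do
        # Skip the literal pattern when nothing matches the glob.
        [ -e "$f" ] && printf '%s\n' "$f"
    done | sort -V | tail -n 1
}

# Example usage (run from the directory holding the .run file):
#   sudo sh "$(find_grid_installer .)"
```

If the directory contains no matching file, the function prints nothing, so check the result before passing it to sh.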
Edit /etc/default/grub and append the following to the “GRUB_CMDLINE_LINUX” line:
rd.driver.blacklist=nouveau nouveau.modeset=0
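Run the following command to generate a new grub configuration with the changes. On a VM booted with UEFI, the output file on CentOS 8 is typically /boot/efi/EFI/centos/grub.cfg instead: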
sudo grub2-mkconfig -o /boot/grub2/grub.cfg
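If you would rather script the edit to /etc/default/grub than make it by hand, something like the following could work. It is only a sketch: the function name add_nouveau_grub_args is hypothetical, and it assumes GRUB_CMDLINE_LINUX sits on a single double-quoted line, as it does in a default CentOS install. It keeps a .bak copy of the file and does nothing if the arguments are already present.

```shell
# Append the nouveau-blocking arguments to GRUB_CMDLINE_LINUX, once.
# Assumption: the setting is a single line of the form GRUB_CMDLINE_LINUX="...".
add_nouveau_grub_args() {
    local file="${1:-/etc/default/grub}"
    local args='rd.driver.blacklist=nouveau nouveau.modeset=0'
    if grep -q 'rd.driver.blacklist=nouveau' "$file"; then
        return 0    # already configured, nothing to do
    fi
    # Insert the arguments just before the closing quote; keep a .bak backup.
    sed -i.bak "s/^\(GRUB_CMDLINE_LINUX=\".*\)\"\$/\1 $args\"/" "$file"
}

# Example usage (run as root, then regenerate the grub configuration):
#   add_nouveau_grub_args /etc/default/grub
```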
Edit or create /etc/modprobe.d/blacklist.conf and add the following line
blacklist nouveau
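The blacklist entry can also be added from the shell. This sketch uses a hypothetical helper, ensure_line, which appends a line only when the file does not already contain it, so running it twice is harmless.

```shell
# Append a line to a file unless an identical line is already there.
ensure_line() {
    local file="$1" line="$2"
    # -x matches whole lines only, -F treats the text literally.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example usage (run as root):
#   ensure_line /etc/modprobe.d/blacklist.conf "blacklist nouveau"
```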
Edit /etc/gdm/custom.conf and uncomment the following line
WaylandEnable=false
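The same change can be made with sed. This is a sketch rather than the only way to do it: the function name disable_wayland is hypothetical, and it assumes the line appears in the file as #WaylandEnable=false, possibly with spaces after the #.

```shell
# Uncomment the WaylandEnable=false line in the GDM configuration.
# Assumption: the line exists in commented form, e.g. "#WaylandEnable=false".
disable_wayland() {
    local file="${1:-/etc/gdm/custom.conf}"
    # Strip the leading '#' and any spaces after it; keep a .bak backup.
    sed -i.bak 's/^#[[:space:]]*\(WaylandEnable=false\)/\1/' "$file"
}

# Example usage (run as root):
#   disable_wayland /etc/gdm/custom.conf
```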
Run the following commands to back up the current initramfs and build a new one
sudo mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r)nouveau.img
sudo dracut /boot/initramfs-$(uname -r).img $(uname -r)
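After the next reboot it is worth confirming that the nouveau module really stayed out of the kernel. The sketch below reads a module list in /proc/modules format; the function name nouveau_status is hypothetical, and the optional file argument exists only so the check can be tried against a saved copy.

```shell
# Report whether nouveau appears in a /proc/modules-style module list.
nouveau_status() {
    local modules="${1:-/proc/modules}"
    # Each line starts with the module name followed by a space.
    if grep -q '^nouveau ' "$modules"; then
        echo "nouveau is still loaded"
    else
        echo "nouveau is not loaded"
    fi
}

# Example usage:
#   nouveau_status
```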
Configure X Server
This step tends to cause the most headaches, largely because of the plethora of options within the nvidia-xconfig command.
Find the PCI Bus for the GPU with the following command:
lspci
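Identify the card and its PCI Bus address. For example, 02:02.0 is the address I will use in my example. Note that lspci prints the address in hexadecimal, while the --busid option expects decimal values, so the two only look the same for values below 10.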
Run the following command to generate a new xorg.conf file for X Server that tells the VM to use the nVidia driver for the GPU.
sudo nvidia-xconfig --busid="PCI:2:2:0" --allow-empty-initial-configuration
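An address such as 0b:00.0 from lspci has to become PCI:11:0:0 for X, which is easy to get wrong by hand. Here is a small sketch of that conversion. The function name pci_to_busid is hypothetical, and it assumes the input is in the bus:device.function form printed by lspci, with or without a leading domain such as 0000:.

```shell
# Convert an lspci address (hexadecimal) to an X "PCI:bus:device:function"
# string (decimal).
pci_to_busid() {
    local addr="$1" bus dev fn
    # Drop an optional leading PCI domain such as "0000:".
    case "$addr" in *:*:*) addr="${addr#*:}" ;; esac
    bus="${addr%%:*}"                     # "02" from "02:02.0"
    dev="${addr#*:}"; dev="${dev%%.*}"    # "02"
    fn="${addr##*.}"                      # "0"
    # A 0x prefix makes printf read each field as hexadecimal.
    printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$fn"
}

# Example usage:
#   sudo nvidia-xconfig --busid="$(pci_to_busid 02:02.0)" --allow-empty-initial-configuration
```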
Note: If you receive a validation error it can be safely ignored.
Verify X Server Health
The first thing to do before you reboot is to see if X Server will start properly. Issue the following command and ensure no errors are received:
sudo startx
If that works properly then you’re safe to reboot. If X Server fails to start then there is an issue with the /etc/X11/xorg.conf file. You can try running the nvidia-xconfig command again to ensure you didn’t make any errors.
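When X Server does fail, its log usually says why. The sketch below pulls the error lines out of the log. It assumes the log is at /var/log/Xorg.0.log, which may differ on your system, and that errors carry the standard (EE) marker; the function name xorg_errors is hypothetical.

```shell
# Print the error lines from an X Server log.
# Assumption: the log lives at /var/log/Xorg.0.log unless a path is given.
xorg_errors() {
    local log="${1:-/var/log/Xorg.0.log}"
    # Match lines whose first marker is (EE), with or without a timestamp.
    # This skips the legend line at the top of the log that merely lists (EE).
    grep -E '^(\[[ 0-9.]+\] )?\(EE\)' "$log"
}

# Example usage:
#   xorg_errors
```

No output means no lines were marked as errors; in that case grep exits with a non-zero status, which is expected here.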
License the vGPU
This step requires that the Licensing Server is installed, available and has the proper licenses configured. If this has not been completed please follow these steps.
Run the following command to create the licensing config file:
sudo cp /etc/nvidia/gridd.conf.template /etc/nvidia/gridd.conf
Run the following command to edit the newly created licensing config file:
sudo vi /etc/nvidia/gridd.conf
Update the ServerAddress, ServerPort and FeatureType fields.
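For repeatable builds, the three fields can be set from the shell instead of vi. This is a sketch under a few assumptions: the function name set_gridd_option is hypothetical, the host name license.example.com is a placeholder, and the port and feature type in the usage example are illustrative values that you should replace with the ones for your licensing server and license edition. Values must not contain | or & because they are passed through sed.

```shell
# Set key=value in a gridd.conf-style file, replacing an existing
# (possibly commented-out) entry or appending a new one.
set_gridd_option() {
    local file="$1" key="$2" value="$3"
    if grep -qE "^#?[[:space:]]*${key}=" "$file"; then
        # Rewrite the existing line in place; keep a .bak backup.
        sed -i.bak -E "s|^#?[[:space:]]*${key}=.*|${key}=${value}|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Example usage (run as root; all three values are placeholders):
#   set_gridd_option /etc/nvidia/gridd.conf ServerAddress license.example.com
#   set_gridd_option /etc/nvidia/gridd.conf ServerPort 7070
#   set_gridd_option /etc/nvidia/gridd.conf FeatureType 1
```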
Run the following command to start the nVidia Gridd service
sudo /etc/init.d/nvidia-gridd restart
Run the following command to verify that a license was obtained:
sudo grep -i license /var/log/messages