Linux KVM vGPU driver 470.63 fails to compile on Linux 5.11.22

Hello!

We’re trying to install NVIDIA-GRID-Linux-KVM-470.63.01-471.68, but the kernel interfaces seem to have changed.

Before upgrading this vGPU server we ran Linux 5.4, and the driver built without problems. On 5.11.22 the DKMS build fails with:

/var/lib/dkms/nvidia/470.63/build/nvidia/nv-dma.c:963: warning: "IMPORT_SGT_STUBS_NEEDED" redefined
  963 | #define IMPORT_SGT_STUBS_NEEDED 0
      | 
/var/lib/dkms/nvidia/470.63/build/nvidia/nv-dma.c:957: note: this is the location of the previous definition
  957 | #define IMPORT_SGT_STUBS_NEEDED 1
      | 
  CC [M]  /var/lib/dkms/nvidia/470.63/build/nvidia/nv-caps.o
  CC [M]  /var/lib/dkms/nvidia/470.63/build/nvidia/nv-frontend.o
  CC [M]  /var/lib/dkms/nvidia/470.63/build/nvidia/nv_uvm_interface.o
  CC [M]  /var/lib/dkms/nvidia/470.63/build/nvidia/nv-vgpu-vfio-interface.o
  CC [M]  /var/lib/dkms/nvidia/470.63/build/nvidia/nvlink_linux.o
  CC [M]  /var/lib/dkms/nvidia/470.63/build/nvidia/nvlink_caps.o
  CC [M]  /var/lib/dkms/nvidia/470.63/build/nvidia/linux_nvswitch.o
  CC [M]  /var/lib/dkms/nvidia/470.63/build/nvidia/procfs_nvswitch.o
  CC [M]  /var/lib/dkms/nvidia/470.63/build/nvidia/i2c_nvswitch.o
  CC [M]  /var/lib/dkms/nvidia/470.63/build/nvidia-vgpu-vfio/nvidia-vgpu-vfio.o
  CC [M]  /var/lib/dkms/nvidia/470.63/build/nvidia-vgpu-vfio/vgpu-devices.o
  CC [M]  /var/lib/dkms/nvidia/470.63/build/nvidia-vgpu-vfio/nv-pci-table.o
/var/lib/dkms/nvidia/470.63/build/nvidia-vgpu-vfio/nvidia-vgpu-vfio.c: In function ‘vgpu_msix_handler’:
/var/lib/dkms/nvidia/470.63/build/nvidia-vgpu-vfio/nvidia-vgpu-vfio.c:2730:18: error: implicit declaration of function ‘get_fs’; did you mean ‘sget_fc’? [-Werror=implicit-function-declaration]
 2730 |         old_fs = get_fs();
      |                  ^~~~~~
      |                  sget_fc
/var/lib/dkms/nvidia/470.63/build/nvidia-vgpu-vfio/nvidia-vgpu-vfio.c:2730:18: error: incompatible types when assigning to type ‘mm_segment_t’ from type ‘int’
/var/lib/dkms/nvidia/470.63/build/nvidia-vgpu-vfio/nvidia-vgpu-vfio.c:2731:9: error: implicit declaration of function ‘set_fs’; did you mean ‘sget_fc’? [-Werror=implicit-function-declaration]
 2731 |         set_fs(KERNEL_DS);
      |         ^~~~~~
      |         sget_fc
/var/lib/dkms/nvidia/470.63/build/nvidia-vgpu-vfio/nvidia-vgpu-vfio.c:2731:16: error: ‘KERNEL_DS’ undeclared (first use in this function); did you mean ‘KERNEL_3’?
 2731 |         set_fs(KERNEL_DS);
      |                ^~~~~~~~~
      |                KERNEL_3
/var/lib/dkms/nvidia/470.63/build/nvidia-vgpu-vfio/nvidia-vgpu-vfio.c:2731:16: note: each undeclared identifier is reported only once for each function it appears in
/var/lib/dkms/nvidia/470.63/build/nvidia-vgpu-vfio/nvidia-vgpu-vfio.c: In function ‘nv_vgpu_inject_interrupt’:
/var/lib/dkms/nvidia/470.63/build/nvidia-vgpu-vfio/nvidia-vgpu-vfio.c:3020:14: error: incompatible types when assigning to type ‘mm_segment_t’ from type ‘int’
 3020 |     old_fs = get_fs();
      |              ^~~~~~
/var/lib/dkms/nvidia/470.63/build/nvidia-vgpu-vfio/nvidia-vgpu-vfio.c:3021:12: error: ‘KERNEL_DS’ undeclared (first use in this function); did you mean ‘KERNEL_3’?
 3021 |     set_fs(KERNEL_DS);
      |            ^~~~~~~~~
      |            KERNEL_3
cc1: some warnings being treated as errors
make[2]: *** [scripts/Makefile.build:288: /var/lib/dkms/nvidia/470.63/build/nvidia-vgpu-vfio/nvidia-vgpu-vfio.o] Error 1
make[2]: *** Waiting for unfinished jobs....
/var/lib/dkms/nvidia/470.63/build/nvidia-vgpu-vfio/vgpu-devices.c: In function ‘nv_vfio_vgpu_get_attach_device’:
/var/lib/dkms/nvidia/470.63/build/nvidia-vgpu-vfio/vgpu-devices.c:729:1: warning: the frame size of 1040 bytes is larger than 1024 bytes [-Wframe-larger-than=]
  729 | }
      | ^
/var/lib/dkms/nvidia/470.63/build/nvidia-vgpu-vfio/vgpu-devices.c: In function ‘nv_vgpu_dev_ioctl’:
/var/lib/dkms/nvidia/470.63/build/nvidia-vgpu-vfio/vgpu-devices.c:356:1: warning: the frame size of 1120 bytes is larger than 1024 bytes [-Wframe-larger-than=]
  356 | }
      | ^
make[1]: *** [Makefile:1849: /var/lib/dkms/nvidia/470.63/build] Error 2

Any help to address this is appreciated :)

EDIT
I just noticed that set_fs()/get_fs() were removed in kernel 5.10, which matches the errors above. Will there be an updated driver that addresses this?
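
For anyone who wants to check a host before running the DKMS build, here is a minimal sketch that compares a kernel release string against the 5.10 cutoff mentioned above. The cutoff is taken from this post as given, and the helper names (parse_kernel_release, has_set_fs) are made up for illustration; this is not part of any NVIDIA tooling.

```python
import platform
import re
from typing import Tuple

# Assumption from the post above: get_fs()/set_fs() are gone from 5.10 onward.
SET_FS_REMOVED_IN = (5, 10)


def parse_kernel_release(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string such as '5.11.22'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def has_set_fs(release: str) -> bool:
    """Return True if the kernel predates the assumed removal of set_fs()."""
    return parse_kernel_release(release) < SET_FS_REMOVED_IN


if __name__ == "__main__":
    running = platform.release()  # same string that `uname -r` reports
    if has_set_fs(running):
        print("%s: get_fs()/set_fs() should still be available" % running)
    else:
        print("%s: get_fs()/set_fs() are gone; a driver that calls them "
              "will likely fail to build" % running)
```

With the two kernels from this thread, has_set_fs("5.4.0") is true and has_set_fs("5.11.22") is false, which is consistent with the build working before the upgrade and failing after it.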

Best regards
Marcus Nordenberg

Hi Marcus,

Are you still having issues with this setup? It is not clear from your description which Linux KVM distribution you are running.

The list of supported distributions is available on the Supported Products page of the NVIDIA Virtual GPU Software Documentation.

It looks like you upgraded to a kernel version that is not compatible with vGPU 13.0. Since your post we have released vGPU 13.1, which will hopefully resolve the issue.

We recommend contacting NVIDIA enterprise support if you continue to have issues.
